Last updated 2026-05-28
Multi-Channel Fanout
One Notify call, many recipients, many channels per
recipient. The orchestrator stores one row per
(user, channel), attempts immediate delivery per row,
and returns aggregate counts so the producer knows what landed.
When you'd use this
Business events that deserve every available rail: incident pages, onboarding milestones, multi-recipient activity events ("Bob commented on Task 42 — page Alice and Carol via in-app + email + SMS").
The wire call
curl -sX POST http://notify:8081/elloloop.notify.v1.NotificationInternalService/Notify \
  -H 'Content-Type: application/json' \
  -H "X-Notify-Internal-Token: $NOTIFY_INTERNAL_TOKEN" \
  -d '{
    "tenantId": "acme",
    "notificationId": "incident-2026-05-27-001",
    "userIds": ["alice", "bob", "carol"],
    "channels": [
      "DELIVERY_CHANNEL_IN_APP",
      "DELIVERY_CHANNEL_EMAIL",
      "DELIVERY_CHANNEL_SMS"
    ],
    "subjectRef": "incident:i-1287",
    "subjectType": "incident",
    "title": "P1: checkout service unavailable",
    "body": "Pagerduty incident i-1287 fired at 05:14 UTC. Runbook: https://example.com/rb",
    "addresses": {
      "alice": { "byChannel": { "email": "alice@example.com", "sms": "+15555550199" } },
      "bob":   { "byChannel": { "email": "bob@example.com" } },
      "carol": { "byChannel": { "sms": "+15555550158" } }
    }
  }'

What happens, row by row
| User | Channel | Address resolved? | Provider configured? | Result |
|---|---|---|---|---|
| alice | in_app | yes (user id) | yes (auto) | delivered if live, else pending |
| alice | email | yes (alice@example.com) | yes | delivered |
| alice | sms | yes (+15555550199) | yes | delivered |
| bob | in_app | yes | yes | delivered if live, else pending |
| bob | email | yes (bob@example.com) | yes | delivered |
| bob | sms | no | yes | pending (no address) |
| carol | in_app | yes | yes | delivered if live, else pending |
| carol | email | no | yes | pending (no address) |
| carol | sms | yes (+15555550158) | yes | delivered |
The orchestrator stores all nine rows (3 users × 3 channels) and
reports a NotifyResponse like:
{"delivered": 5, "pending": 4, "failed": 0}
Two of the pending rows have no address (bob/sms and carol/email); the other two
are in-app rows for users who were offline at send time. Both kinds are recoverable: the missing
addresses can be supplied in a future retry; the in-app rows will
surface on the next StreamEvents open via
GetNotifications.
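The arithmetic behind those counts can be reproduced with a small sketch. It is not the orchestrator's code: the Outcome and Counts types, the rowOutcome and tally helpers, and the choice of alice as the only user with a live in-app stream are all assumptions made for illustration. Provider errors are not modelled, so Failed stays at zero.

```go
package main

import "fmt"

// Outcome is a hypothetical stand-in for the per-row delivery status.
type Outcome string

const (
	Delivered Outcome = "delivered"
	Pending   Outcome = "pending"
)

// Counts mirrors the three aggregate fields shown in the NotifyResponse above.
type Counts struct{ Delivered, Pending, Failed int }

// rowOutcome applies the table's rules to one (user, channel) row.
// addresses maps user -> channel -> destination; live is the set of users
// with an open in-app stream. Both shapes are assumptions for this sketch.
func rowOutcome(user, channel string, addresses map[string]map[string]string, live map[string]bool) Outcome {
	if channel == "in_app" {
		if live[user] {
			return Delivered
		}
		return Pending // user offline for in-app
	}
	if addresses[user][channel] == "" {
		return Pending // no address
	}
	return Delivered
}

// tally walks every (user, channel) pair and aggregates the outcomes.
func tally(users, channels []string, addresses map[string]map[string]string, live map[string]bool) Counts {
	var c Counts
	for _, u := range users {
		for _, ch := range channels {
			switch rowOutcome(u, ch, addresses, live) {
			case Delivered:
				c.Delivered++
			case Pending:
				c.Pending++
			}
		}
	}
	return c
}

// exampleCounts replays the request from this page.
func exampleCounts() Counts {
	users := []string{"alice", "bob", "carol"}
	channels := []string{"in_app", "email", "sms"}
	addresses := map[string]map[string]string{
		"alice": {"email": "alice@example.com", "sms": "+15555550199"},
		"bob":   {"email": "bob@example.com"},
		"carol": {"sms": "+15555550158"},
	}
	live := map[string]bool{"alice": true} // assumption: only alice is connected
	return tally(users, channels, addresses, live)
}

func main() {
	fmt.Printf("%+v\n", exampleCounts()) // {Delivered:5 Pending:4 Failed:0}
}
```

With a different set of live users the in-app split changes, but the two "no address" rows stay pending either way.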
Idempotency across fanout
The single notification_id covers all nine
rows. A second call with the same
(tenant, notification_id) but a wider channel list
will not create new rows for the already-stored
(user, channel) pairs — the Store's idempotency key is
(tenant, user, notification_id), and that triple
matches per row. In the second call's NotifyResponse those rows are
either reported as pending (skipped at the storage layer)
or re-delivered through a provider that is now available
where one wasn't before.
The practical recommendation: choose the channel list once,
at the producer, when the event is generated. Retries should not
change the channel set; if you need to add a channel, mint a new
notification_id for the addition.
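One way a producer might follow that recommendation is to derive the follow-up ID deterministically from the original ID and the added channels, so that a retry of the addition reuses the same ID. The sketch below is only an illustration: the FollowUp type, the planAddition helper, and the "-add-" suffix scheme are hypothetical and not part of the Notify API.

```go
package main

import (
	"fmt"
	"strings"
)

// FollowUp describes the second Notify call a producer would make when it
// wants to add channels after the original fanout. Hypothetical type.
type FollowUp struct {
	NotificationID string
	Channels       []string
}

// planAddition returns a follow-up call covering only the channels that are
// in desired but not in original. The new ID is derived from the original ID
// plus the added channels, so repeating the addition yields the same ID.
// It returns false when there is nothing to add.
func planAddition(originalID string, original, desired []string) (FollowUp, bool) {
	seen := make(map[string]bool, len(original))
	for _, ch := range original {
		seen[ch] = true
	}
	var added []string
	for _, ch := range desired {
		if !seen[ch] {
			added = append(added, ch)
			seen[ch] = true // guard against duplicates in desired
		}
	}
	if len(added) == 0 {
		return FollowUp{}, false
	}
	return FollowUp{
		NotificationID: originalID + "-add-" + strings.Join(added, "-"),
		Channels:       added,
	}, true
}

func main() {
	f, ok := planAddition(
		"incident-2026-05-27-001",
		[]string{"in_app", "email"},
		[]string{"in_app", "email", "sms"},
	)
	fmt.Println(ok, f.NotificationID, f.Channels)
	// true incident-2026-05-27-001-add-sms [sms]
}
```

The original call keeps its channel list untouched; only the follow-up carries the new channel under its own ID.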
Empty channels = all enabled
Omitting channels tells the orchestrator to use every
channel that has both an active provider AND a destination for the
user. For the example above with all three providers configured,
leaving channels off produces the same nine rows.
The difference matters when a provider is disabled: with
channels=[in_app, email, sms] on a deployment where
SMS is disabled, the SMS rows are stored as pending
(recovery is "wire a Twilio provider, re-run"). With empty
channels on the same deployment, SMS isn't part of the
effective set at all — no row stored, no recovery story needed.
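The contrast between an explicit and an empty channel list can be sketched as follows. This is a simplified model, not the orchestrator's logic: the effectiveChannels helper, the allChannels list, and the short channel names are assumptions, and the sketch covers only the provider half of the rule (the per-user destination check is left out).

```go
package main

import "fmt"

// allChannels is the assumed universe of channels, in a fixed order.
var allChannels = []string{"in_app", "email", "sms"}

// effectiveChannels returns the channels that rows would be created for.
// An explicit list is used as-is, even when a listed channel's provider is
// disabled (those rows end up pending). An empty list expands to every
// channel whose provider is enabled.
func effectiveChannels(requested []string, providerEnabled map[string]bool) []string {
	if len(requested) > 0 {
		return requested
	}
	var out []string
	for _, ch := range allChannels {
		if providerEnabled[ch] {
			out = append(out, ch)
		}
	}
	return out
}

func main() {
	// A deployment where SMS is disabled.
	enabled := map[string]bool{"in_app": true, "email": true, "sms": false}

	explicit := effectiveChannels([]string{"in_app", "email", "sms"}, enabled)
	fmt.Println(explicit) // [in_app email sms]: the sms rows are stored as pending

	implicit := effectiveChannels(nil, enabled)
	fmt.Println(implicit) // [in_app email]: no sms rows at all
}
```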
Recovery via the inbox
A user who logs in after the fanout opens
GetNotifications and sees every row addressed to them,
regardless of delivery outcome. The in-app row with
StatusPending looks the same in the UI as a delivered
row — it just hadn't been pushed live yet.
Producer-side shape
For a real producer with a user-store integration, the addresses map is usually built from a single query over your user table:
func sendIncidentPage(ctx context.Context, c *NotifyClient, incidentID string, userIDs []string) error {
	// One query over the user table for every recipient.
	contacts, err := userStore.LookupContacts(ctx, userIDs)
	if err != nil {
		return fmt.Errorf("lookup contacts: %w", err)
	}

	// Build the per-user address map, skipping users with no contact details.
	addrs := make(map[string]*notifyv1.ChannelAddresses, len(contacts))
	for uid, contact := range contacts {
		m := map[string]string{}
		if contact.Email != "" {
			m["email"] = contact.Email
		}
		if contact.Phone != "" {
			m["sms"] = contact.Phone
		}
		if len(m) > 0 {
			addrs[uid] = &notifyv1.ChannelAddresses{ByChannel: m}
		}
	}

	// A single Notify call fans out to every user and channel.
	_, err = c.Notify(ctx, &notifyv1.NotifyRequest{
		TenantId:       "acme",
		NotificationId: "incident-" + incidentID,
		UserIds:        userIDs,
		Channels: []notifyv1.DeliveryChannel{
			notifyv1.DeliveryChannel_DELIVERY_CHANNEL_IN_APP,
			notifyv1.DeliveryChannel_DELIVERY_CHANNEL_EMAIL,
			notifyv1.DeliveryChannel_DELIVERY_CHANNEL_SMS,
		},
		SubjectRef:  "incident:" + incidentID,
		SubjectType: "incident",
		Title:       "P1: checkout service unavailable",
		Body:        "Pagerduty incident " + incidentID + " fired.",
		Addresses:   addrs,
	})
	return err
}

Partial failure
A provider error on one (user, channel) row records that row as
StatusFailed and increments NotifyResponse.failed;
the rest of the fan-out continues. Storage errors abort the call
immediately (the platform cannot guarantee the remaining rows would
be stored either).
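The two error paths described above can be sketched as a loop. This is a minimal model under stated assumptions, not the platform's implementation: Row, Counts, fanOut, the store and send hooks, and failFor are hypothetical names, the sketch assumes each row is stored immediately before its send attempt, and pending outcomes are not modelled.

```go
package main

import (
	"errors"
	"fmt"
)

// Row is one (user, channel) pair in the fan-out.
type Row struct{ User, Channel string }

// Counts mirrors the aggregate fields of NotifyResponse.
type Counts struct{ Delivered, Pending, Failed int }

// fanOut stores and sends each row in turn. A storage error aborts the whole
// call; a provider (send) error only marks that row failed and moves on.
func fanOut(rows []Row, store, send func(Row) error) (Counts, error) {
	var c Counts
	for _, r := range rows {
		if err := store(r); err != nil {
			return c, fmt.Errorf("store %s/%s: %w", r.User, r.Channel, err)
		}
		if err := send(r); err != nil {
			c.Failed++ // row recorded as failed; the remaining rows still run
			continue
		}
		c.Delivered++
	}
	return c, nil
}

// failFor returns a hook that fails only for the given user.
func failFor(user string) func(Row) error {
	return func(r Row) error {
		if r.User == user {
			return errors.New("simulated failure for " + user)
		}
		return nil
	}
}

// succeed is a hook that never fails.
func succeed(Row) error { return nil }

func main() {
	rows := []Row{{"alice", "email"}, {"bob", "email"}, {"carol", "email"}}

	// Provider error on bob: counted as failed, carol is still attempted.
	c, err := fanOut(rows, succeed, failFor("bob"))
	fmt.Println(c, err) // {2 0 1} <nil>

	// Storage error on bob: the call aborts before carol is touched.
	c, err = fanOut(rows, failFor("bob"), succeed)
	fmt.Println(c, err) // {1 0 0} store bob/email: simulated failure for bob
}
```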
Producer-side strategy: log non-zero
failed for ops alerts; do not retry the whole call
automatically — the idempotency key ensures the stored rows aren't
duplicated, but it does not re-attempt the failed provider sends.
To re-send only the failed channel for the affected users, mint a
fresh notification_id with the narrowed audience.