Last updated 2026-05-28

Multi-Channel Fanout

One Notify call, many recipients, many channels per recipient. The orchestrator stores one row per (user, channel), attempts immediate delivery per row, and returns aggregate counts so the producer knows what landed.

When you'd use this

Business events that deserve every available rail: incident pages, onboarding milestones, multi-recipient activity events ("Bob commented on Task 42 — page Alice and Carol via in-app + email + SMS").

The wire call

curl -sX POST http://notify:8081/elloloop.notify.v1.NotificationInternalService/Notify \
  -H 'Content-Type: application/json' \
  -H "X-Notify-Internal-Token: $NOTIFY_INTERNAL_TOKEN" \
  -d '{
    "tenantId": "acme",
    "notificationId": "incident-2026-05-27-001",
    "userIds": ["alice", "bob", "carol"],
    "channels": [
      "DELIVERY_CHANNEL_IN_APP",
      "DELIVERY_CHANNEL_EMAIL",
      "DELIVERY_CHANNEL_SMS"
    ],
    "subjectRef": "incident:i-1287",
    "subjectType": "incident",
    "title": "P1: checkout service unavailable",
    "body": "Pagerduty incident i-1287 fired at 05:14 UTC. Runbook: https://example.com/rb",
    "addresses": {
      "alice": {
        "byChannel": {
          "email": "alice@example.com",
          "sms":   "+15555550199"
        }
      },
      "bob": {
        "byChannel": {
          "email": "bob@example.com"
        }
      },
      "carol": {
        "byChannel": {
          "sms": "+15555550158"
        }
      }
    }
  }'

What happens, row by row

User	Channel	Address resolved?	Provider configured?	Result
alice	in_app	yes (user id)	yes (auto)	delivered if live, else pending
alice	email	yes (alice@example.com)	yes	delivered
alice	sms	yes (+15555550199)	yes	delivered
bob	in_app	yes (user id)	yes (auto)	delivered if live, else pending
bob	email	yes (bob@example.com)	yes	delivered
bob	sms	no	yes	pending (no address)
carol	in_app	yes (user id)	yes (auto)	delivered if live, else pending
carol	email	no	yes	pending (no address)
carol	sms	yes (+15555550158)	yes	delivered

The orchestrator stores all nine rows (3 users × 3 channels) and reports a NotifyResponse like:

{"delivered": 5, "pending": 4, "failed": 0}

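Two of the four pending rows are "no address" (bob/sms and carol/email); the other two are in-app rows for users who were offline, which means exactly one of the three users had a live connection when this example ran. Both kinds are recoverable: the missing addresses can be supplied in a future retry, and the in-app rows will surface via GetNotifications on the next StreamEvents open.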
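The row-by-row rules above can be modelled in a few lines. This is a minimal sketch rather than the orchestrator's actual code: the Row, Outcome, and Counts types and the classify and tally helpers are hypothetical names, and it assumes alice is the one user with a live in-app connection.

```go
package main

// Outcome mirrors the three buckets reported in NotifyResponse.
type Outcome int

const (
	Delivered Outcome = iota
	Pending
	Failed
)

// Row is a hypothetical stand-in for one stored (user, channel) row.
type Row struct {
	User, Channel string
	HasAddress    bool // a destination was resolved for this channel
	Live          bool // in_app only: the user has an open stream
	ProviderOK    bool // the provider send succeeded, if one was attempted
}

// classify applies the table's rules to a single row.
func classify(r Row) Outcome {
	switch {
	case !r.HasAddress:
		return Pending // nothing to send to yet
	case r.Channel == "in_app" && !r.Live:
		return Pending // picked up later through the inbox
	case !r.ProviderOK:
		return Failed
	default:
		return Delivered
	}
}

// Counts is a hypothetical mirror of the aggregate response.
type Counts struct{ Delivered, Pending, Failed int }

// tally classifies every row and sums the outcomes.
func tally(rows []Row) Counts {
	var c Counts
	for _, r := range rows {
		switch classify(r) {
		case Delivered:
			c.Delivered++
		case Pending:
			c.Pending++
		case Failed:
			c.Failed++
		}
	}
	return c
}

// row builds a Row whose provider send is assumed to succeed.
func row(user, channel string, hasAddress, live bool) Row {
	return Row{User: user, Channel: channel, HasAddress: hasAddress, Live: live, ProviderOK: true}
}

// exampleRows reproduces the nine rows above, assuming only alice is live.
func exampleRows() []Row {
	return []Row{
		row("alice", "in_app", true, true),
		row("alice", "email", true, false),
		row("alice", "sms", true, false),
		row("bob", "in_app", true, false),
		row("bob", "email", true, false),
		row("bob", "sms", false, false),
		row("carol", "in_app", true, false),
		row("carol", "email", false, false),
		row("carol", "sms", true, false),
	}
}
```

Calling tally(exampleRows()) under these assumptions gives five delivered, four pending, and zero failed, matching the response shown above.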

Idempotency across fanout

The single notification_id covers all nine rows. A second call with the same (tenant, notification_id) but a wider channel list will not create new rows for the already-stored (user, channel) pairs: the Store's idempotency key is (tenant, user, notification_id), and that triple matches per row. For those rows, the second call's NotifyResponse reports pending (skipped at the storage layer), or the orchestrator re-delivers via the provider if one is now available where it wasn't before.

The practical recommendation: choose the channel list once, at the producer, when the event is generated. Retries should not change the channel set; if you need to add a channel, mint a new notification_id for the addition.
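One way to follow that recommendation is to derive the new notification_id deterministically from the original id and the channels being added, so that retries of the addition reuse the same id. The sketch below is illustrative only: the additionID helper and its "-add-" naming scheme are assumptions, not a platform convention, and it assumes notification ids are opaque strings.

```go
package main

import (
	"sort"
	"strings"
)

// additionID derives a stable notification_id for channels added after the
// original fanout. The "-add-" scheme is a hypothetical convention.
func additionID(baseID string, added []string) string {
	// Copy before sorting so the caller's slice is left untouched.
	chans := append([]string(nil), added...)
	// Sorting makes the id independent of the order the channels were listed in.
	sort.Strings(chans)
	return baseID + "-add-" + strings.Join(chans, "-")
}
```

For example, additionID("incident-2026-05-27-001", []string{"sms"}) yields "incident-2026-05-27-001-add-sms", which can be sent as its own Notify call with only the added channel.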

Empty channels = all enabled

Omitting channels tells the orchestrator to use every channel that has both an active provider AND a destination for the user. For the example above with all three providers configured, leaving channels off produces the same nine rows.

The difference matters when a provider is disabled: with channels=[in_app, email, sms] on a deployment where SMS is disabled, the SMS rows are stored as pending (recovery is "wire a Twilio provider, re-run"). With empty channels on the same deployment, SMS isn't part of the effective set at all — no row stored, no recovery story needed.
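The two behaviours can be sketched as a single function that computes the effective channel set for one user. This is a model of the rule as described here, not the orchestrator's code: effectiveChannels, allChannels, and the short channel keys are hypothetical names.

```go
package main

// allChannels lists the channels the sketch knows about, in a fixed order.
var allChannels = []string{"in_app", "email", "sms"}

// effectiveChannels returns the channels that get a stored row for one user.
//
//	requested: channels from the request; empty means "all enabled"
//	providers: channel -> a provider is active on this deployment
//	hasDest:   channel -> this user has a destination for the channel
func effectiveChannels(requested []string, providers, hasDest map[string]bool) []string {
	if len(requested) > 0 {
		// Explicit list: every requested channel gets a row, even when the
		// row can only be stored as pending (for example, provider disabled).
		return append([]string(nil), requested...)
	}
	// Empty list: keep only channels that are usable right now.
	var out []string
	for _, ch := range allChannels {
		if providers[ch] && hasDest[ch] {
			out = append(out, ch)
		}
	}
	return out
}
```

With SMS disabled and a user who has every address, an explicit three-channel request still yields three rows (the SMS one pending), while an empty request yields only in_app and email.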

Recovery via the inbox

A user who logs in after the fanout opens GetNotifications and sees every row addressed to them, regardless of delivery outcome. The in-app row with StatusPending looks the same in the UI as a delivered row; it just hasn't been pushed live yet.

Producer-side shape

For a real producer with a user-store integration, the addresses map is usually built from a single query over your user table:

// sendIncidentPage fans one incident notification out to every user over
// in-app, email, and SMS. userStore is assumed to be a package-level contact
// lookup provided by your application.
func sendIncidentPage(ctx context.Context, c *NotifyClient, incidentID string, userIDs []string) error {
    contacts, err := userStore.LookupContacts(ctx, userIDs)
    if err != nil {
        return fmt.Errorf("lookup contacts: %w", err)
    }

    // Build the per-user address map; users with no known destination are left out.
    addrs := make(map[string]*notifyv1.ChannelAddresses, len(contacts))
    for uid, contact := range contacts { // named contact so it does not shadow the client c
        m := map[string]string{}
        if contact.Email != "" {
            m["email"] = contact.Email
        }
        if contact.Phone != "" {
            m["sms"] = contact.Phone
        }
        if len(m) > 0 {
            addrs[uid] = &notifyv1.ChannelAddresses{ByChannel: m}
        }
    }

    // Deriving the notification_id from the incident id keeps retries idempotent.
    _, err = c.Notify(ctx, &notifyv1.NotifyRequest{
        TenantId:       "acme",
        NotificationId: "incident-" + incidentID,
        UserIds:        userIDs,
        Channels: []notifyv1.DeliveryChannel{
            notifyv1.DeliveryChannel_DELIVERY_CHANNEL_IN_APP,
            notifyv1.DeliveryChannel_DELIVERY_CHANNEL_EMAIL,
            notifyv1.DeliveryChannel_DELIVERY_CHANNEL_SMS,
        },
        SubjectRef:  "incident:" + incidentID,
        SubjectType: "incident",
        Title:       "P1: checkout service unavailable",
        Body:        "Pagerduty incident " + incidentID + " fired.",
        Addresses:   addrs,
    })
    return err
}

Partial failure

A provider error on one (user, channel) row records that row as StatusFailed and increments NotifyResponse.failed; the rest of the fanout continues. Storage errors abort the call immediately, because the platform cannot guarantee that the remaining rows would be stored either.

Producer-side strategy: log non-zero failed for ops alerts; do not retry the whole call automatically — the idempotency key ensures the stored rows aren't duplicated, but it does not re-attempt the failed provider sends. To re-send only the failed channel for the affected users, mint a fresh notification_id with the narrowed audience.
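The "log, don't retry" half of that strategy might look like the following. It is a sketch under stated assumptions: NotifyResponse here is a local, hypothetical mirror of the aggregate counts rather than the generated type, and failureAlert is an illustrative helper name.

```go
package main

import "fmt"

// NotifyResponse is a hypothetical local mirror of the aggregate counts.
type NotifyResponse struct{ Delivered, Pending, Failed int }

// failureAlert returns an ops alert line when any provider send failed,
// and the empty string otherwise. It never triggers a retry itself.
func failureAlert(notificationID string, resp NotifyResponse) string {
	if resp.Failed == 0 {
		return ""
	}
	total := resp.Delivered + resp.Pending + resp.Failed
	return fmt.Sprintf(
		"notify %s: %d of %d rows failed at the provider; not retrying automatically",
		notificationID, resp.Failed, total,
	)
}
```

A producer would pass the returned line to its own logger or alerting hook when it is non-empty, and leave any targeted re-send to a separate call with a fresh notification_id.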