Last updated 2026-05-28

EntDB store

The EntDB driver maps notify's Store contract onto tenant-shard-db's SDK. v0.1 targets v2 schema-aware mode (ADR-031) and pins to v2.0.5 — the v2 server materializes the notify schema from proto descriptors and enforces composite uniqueness, which is what makes the concurrent-create conformance test pass without a service-level mutex.

When you'd use this

Pick EntDB if you already run tenant-shard-db for another service (e.g. identity) and want notify to share its operational practices: the same image management, WAL pipeline, and scaling approach. For greenfield deployments, the Postgres driver is the more common starting point.

Configuration

-e NOTIFY_STORE_DRIVER=entdb \
-e NOTIFY_ENTDB_ADDRESS=entdb.internal:50051 \
-e NOTIFY_ENTDB_TENANT_ID=notify-prod

NOTIFY_ENTDB_TENANT_ID is the EntDB storage shard — one notify container talks to one EntDB tenant. All notify-application tenants live under that one EntDB tenant, differentiated by the tenant_id field on each row.
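As a rough illustration of how these three variables might be read and validated at boot, here is a minimal Go sketch. The entdbConfig struct, the loadEntDBConfig function, and the validation rules are assumptions made for this example, not notify's actual config code; only the environment variable names come from the settings above.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"os"
)

// entdbConfig holds the EntDB settings shown above. The type and its field
// names are hypothetical, not notify's real config struct.
type entdbConfig struct {
	Address  string
	TenantID string
}

// loadEntDBConfig reads the settings through getenv so it can be exercised
// without touching the process environment. The checks are illustrative.
func loadEntDBConfig(getenv func(string) string) (entdbConfig, error) {
	if d := getenv("NOTIFY_STORE_DRIVER"); d != "entdb" {
		return entdbConfig{}, fmt.Errorf("NOTIFY_STORE_DRIVER=%q, want \"entdb\"", d)
	}
	cfg := entdbConfig{
		Address:  getenv("NOTIFY_ENTDB_ADDRESS"),
		TenantID: getenv("NOTIFY_ENTDB_TENANT_ID"),
	}
	// The address must look like host:port, e.g. entdb.internal:50051.
	if _, _, err := net.SplitHostPort(cfg.Address); err != nil {
		return entdbConfig{}, fmt.Errorf("NOTIFY_ENTDB_ADDRESS: %w", err)
	}
	// One container talks to exactly one EntDB tenant, so it must be named.
	if cfg.TenantID == "" {
		return entdbConfig{}, errors.New("NOTIFY_ENTDB_TENANT_ID must be set")
	}
	return cfg, nil
}

func main() {
	cfg, err := loadEntDBConfig(os.Getenv)
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Printf("EntDB tenant %s at %s\n", cfg.TenantID, cfg.Address)
}
```

Passing getenv as a function keeps the loader testable with a plain map; application-level tenants never appear here because they are carried per row in tenant_id.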

Schema (proto-first)

The schema is declared as proto messages in proto/entdb_notify/notify.proto. type_id values are local to notify's tenant-shard-db instance (notify runs its own EntDB deployment), so they do not need to be coordinated with identity's.

proto/entdb_notify/notify.proto
message UserNotification {
  option (entdb.node) = {
    type_id: 1
    data_policy: DATA_POLICY_PERSONAL
    subject_field: "user_id"
    description: "Per-user copy of one notification"
  };
  string notification_id = 1 [(entdb.field) = { required: true, indexed: true }];
  string tenant_id = 2 [(entdb.field) = { required: true, indexed: true }];
  string user_id = 3 [(entdb.field) = { required: true, indexed: true }];
  string subject_ref = 4;
  string subject_type = 5;
  string title = 6;
  string body = 7;
  string channel = 8;
  string delivery_status = 9 [(entdb.field) = { required: true, indexed: true }];
  int64 created_at_ms = 10 [(entdb.field) = { kind: "timestamp", indexed: true }];
  int64 delivered_at_ms = 11 [(entdb.field) = { kind: "timestamp" }];
  int64 ack_at_ms = 12 [(entdb.field) = { kind: "timestamp" }];
  int64 read_at_ms = 13 [(entdb.field) = { kind: "timestamp" }];
  // composite_key covers (tenant_id, user_id, notification_id); the driver
  // length-prefix encodes each part rather than joining them with "|".
  string composite_key = 14 [(entdb.field) = { required: true, indexed: true, unique: true }];
}
message DeviceRegistration {
  option (entdb.node) = {
    type_id: 2
    data_policy: DATA_POLICY_PERSONAL
    subject_field: "user_id"
    description: "A registered push endpoint for one user on one device type"
  };
  string tenant_id = 1 [(entdb.field) = { required: true, indexed: true }];
  string user_id = 2 [(entdb.field) = { required: true, indexed: true }];
  string device_type = 3 [(entdb.field) = { required: true }];
  string token = 4 [(entdb.field) = { required: true }];
  int64 created_at_ms = 5 [(entdb.field) = { kind: "timestamp" }];
  int64 last_active_ms = 6 [(entdb.field) = { kind: "timestamp" }];
  string composite_key = 7 [(entdb.field) = { required: true, indexed: true, unique: true }];
}

The composite_key field on each node backs the unique index that the driver queries via sdk.GetByKey for its idempotency guard. The v2 server enforces the unique: true annotation, which closes the composite-uniqueness race that the v1.32.1 canary tests reproduced.

composite_key is length-prefix encoded

Why this design. The composite key string is built by length-prefix encoding each part rather than concatenating with a separator. A naive separator join is ambiguous whenever a part contains the separator: user id "u1|n1" with notification id "n2" and user id "u1" with notification id "n1|n2" both serialize to "u1|n1|n2". The driver's lpEncode helper writes <len>:<part> for each piece, so the serialization is injective. The KeyEdge/NotificationID_SeparatorBytesDoNotCollide conformance test pins this invariant; a driver that gets it wrong would silently treat two different rows as the same.
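To make the collision concrete, the following Go sketch compares a separator join with a length-prefix encoding. lpEncodeSketch is a hypothetical reimplementation written for this page: it follows the <len>:<part> shape described above, but details such as using the byte length in decimal are assumptions, and the driver's real lpEncode may differ.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// naiveJoin is the separator scheme the driver avoids.
func naiveJoin(parts ...string) string {
	return strings.Join(parts, "|")
}

// lpEncodeSketch writes "<len>:<part>" for each part, where <len> is the
// byte length in decimal. Hypothetical; not the driver's actual helper.
func lpEncodeSketch(parts ...string) string {
	var b strings.Builder
	for _, p := range parts {
		b.WriteString(strconv.Itoa(len(p)))
		b.WriteByte(':')
		b.WriteString(p)
	}
	return b.String()
}

func main() {
	// Two distinct (tenant, user, notification) triples.
	a := []string{"t1", "u1|n1", "n2"}
	b := []string{"t1", "u1", "n1|n2"}

	// The separator join cannot tell them apart.
	fmt.Println("naive collides:", naiveJoin(a...) == naiveJoin(b...))

	// Length prefixes fix where each part ends, so the keys differ.
	fmt.Println("length-prefix collides:", lpEncodeSketch(a...) == lpEncodeSketch(b...))
	fmt.Println(lpEncodeSketch(a...))
	fmt.Println(lpEncodeSketch(b...))
}
```

Because a decoder always knows how many bytes to read after each colon, no choice of part contents can make two different tuples produce the same string.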

Schema-aware mode + composite uniqueness

The container wires the SDK with the registered schema at boot:

client, err := sdk.NewClient(
	cfg.Store.EntDBAddress,
	sdk.WithSchema(entdb.SchemaMessages()...),
)

entdb.SchemaMessages() returns the two proto descriptors declared above. The v2 server materializes the schema from those descriptors on the first ExecuteAtomic call and enforces every annotation including unique: true. Concurrent CreateNotification calls with the same composite_key now produce exactly one row.

The same-key race resolution

The Go SDK's Plan.Commit does not expose wait_applied today (upstream item #U1, documented in store/entdb/CONFORMANCE.md), so the loser of a unique-constraint race receives a success=true receipt with a pre-allocated node id that never actually lands. The driver reconciles in code:

// commitCreateUnique in helpers.go (abridged; error handling elided)
submittedID, _ := commitCreateNoWait(ctx, actor, msg)    // server pre-allocates a UUID
canonical, _ := findByKeyWithRetry(ctx, actor, key, val) // bounded sdk.GetByKey poll
won := canonical == submittedID                          // winner: ids match; loser: ids differ

CreateNotification returns (canonicalID, won) so the caller learns whether it was the actual writer. In the loser branch, UpsertDevice additionally runs a no-wait UpdateFields against the canonical row so that each racer's token / last_active_ms gets a chance to land (preserving the memory driver's "last writer wins" semantics).

Run conditions

  • Image: ghcr.io/elloloop/tenant-shard-db:2.0.5 (pin in docker-compose or your Helm chart).
  • SDK: github.com/elloloop/tenant-shard-db/sdk/go/entdb/v2 v2.0.5.
  • Server flags: -addr=:50051 -data-dir=/tmp/entdb -wal-backend=memory (-data-dir is required on v2).
  • Schema-aware mode enabled via sdk.WithSchema(entdb.SchemaMessages()...).

Conformance results

24/24 PASS on v2.0.5 — including the two composite-uniqueness canaries that were intentionally red on v1.32.1 (they flipped when the v2 schema-aware unique enforcement landed). See store/entdb/CONFORMANCE.md for the per-subtest table.

Running the suite locally

# Boot a local EntDB on a free port.
docker run -d --rm --name notify-entdb-test -p 50061:50051 \
  ghcr.io/elloloop/tenant-shard-db:2.0.5 \
  -addr=:50051 -data-dir=/tmp/entdb -wal-backend=memory

# Run the suite (build tag gates real-entdb tests).
NOTIFY_ENTDB_ADDRESS=localhost:50061 \
  go test -tags=realentdb ./store/entdb/... -race -count=1 -v -timeout=5m

docker stop notify-entdb-test

What the suite does NOT cover

Intentional omissions documented for transparency:

  • Cross-tenant ACL. The suite uses a fresh tenant per subtest, so it never observes cross-tenant data. The driver hard-codes system:notify as the actor for every read and write; cross-tenant isolation is enforced by the tenantID argument the driver passes to the SDK transport.
  • Long-lived connections. The SDK client is reused across subtests; reconnection behaviour is tested upstream.
  • Server crash / WAL replay. The driver assumes the server is available; failover is the operator's concern.

Related