Skip to content
Deep Dive

AI Agent JetStream Broadcast Semantics

A shared durable can quietly turn a broadcast signal into one-replica delivery.

Deep Dive10 min readMar 2026
TL;DR
  • -Queue semantics and broadcast semantics must be explicit. Mixing them silently drops signals in HA setups.
  • -Cordum returns empty durable name for broadcast subscriptions (`queue==""`) so each replica gets its own ephemeral consumer.
  • -For queue subscriptions, Cordum generates shared durable names and uses queue groups for one-of-N delivery.
  • -This pattern is validated in repo tests for both fanout and queue-group paths.
Broadcast safety

Empty queue produces ephemeral consumer per replica, preserving fanout behavior.

Queue balancing

Non-empty queue gets stable durable name, then JetStream queue subscription handles one-of-N delivery.

Common pitfall

Shared durable on a broadcast subject routes each message to a single replica, not all replicas.

Scope

This guide focuses on JetStream subscription semantics in the Cordum bus layer. It does not cover topic taxonomy design or worker business logic.

The production problem

Teams often expect broadcast behavior and queue behavior from the same subject without encoding that intent in subscription setup.

In an HA scheduler deployment, that mistake can hide in plain sight: one replica sees every signal, the others see none, and no exception is thrown.

For AI agent control planes, this can break alert fanout, progress streams, or control signals exactly when you need multi-replica visibility.

What top results miss

SourceStrong coverageMissing piece
NATS queue groups docsQueue-group load balancing where one consumer in group handles each message.No app-level durable-name strategy for distinguishing queue and broadcast subscriptions.
NATS JetStream consumers docsDurable vs ephemeral consumers and flow-control behavior.No concrete HA scheduler pattern to avoid accidental one-replica delivery on broadcast channels.
RabbitMQ exchanges docsFanout exchange semantics: copy to every bound queue.No JetStream queue/durable naming split for mixed fanout and competing-consumer paths.

Public docs explain primitives. The gap is implementation discipline for mixed semantics in real scheduler replicas.

Cordum runtime behavior

BoundaryCurrent behaviorOperational impact
Durable subject gateCordum enables JetStream durability for submit/result/DLQ/audit, `job.*`, and `worker.*.jobs`.Critical delivery paths persist; informational subjects stay on core NATS.
Broadcast durable name`durableName(subject, queue)` returns empty when `queue` is empty.Each replica gets an ephemeral consumer and receives broadcast messages.
Queue durable nameNon-empty queue yields deterministic durable name `dur_<queue>__<subject>`.Replicas in same queue group share one consumer state and one-of-N delivery.
Subscribe branchEmpty queue uses `js.Subscribe`; queue mode uses `js.QueueSubscribe`.Delivery semantics are decided by one explicit parameter.
Broadcast test`TestBroadcastWithJetStream` asserts two replicas each receive the same DLQ message.Protects against regressions that collapse fanout into one-replica delivery.
Queue-group test`TestQueueGroupWithJetStream` verifies no duplicate total deliveries across replicas.Confirms work-distribution mode stays load-balanced and non-duplicating.

Code-level mechanics

Durable-name split by queue intent (Go)

core/infra/bus/nats.go
Go
func durableName(subject, queue string) string {
  name := sanitize(subject)
  if name == "" {
    return ""
  }
  if queue == "" {
    // broadcast: ephemeral consumer per replica
    return ""
  }
  q := sanitize(queue)
  if q == "" {
    return "dur_" + name
  }
  return "dur_" + q + "__" + name
}

Subscribe path selects semantics (Go)

core/infra/bus/nats.go
Go
opts := []nats.SubOpt{
  nats.ManualAck(),
  nats.AckExplicit(),
  nats.AckWait(b.ackWait),
  nats.MaxAckPending(natsMaxAckPending()),
  nats.MaxDeliver(maxJSRedeliveries),
}

if durable := durableName(subject, queue); durable != "" {
  opts = append(opts, nats.Durable(durable))
}

if queue == "" {
  sub, err = b.js.Subscribe(subject, cb, opts...)
} else {
  sub, err = b.js.QueueSubscribe(subject, queue, cb, opts...)
}

Tests that pin expected behavior (Go)

core/infra/bus/nats_test.go
Go
func TestDurableName(t *testing.T) {
  // broadcast with empty queue must be ephemeral
  if got := durableName("job.test.*", ""); got != "" {
    t.Fatalf("expected empty durable name for broadcast, got %s", got)
  }
}

func TestBroadcastWithJetStream(t *testing.T) {
  // two replicas subscribe with empty queue
  // publish one DLQ message
  // assert both replicas receive it
}

func TestQueueGroupWithJetStream(t *testing.T) {
  // queue group mode: total deliveries == published messages (no duplication)
}

The tests are the real guardrail. If someone simplifies durable naming later, CI should fail before rollout.

Operator runbook

broadcast_semantics_runbook.sh
Bash
# 1) Verify subscription intent by subject class
#    - Broadcast signal? use empty queue
#    - Work distribution? use non-empty queue group

# 2) Run bus semantics regression tests
cd D:/Cordum/cordum
go test ./core/infra/bus -run "TestDurableName|TestBroadcastWithJetStream|TestQueueGroupWithJetStream"

# 3) HA smoke test (broadcast path)
#    - start 2 scheduler replicas
#    - publish one DLQ event
#    - assert both replicas log handler execution

# 4) Queue path smoke test
#    - publish N job messages
#    - assert total processed == N (not 2N)

# 5) During rollout, monitor for:
#    - sudden drop in broadcast handler invocations per replica
#    - unexpected single-replica ownership of broadcast-only signals

Limitations and tradeoffs

  • - Ephemeral broadcast consumers can miss messages during disconnect/reconnect windows.
  • - Durable queue consumers favor load distribution, not per-replica observability.
  • - Semantics bugs are configuration bugs; they rarely fail loudly.
  • - Tests reduce risk, but rollout smoke checks are still required in multi-replica clusters.

If a subject must be seen by every replica, never bind it to a shared queue group. Model it as broadcast on purpose.

Next step

Audit your current subscriptions by intent:

  1. 1. Mark each subject as broadcast or queue-distributed.
  2. 2. Confirm queue parameter matches that intent in code.
  3. 3. Add a two-replica smoke test for at least one subject in each class.
  4. 4. Fail CI if those semantics tests regress.

Continue with MaxAckPending Tuning and AckWait and Dedup TTL Alignment.

Semantics over syntax

Queue and broadcast look similar in code until production traffic proves they are not.