RabbitMQ
RabbitMQ is an open-source message broker implementing AMQP 0-9-1 (and native extensions for MQTT / STOMP / Streams), originally from Rabbit Technologies (acquired by VMware, which is now part of Broadcom). It is widely deployed as a general-purpose durable work-queue / pub-sub fabric with rich exchange-and-binding routing, per-message acknowledgement, dead-letter queues, and native clustering.
Properties relevant to system design
- Exchange + binding routing — direct / topic / fanout / headers exchange types; flexible many-to-many producer-consumer wiring.
- Push-based delivery — the broker pushes messages to consumers holding unfilled prefetch windows; consumers don't peek, they receive.
- Prefetch = request-count window: a consumer's prefetch_count is the classic knob for how many messages may be in flight un-acked. AMQP 0-9-1 also defines a byte-based prefetch_size, but RabbitMQ does not implement it. Neither is a token-count or payload-attribute budget.
- Per-message ack / nack with redelivery: at-least-once delivery; redelivery on nack or consumer disconnect.
- Quorum queues / mirrored queues / Streams: durability and high-availability modes. Classic mirrored queues were deprecated in favour of quorum queues and removed in RabbitMQ 4.0.
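The interaction between push delivery, the prefetch window, and ack / nack can be illustrated with a toy in-memory model. This is only a sketch: `ToyBroker` and its methods are hypothetical names, not RabbitMQ's implementation or any client API, and requeueing to the head of the queue is a simplification.

```python
from collections import deque


class ToyBroker:
    """In-memory stand-in for one queue with one consumer (hypothetical)."""

    def __init__(self, prefetch_count):
        self.prefetch_count = prefetch_count
        self.ready = deque()   # messages waiting in the queue
        self.unacked = {}      # delivery tag -> message currently in flight
        self.next_tag = 1

    def publish(self, message):
        self.ready.append(message)

    def push(self):
        """Deliver messages until the un-acked window is full."""
        delivered = []
        while self.ready and len(self.unacked) < self.prefetch_count:
            tag = self.next_tag
            self.next_tag += 1
            self.unacked[tag] = self.ready.popleft()
            delivered.append((tag, self.unacked[tag]))
        return delivered

    def ack(self, tag):
        # Acknowledging frees one slot in the window.
        del self.unacked[tag]

    def nack(self, tag):
        # Negative ack with requeue: the message becomes deliverable again.
        self.ready.appendleft(self.unacked.pop(tag))


broker = ToyBroker(prefetch_count=2)
for body in ["a", "b", "c", "d"]:
    broker.publish(body)

first = broker.push()      # window holds 2, so only "a" and "b" are pushed
broker.ack(first[0][0])    # ack "a": one slot opens
second = broker.push()     # "c" is pushed into the free slot
broker.nack(first[1][0])   # nack "b": requeued for redelivery
third = broker.push()      # "b" is redelivered under a new delivery tag
```

The window counts messages only; nothing in the model (or in the real knob) looks at what each message contains.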
Role on the wiki
RabbitMQ is a canonical example of a message broker whose native batching knobs are not sufficient for application-specific compute-batching disciplines such as token-count batching for GPU inference. The 2025-12-18 Voyage AI post names three specific gaps:
- Request-count prefetch — doesn't compose with a per-payload token count.
- Push delivery — consumers can't peek and selectively claim by a caller-computed budget.
- No atomic peek + conditional claim primitive — the token-count-batching scheduler's required single-step operation doesn't exist.
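To make the missing primitive concrete, here is a minimal in-memory sketch of an atomic peek + conditional claim by token budget. `TokenBudgetQueue` and `claim_batch` are hypothetical names, a process-local lock stands in for whatever atomicity a shared store would provide, and the rule that an oversized first item is claimed alone is an assumption made to avoid starvation.

```python
import threading
from collections import deque


class TokenBudgetQueue:
    """Hypothetical queue illustrating the primitive; not a RabbitMQ API."""

    def __init__(self):
        self._items = deque()          # (request_id, token_count), arrival order
        self._lock = threading.Lock()

    def put(self, request_id, token_count):
        with self._lock:
            self._items.append((request_id, token_count))

    def claim_batch(self, max_tokens):
        """Peek at the head and claim items while the token sum fits."""
        with self._lock:               # peek and claim happen as one step
            batch, total = [], 0
            while self._items:
                _, tokens = self._items[0]             # peek without removing
                if batch and total + tokens > max_tokens:
                    break                              # next item would overflow
                batch.append(self._items.popleft())    # conditional claim
                total += tokens
            return batch, total


q = TokenBudgetQueue()
for rid, n in [("r1", 300), ("r2", 500), ("r3", 400), ("r4", 100)]:
    q.put(rid, n)

batch1, total1 = q.claim_batch(max_tokens=1000)
batch2, total2 = q.claim_batch(max_tokens=1000)
```

The first call claims `r1` and `r2` (800 tokens) and leaves `r3` in place because adding it would exceed the budget; a request-count prefetch of 3 would have delivered all three regardless.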
Both practical workarounds are on the wiki as separate patterns:
- patterns/lightweight-aggregator-in-front-of-broker — insert a small service between the broker and workers that consumes at broker-native semantics and batches inside.
- patterns/atomic-conditional-batch-claim — use a store like Redis with Lua instead, which natively supports the primitive.
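The first workaround, the aggregator, might look roughly like the sketch below: it accepts messages one at a time, as a broker would push them, and regroups them by token budget before handing batches to workers. `TokenAggregator`, `on_message`, and `dispatch` are hypothetical names; a real aggregator would presumably also flush on a timeout and ack broker messages only after dispatch, both of which are omitted here.

```python
class TokenAggregator:
    """Hypothetical aggregator that batches pushed messages by token budget."""

    def __init__(self, max_tokens, dispatch):
        self.max_tokens = max_tokens
        self.dispatch = dispatch       # callable taking a list of (payload, tokens)
        self.buffer = []
        self.buffered_tokens = 0

    def on_message(self, payload, token_count):
        """Called once per message delivered by the broker."""
        if self.buffer and self.buffered_tokens + token_count > self.max_tokens:
            self.flush()               # adding this message would overflow
        self.buffer.append((payload, token_count))
        self.buffered_tokens += token_count

    def flush(self):
        """Send the current buffer downstream and start a new one."""
        if self.buffer:
            self.dispatch(self.buffer)
            self.buffer = []
            self.buffered_tokens = 0


batches = []
agg = TokenAggregator(max_tokens=1000, dispatch=batches.append)
for payload, tokens in [("a", 600), ("b", 300), ("c", 200), ("d", 900)]:
    agg.on_message(payload, tokens)
agg.flush()                            # drain whatever is left at shutdown
```

The broker side keeps its native request-count semantics; only the aggregator knows about tokens.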
Seen in
- 2025-12-18 Voyage AI / MongoDB — Token-count-based batching — "RabbitMQ's prefetch is request-count-based, and messages are pushed to consumers, so there's no efficient way to peek and batch requests by Σ token_count_i." Named alongside Kafka as the two general-purpose brokers whose batching semantics don't fit token-count batching natively. (sources/2025-12-18-mongodb-token-count-based-batching-faster-cheaper-embedding-inference)
- 2024-04-22 Zalando: Enhancing Distributed System Load Shedding with TCP Congestion Control Algorithm. RabbitMQ serves as both the publisher-saturation signal substrate and the internal backbone of the Zalando Communication Platform. The platform routes customer-communication work (order confirmations, marketing pushes, brand alerts) through RabbitMQ between its microservices. Under load, the broker's own flow control slows publishers; Zalando reads this via P50 publish latency and publish-exception count and feeds it into per-event-type AIMD throttles at the Stream Consumer. The post explicitly cites the operational rationale, "with a smaller queue size in RabbitMQ we follow best practices", making it a production instance of RabbitMQ's queue-depth-as-performance-risk property driving architectural choices one layer upstream: shed at ingestion (concepts/load-shedding-at-ingestion) so RabbitMQ queues stay light. It also names RabbitMQ's back-pressure mechanism directly: "RabbitMQ is able to apply back-pressure when slow consumers are detected... RabbitMQ will slow down the publish rate which the publisher will experience in the increase in the publish time." See concepts/publish-latency-as-congestion-signal, patterns/aimd-ingestion-rate-control.
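An AIMD (additive-increase, multiplicative-decrease) throttle driven by publish latency and publish exceptions can be sketched as follows. This is not Zalando's code: the class name, the parameters, and every numeric value are assumptions chosen for illustration, and the post's actual thresholds are not reproduced here.

```python
class AimdThrottle:
    """Hypothetical per-event-type ingestion rate controller."""

    def __init__(self, initial_rate, min_rate, max_rate,
                 increase_step, decrease_factor, latency_threshold_ms):
        self.rate = initial_rate
        self.min_rate = min_rate
        self.max_rate = max_rate
        self.increase_step = increase_step        # additive increase per interval
        self.decrease_factor = decrease_factor    # multiplicative decrease, < 1
        self.latency_threshold_ms = latency_threshold_ms

    def update(self, p50_publish_latency_ms, publish_exceptions):
        """Adjust the allowed rate from one interval's congestion signals."""
        congested = (p50_publish_latency_ms > self.latency_threshold_ms
                     or publish_exceptions > 0)
        if congested:
            self.rate = max(self.min_rate, self.rate * self.decrease_factor)
        else:
            self.rate = min(self.max_rate, self.rate + self.increase_step)
        return self.rate


throttle = AimdThrottle(initial_rate=100.0, min_rate=10.0, max_rate=200.0,
                        increase_step=10.0, decrease_factor=0.5,
                        latency_threshold_ms=50.0)
rates = [
    throttle.update(20.0, 0),   # healthy interval: additive increase
    throttle.update(80.0, 0),   # slow publishes: multiplicative decrease
    throttle.update(20.0, 3),   # publish exceptions: decrease again
    throttle.update(20.0, 0),   # healthy again: additive increase
]
```

Cutting the rate sharply and recovering slowly keeps ingestion below the point where the broker's back-pressure engages, which is what lets the queues stay short.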
Stub — no deeper RabbitMQ-internals source yet ingested.