PATTERN
Async-buffered Kafka produce¶
Shape¶
Writes from the hot request path land in an in-process buffer (a Go channel, a JVM blocking queue, etc.) and return immediately. A background worker (goroutine, thread, async task) drains the buffer and flushes to Kafka on whichever comes first of:
- A time-based flush interval (Hotstar: 500 ms).
- A size-based batch ceiling (Hotstar: 20,000 messages).
HTTP handler ──► chan msg ──► [goroutine producer] ──► Kafka broker
(returns 200                  │ flush every 500 ms
 immediately)                 │ OR every 20K msgs
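A minimal Go sketch of this shape, assuming a hypothetical msg type and a flushToKafka callback standing in for the real produce call (the Hotstar code itself is not public, so names and capacities here are illustrative):

```go
package ingest

import "time"

// msg stands in for whatever the handler records (an emoji reaction,
// a click, ...); hypothetical type for illustration.
type msg struct {
	UserID string
	Emoji  string
}

// buffer absorbs bursts between flushes; its capacity is a tuning knob
// (see Constants below).
var buffer = make(chan msg, 100_000)

// submit runs on the hot request path: enqueue and return immediately.
// The Kafka write happens off-path in run().
func submit(m msg) {
	buffer <- m // a plain send blocks when full; see Failure modes
}

// run drains the buffer and flushes whenever 500 ms elapse OR the
// batch reaches 20,000 messages, whichever comes first.
func run(flushToKafka func([]msg)) {
	const (
		flushInterval = 500 * time.Millisecond
		maxBatch      = 20_000
	)
	ticker := time.NewTicker(flushInterval)
	defer ticker.Stop()

	batch := make([]msg, 0, maxBatch)
	flush := func() {
		if len(batch) == 0 {
			return
		}
		flushToKafka(batch) // placeholder for the actual produce call
		batch = make([]msg, 0, maxBatch)
	}

	for {
		select {
		case m := <-buffer:
			batch = append(batch, m)
			if len(batch) >= maxBatch {
				flush()
			}
		case <-ticker.C:
			flush()
		}
	}
}
```

A production version would also drain and flush on shutdown; that is omitted here to keep the two flush triggers visible.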
Why it's a pattern¶
Kafka's synchronous acks=all write costs a network round trip to the
broker leader plus acknowledgements from the in-sync replicas (and,
depending on broker flush settings, an fsync). For latency-sensitive
ingest paths with single-digit-millisecond user-facing budgets, that
is too slow at the tail. Pushing the broker write off the request path
is the standard latency fix, but the knobs have to be named explicitly:
flushing too late loses data on a crash, and flushing too eagerly
defeats batching and burns broker round trips.
The Hotstar numbers 500 ms and 20,000 messages are a reusable
template: small enough that a crash loses < half a second of
submissions, large enough that the Kafka producer gets real batches
on the wire.
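Back-of-envelope check on that claim, under an assumed peak rate (the source does not state one; the figure below is purely illustrative):

```go
package sizing

// Illustrative arithmetic only; the peak rate is an assumption,
// not a figure from the Hotstar write-up.
const (
	assumedPeakPerSec = 1_000_000 // submissions/second at peak (assumed)
	flushIntervalMs   = 500
	batchCeiling      = 20_000
)

// At peak, the 20,000-message ceiling fires first:
//   20_000 / 1_000_000 per second = 20 ms of submissions at risk per flush.
// At low traffic, the 500 ms timer fires first:
//   at most 0.5 s of submissions at risk, matching "< half a second".
// Either way a message waits at most one flush interval (500 ms) in the
// buffer before the producer gets it.
```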
Library support¶
The pattern is mostly built into mature Kafka client libraries:
- confluent-kafka-go (librdkafka wrapper) — linger.ms + batch.num.messages + internal producer queue.
- Sarama (Shopify) — Producer.Flush.Frequency + Producer.Flush.MaxMessages.
- JVM KafkaProducer — linger.ms + batch.size.

The producer queue is inside the client; the application still chooses whether to wait for the send().get() future.
The pattern question "do I wait for the returned future?" is equivalent to "do I take the sync or async fork?" — the library already implements the buffer; the application decides semantics.
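A minimal sketch of the Sarama fork, to make the knob mapping concrete. The Flush.* fields and WaitForLocal are real Sarama config; the broker list, topic, and payload are placeholders (recent Sarama lives at github.com/IBM/sarama; older code imports github.com/Shopify/sarama):

```go
package ingest

import (
	"time"

	"github.com/IBM/sarama"
)

// newAsyncProducer maps the pattern's two knobs onto Sarama's
// client-side batching: flush every 500 ms, with a hard cap of
// 20,000 messages per flush.
func newAsyncProducer(brokers []string) (sarama.AsyncProducer, error) {
	cfg := sarama.NewConfig()
	cfg.Producer.Flush.Frequency = 500 * time.Millisecond // time-based flush
	cfg.Producer.Flush.MaxMessages = 20_000               // ceiling per flush
	cfg.Producer.RequiredAcks = sarama.WaitForLocal       // trade durability for latency
	return sarama.NewAsyncProducer(brokers, cfg)
}

// sendAsync is the "don't wait for the future" fork: enqueue and return.
// Errors surface asynchronously on the producer's Errors() channel,
// which something must drain.
func sendAsync(p sarama.AsyncProducer, topic string, payload []byte) {
	p.Input() <- &sarama.ProducerMessage{
		Topic: topic,
		Value: sarama.ByteEncoder(payload),
	}
}
```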
Constants¶
Generalised from Hotstar + other public references:
| Knob | Typical | Sets |
|---|---|---|
| linger.ms / flush interval | 10 ms – 1 s (Hotstar: 500 ms) | Worst-case latency at low traffic |
| batch.size / max msgs per flush | 16 KB – 20 K msgs (Hotstar: 20,000) | Per-request amortization; memory ceiling |
| Buffer capacity | Set to absorb N × flush-interval of peak traffic | Overflow behaviour (drop vs block) |
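A sketch of the last row's sizing rule, again with assumed numbers rather than anything from the source:

```go
package sizing

// Buffer capacity rule of thumb: absorb N flush intervals of peak
// traffic. Every number here is an assumption for illustration.
const (
	peakRatePerSec    = 1_000_000 // assumed peak submissions/second
	flushIntervalSecs = 0.5       // 500 ms
	intervalsToAbsorb = 4         // N: ride out ~2 s of a slow broker
)

// capacity ≈ 1_000_000 × 0.5 × 4 = 2,000,000 messages.
// At ~100 bytes per buffered message that is ~200 MB of RAM, which is
// why buffer capacity is a memory ceiling as much as a latency knob.
const bufferCapacity = int(peakRatePerSec * flushIntervalSecs * intervalsToAbsorb)
```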
Failure modes¶
- At-most-once: up to min(flush_interval, buffer_size) worth of records lost on crash, i.e. everything accepted but not yet flushed. Document this explicitly; for vote-counting or billing use a sync path or a disk-backed WAL instead.
- Producer backpressure: if the broker falls behind, the in-process buffer fills. Choose: drop (at-most-once, explicit metric), block (back-pressures the HTTP handler and defeats the purpose), or spill to disk (upgrade to at-least-once).
- Silent drops on channel overflow: a plain Go channel send blocks when the buffer is full; the usual fix is a select with a default case that drops the message, but those drops are invisible unless you instrument a dropped-message counter (see the sketch below).
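A sketch of the instrumented drop path, reusing the buffer and msg from the Shape sketch above; the counter is a plain atomic here, but in practice it would be whatever metrics library the service already exports:

```go
package ingest

import "sync/atomic"

// dropped counts messages discarded because the buffer was full.
var dropped atomic.Int64

// trySubmit never blocks the hot path: if the buffer is full the
// message is dropped, and the drop is counted rather than lost silently.
func trySubmit(m msg) bool {
	select {
	case buffer <- m:
		return true
	default:
		dropped.Add(1)
		return false
	}
}
```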
Seen in¶
- sources/2024-03-26-highscalability-capturing-a-billion-emojions — Hotstar emoji-swarm ingest. Goroutine + channel + Confluent / Sarama client, 500 ms flush interval, 20,000-message ceiling. Accepted at-most-once semantics explicitly ("we need very low latency and data loss in rare scenarios is not a big concern"). Canonical wiki instance with named constants.
Related¶
- systems/kafka — the downstream broker.
- concepts/async-write-buffer — the generic concept this pattern instantiates against Kafka specifically.
- concepts/at-most-once-delivery — the delivery semantics this pattern buys latency with.
- patterns/emoji-swarm-realtime-aggregation — end-to-end pipeline where this pattern is the ingest stage.