CONCEPT Cited by 1 source
Async write buffer¶
Definition¶
An async write buffer is an in-process buffer (a channel, queue, or ring) sitting between a synchronous request path (e.g. an HTTP handler) and a downstream durable write (e.g. a Kafka broker). Requests are acknowledged as soon as their payload lands in the buffer, not when the downstream write completes. A background worker drains the buffer asynchronously, usually on a time-based or size-based flush trigger.
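A minimal Go sketch of the definition above, under stated assumptions: the names `accept`, `drain`, and `buffer`, the capacity of 1024, and the choice to return 503 when the buffer is full are all illustrative and not taken from any cited source. The `write` callback stands in for the downstream durable write.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
)

// buffer is the in-process queue between the request path and the sink.
// The capacity (1024) is an arbitrary illustrative value.
var buffer = make(chan []byte, 1024)

// accept tries to place payload in buf without blocking and returns the
// HTTP status to send: 202 once the payload is buffered, 503 if buf is full.
// Rejecting on a full buffer is an assumption; blocking is another option.
func accept(buf chan<- []byte, payload []byte) int {
	select {
	case buf <- payload:
		return http.StatusAccepted
	default:
		return http.StatusServiceUnavailable
	}
}

// drain forwards buffered payloads to write until buf is closed.
// write stands in for the downstream durable write (e.g. a Kafka produce).
func drain(buf <-chan []byte, write func([]byte)) {
	for p := range buf {
		write(p)
	}
}

// handler acknowledges the request as soon as the body is buffered.
// In a server, drain(buffer, ...) would run in a background goroutine.
func handler(w http.ResponseWriter, r *http.Request) {
	body, err := io.ReadAll(r.Body)
	if err != nil {
		http.Error(w, "bad request", http.StatusBadRequest)
		return
	}
	w.WriteHeader(accept(buffer, body))
}

func main() {
	buf := make(chan []byte, 2) // tiny capacity so the full-buffer case is visible
	for i := 0; i < 3; i++ {
		fmt.Println("status:", accept(buf, []byte("emoji")))
	}
	close(buf)
	drain(buf, func(p []byte) { fmt.Printf("wrote %q downstream\n", p) })
}
```

The response status depends only on the channel send, so the handler never waits on the downstream write.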
Why¶
The pattern trades durability of the most recently accepted records for lower tail latency:
- Synchronous-ack path: HTTP response time includes broker round-trip + fsync + replication ack. Predictable but slow.
- Async-buffered path: HTTP response time is O(channel-send). Buffer drains off-path. If the process dies before the next flush, buffered records are lost.
Hotstar (2024) frames the choice explicitly:
"Synchronous: Wait for the acknowledgment that the message is written before sending a success response to clients [...] If your data is transactional or cannot suffer any loss, this approach is preferable. Asynchronous: Write the message to a local buffer and respond with success to clients. [...] the downside is that if not handled properly, this could result in data loss." (Source: sources/2024-03-26-highscalability-capturing-a-billion-emojions)
For their emoji-swarm system: "we need very low latency and data loss in rare scenarios is not a big concern (although we haven't seen any so far)" — async won.
Flush triggers¶
Practical implementations fire a flush on whichever comes first of two triggers:
- Time-based: every N milliseconds (Hotstar: 500 ms).
- Size-based: every K records (Hotstar: 20,000 messages).
The time-based trigger bounds worst-case per-record latency when traffic is low. The size-based trigger bounds buffer memory when traffic is high. Together they bound both latency and memory.
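The "whichever comes first" rule can be sketched in Go with a `select` over the input channel and a ticker. This is an illustrative sketch, not Hotstar's code: `runFlusher`, its parameters, the final flush on channel close, and the timer reset after a size-based flush are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"time"
)

// runFlusher drains in and calls flush with a batch whenever the batch
// reaches maxBatch records or interval elapses, whichever comes first.
// It flushes any remainder and returns when in is closed.
func runFlusher(in <-chan string, maxBatch int, interval time.Duration, flush func([]string)) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()

	batch := make([]string, 0, maxBatch)
	emit := func() {
		if len(batch) == 0 {
			return // nothing buffered, skip the empty flush
		}
		flush(batch)
		batch = make([]string, 0, maxBatch) // fresh slice: flush may keep the old one
	}

	for {
		select {
		case msg, ok := <-in:
			if !ok {
				emit() // input closed: flush what is left
				return
			}
			batch = append(batch, msg)
			if len(batch) >= maxBatch {
				emit()                 // size-based trigger
				ticker.Reset(interval) // restart the timer after a size flush
			}
		case <-ticker.C:
			emit() // time-based trigger
		}
	}
}

func main() {
	in := make(chan string, 16)
	done := make(chan struct{})
	go func() {
		defer close(done)
		// 500 ms matches the Hotstar interval; a batch size of 3 is only for the demo.
		runFlusher(in, 3, 500*time.Millisecond, func(b []string) {
			fmt.Printf("flushed %d records: %v\n", len(b), b)
		})
	}()
	for i := 0; i < 7; i++ {
		in <- fmt.Sprintf("emoji-%d", i)
	}
	close(in)
	<-done
}
```

With the Hotstar constants, the call would look like `runFlusher(in, 20000, 500*time.Millisecond, flush)`, where `flush` is whatever function hands the batch to the Kafka producer.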
Language primitives¶
- Go: channel + goroutine — "Messages to be produced are written to a Channel. A Producer runs in the background as a Goroutine and flushes the data periodically to Kafka." Canonical in the Hotstar write-up.
- Java / JVM: LinkedBlockingQueue + a dedicated flusher thread, or the built-in linger.ms + batch.size knobs on the Kafka KafkaProducer (which implements this pattern internally).
- Node: Promise queue drained by setImmediate / setInterval.
- Rust / Tokio: tokio::sync::mpsc channel + a spawned task.
Delivery semantics¶
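Async write buffers are at-most-once by construction. Buffered records are not durable until the next flush. If the process dies, everything accepted since the last flush is lost with no recovery signal: the size trigger caps this at roughly K records, and the time trigger caps it at one flush interval's worth of arrivals, whichever is smaller. A batch whose flush is still in flight can be lost as well.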
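One way to read the loss bound in record counts: the buffer holds at most the records that arrive during one flush interval, capped by the size trigger. The sketch below is a back-of-the-envelope estimate, not a guarantee. `worstCaseLoss` is a hypothetical helper, the arrival rates are made up for illustration, and the estimate ignores any batch that is mid-flush when the process dies.

```go
package main

import (
	"fmt"
	"math"
	"time"
)

// worstCaseLoss estimates how many records can sit unflushed in the buffer:
// the records that arrive during one flush interval, capped by the size trigger.
func worstCaseLoss(ratePerSec float64, interval time.Duration, maxBatch int) int {
	perInterval := int(math.Ceil(ratePerSec * interval.Seconds()))
	if perInterval < maxBatch {
		return perInterval
	}
	return maxBatch
}

func main() {
	// 500 ms and 20,000 are the Hotstar constants; the arrival rates are hypothetical.
	for _, rate := range []float64{1_000, 10_000, 100_000} {
		fmt.Printf("%.0f rec/s: up to %d records at risk\n",
			rate, worstCaseLoss(rate, 500*time.Millisecond, 20_000))
	}
}
```

At low rates the time trigger dominates (500 records at 1,000 records/s); once arrivals exceed 40,000 records/s, the 20,000-record size trigger becomes the cap.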
Upgrading to at-least-once requires either (a) a persistent buffer (write-ahead log on local disk) or (b) a synchronous ack path for records that need it; exactly-once additionally needs idempotent or deduplicated writes downstream. Applications frequently split traffic: emojis async, votes sync.
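The traffic split can be sketched as a small router that picks a path per record kind. Everything named here is hypothetical: the `Producer` type, the `syncWrite` and `needsSync` callbacks, and the `"vote"` / `"emoji"` kind strings are illustrative assumptions, and the background flusher that drains `asyncBuf` is not shown.

```go
package main

import (
	"errors"
	"fmt"
)

var errBufferFull = errors.New("async buffer full")

// Producer routes each record to a synchronous or an asynchronous path.
type Producer struct {
	asyncBuf  chan []byte            // drained by a background flusher (not shown)
	syncWrite func([]byte) error     // hypothetical blocking write; returns after the downstream ack
	needsSync func(kind string) bool // hypothetical policy: which kinds cannot tolerate loss
}

// Send returns only after the downstream ack for kinds that need it.
// All other kinds are acknowledged as soon as they are buffered.
func (p *Producer) Send(kind string, payload []byte) error {
	if p.needsSync(kind) {
		return p.syncWrite(payload)
	}
	select {
	case p.asyncBuf <- payload:
		return nil
	default:
		return errBufferFull
	}
}

func main() {
	p := &Producer{
		asyncBuf:  make(chan []byte, 1024),
		syncWrite: func(b []byte) error { fmt.Printf("sync write: %s\n", b); return nil },
		needsSync: func(kind string) bool { return kind == "vote" },
	}
	fmt.Println(p.Send("vote", []byte("candidate-7")))
	fmt.Println(p.Send("emoji", []byte("heart")))
	fmt.Println("buffered:", len(p.asyncBuf))
}
```

Keeping the policy in one function makes it explicit which record kinds pay the synchronous latency cost.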
See concepts/at-most-once-delivery for the delivery-semantics class.
Seen in¶
- sources/2024-03-26-highscalability-capturing-a-billion-emojions — Hotstar emoji-swarm HTTP ingest: Go channel + background goroutine flushing to Kafka every 500 ms or 20K messages. Canonical wiki instance; the article names the 500ms / 20K constants explicitly.
Related¶
- systems/kafka — the typical downstream durable sink.
- concepts/at-most-once-delivery — the delivery-semantics class this pattern implements.
- patterns/async-buffered-kafka-produce — the concrete goroutine+channel+Kafka-producer pattern this concept instantiates.