
CONCEPT Cited by 1 source

Batch-only component for IPC amortization

Batch-only component for IPC amortization is the design choice to expose only the batch variants (BatchInput / BatchProcessor / BatchOutput) of a component API across a cross-process boundary, deliberately omitting the single-message variants. The goal is to shrink the per-message share of IPC cost (serialization + socket traversal + context switch) by dividing it over the N messages in a batch, so that the architecturally necessary plumbing does not dominate the per-record cost.

Canonical wiki instance

Redpanda Connect dynamic plugins (2025-06-17): "We use batch components exclusively to amortize the cost of cross-process communication." (Source: sources/2025-06-17-redpanda-introducing-multi-language-dynamic-plugins-for-redpanda-connect). Redpanda Connect normally offers both single-message (Input, Processor, Output) and batch (BatchInput, BatchProcessor, BatchOutput) component types. For the dynamic-plugin framework — which runs plugins as separate OS subprocesses over gRPC on a Unix socket — only the batch types are exposed. Single-message types are deliberately excluded.

The underlying cost model

A cross-process call over gRPC-on-Unix-socket pays, per call, roughly:

  • Protobuf encode on the host side
  • Socket write + kernel scheduling hop
  • Context switch to the plugin process
  • Protobuf decode on the plugin side
  • (Return direction: reverse of the above)

Call this fixed per-call cost C_call. The plugin logic itself contributes a fixed per-batch component C_plugin plus a per-message component C_msg, so processing a batch of N messages costs C_plugin + N * C_msg in plugin work.

  • Single-message API: effective cost per message ≈ C_call + C_plugin + C_msg. The IPC overhead C_call is paid on every message. If C_call is tens or hundreds of microseconds (typical for Unix-socket gRPC with protobuf), and the plugin logic is cheap (C_plugin, C_msg small), then IPC dominates.
  • Batch API: cost for a batch of N ≈ C_call + C_plugin + N * C_msg. Effective per-message cost ≈ C_call / N + C_plugin / N + C_msg. As N grows, the fixed cost C_call + C_plugin is amortized toward zero on a per-message basis.

For plugins where the per-message work is cheap (e.g. a simple transformation), this amortization is the difference between a usable system and an IPC-bound toy.
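
A minimal sketch of this arithmetic, with made-up illustrative numbers (C_call = 100 µs, C_plugin = 5 µs, C_msg = 1 µs; none of these figures come from the source):

```python
def per_message_cost(c_call: float, c_plugin: float, c_msg: float, n: int) -> float:
    """Effective per-message cost for a batch of n messages:
    the fixed costs are paid once and split across the batch."""
    return (c_call + c_plugin) / n + c_msg

# Illustrative (made-up) costs, in microseconds.
C_CALL, C_PLUGIN, C_MSG = 100.0, 5.0, 1.0

single = per_message_cost(C_CALL, C_PLUGIN, C_MSG, n=1)     # 106.0 µs: IPC dominates
batched = per_message_cost(C_CALL, C_PLUGIN, C_MSG, n=500)  # ≈1.21 µs: close to C_msg
```

At N = 1 the fixed costs swamp the 1 µs of actual work; at N = 500 the per-message cost is within about 20% of the plugin work alone.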

Why batch-only: the single-message variant would be a trap

By not exposing the single-message API at all, the plugin author cannot accidentally pick the shape that will bottleneck on IPC overhead. The API enforces the batch-oriented invocation pattern as the only option. This is a kind of architectural forcing function — not leaving the bad choice on the table.

It also simplifies the gRPC service definition: only three services (BatchInput, BatchProcessor, BatchOutput) rather than six. Simpler protocol, less surface to stabilize across Redpanda Connect minor versions.
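
The batch-only surface can be sketched as an interface. This is a hypothetical Python shape that mirrors the concept, not the actual Redpanda Connect plugin SDK (the names BatchProcessor and process_batch here are illustrative):

```python
from abc import ABC, abstractmethod

class BatchProcessor(ABC):
    """Hypothetical batch-only component: the host hands over a whole
    batch in one cross-process call; no per-message entry point exists."""

    @abstractmethod
    def process_batch(self, batch: list[bytes]) -> list[bytes]:
        ...

class UpperCaseProcessor(BatchProcessor):
    """Toy plugin: the per-message work is trivial, so without batching
    the IPC hop would dominate its cost."""

    def process_batch(self, batch: list[bytes]) -> list[bytes]:
        return [msg.upper() for msg in batch]
```

Because only process_batch exists, a plugin author cannot write a per-message handler that pays the cross-process cost on every record.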

Relation to existing batching concepts

  • Batching latency tradeoff — batching amortizes fixed costs (network round trip, disk I/O, cross-process hop) across many messages, at the cost of added latency for the first message in a batch. The IPC amortization case is a special case: the amortized fixed cost is the cross-process hop specifically.
  • Effective batch size — the actual N observed at the downstream side, given producer linger settings and broker-side coalescing. The Redpanda Connect dynamic-plugin shape puts the plugin inside the batch pipeline; whatever N the pipeline produces, the plugin sees.

Generalization beyond Redpanda Connect

Any boundary with a fixed per-call overhead benefits from a batch-only component API:

  • Serverless function calls with cold-start cost — prefer functions that accept batches, so the cold-start is amortized over many records.
  • Remote GPU inference services — round trip + tokenizer warmup dominates if calls are per-sample; batch requests amortize.
  • Any cross-language / FFI boundary — Python↔C++, Go↔C, Kotlin↔JVM-native. The transition is fixed cost; batch the payload.
  • Rate-limited external APIs — per-call cost includes an HTTP round trip; always prefer batch endpoints when they exist.
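
Outside any particular framework, the idea can be sketched as a coalescing wrapper: callers submit items one at a time, but the expensive boundary (here flush_fn, a stand-in for any fixed-cost call) is crossed once per batch. The class name and the max_batch knob are illustrative assumptions, not part of any of the systems above:

```python
class BatchingBoundary:
    """Coalesce single-item submissions into batched calls across an
    expensive boundary, so the fixed per-call cost is paid per batch."""

    def __init__(self, flush_fn, max_batch: int = 100):
        self.flush_fn = flush_fn    # the expensive cross-boundary call
        self.max_batch = max_batch
        self.pending = []
        self.calls = 0              # how many times the boundary was crossed

    def submit(self, item) -> None:
        self.pending.append(item)
        if len(self.pending) >= self.max_batch:
            self.flush()

    def flush(self) -> None:
        if self.pending:
            self.calls += 1         # one fixed cost per flush, not per item
            self.flush_fn(self.pending)
            self.pending = []
```

Submitting 25 items with max_batch=10 crosses the boundary three times instead of twenty-five.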

When the design fails

  • Latency-sensitive workloads where N=1 is a hard constraint. If the system cannot wait for a batch to accumulate, amortization doesn't save you. A synchronous RPC handler with a p99 < 10 ms budget can't batch arbitrarily. For Redpanda Connect specifically, such workloads should use compiled Go plugins ([[patterns/compiled-vs-dynamic-plugin-tradeoff]]), not dynamic plugins.
  • When batches are routinely small. If N is often 1 or 2 (bursty arrival with short linger windows), the amortized fixed cost (C_call + C_plugin) / N remains large. Tuning batch / linger settings upstream of the plugin is a prerequisite for the design to pay off.
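
One way to reason about the second failure mode, assuming illustrative (made-up) costs of C_call = 100 µs, C_plugin = 5 µs, C_msg = 1 µs: solve (C_call + C_plugin) / N ≤ f * C_msg for the smallest batch size N that keeps the amortized fixed cost under a fraction f of the per-message work.

```python
import math

# Illustrative (made-up) costs, in microseconds.
C_CALL, C_PLUGIN, C_MSG = 100.0, 5.0, 1.0

def min_batch_for_overhead(fraction: float) -> int:
    """Smallest N such that (C_CALL + C_PLUGIN) / N <= fraction * C_MSG."""
    return math.ceil((C_CALL + C_PLUGIN) / (fraction * C_MSG))

# Keeping the fixed cost under 10% of per-message work requires N >= 1050.
```

If upstream linger settings can't sustain batches anywhere near that size, the amortization argument doesn't hold and the batch-only design stops paying off.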
