PATTERN Cited by 1 source

Stream rebatch for downstream batch API

When to use

You have:

  • An upstream component that emits per-record events on a streaming platform (one event per user, per product, per task, etc.).
  • A downstream API that supports — or requires — calls in batches up to N items, and rate-limits per call.
  • A throughput / cost problem if you naively dispatch one downstream call per upstream event.

The downstream API's batch shape is the external constraint; the upstream event stream's per-record shape is the internal representation. Rebatching mediates between them.

The pattern

Insert a stream consumer between the per-record emitter and the batch-API caller that:

  1. Consumes per-record events from the streaming platform.
  2. Groups them into batches of up to N (matching the downstream API's max batch size).
  3. Forwards each batch as a single event / API call to the downstream caller.

Crucially, the batch size N is dictated by the downstream API, not by the consumer's throughput / latency optimum. This distinguishes the pattern from generic micro-batching, where N is tuned for queue-amortization.
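
The three steps above can be sketched as a small generator. This is a minimal illustration, not the source's implementation: the function name `rebatch`, the `max_batch_size` parameter, and the in-memory iterable standing in for the stream are all assumptions made for the example.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def rebatch(events: Iterable[T], max_batch_size: int) -> Iterator[List[T]]:
    """Group a per-record stream into lists of at most max_batch_size items.

    max_batch_size is meant to be the downstream API's documented cap,
    not a value tuned for consumer throughput.
    """
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    it = iter(events)
    while True:
        # Pull up to max_batch_size records; the final batch may be smaller.
        batch = list(islice(it, max_batch_size))
        if not batch:
            return
        yield batch


# Hypothetical usage: 120 per-user events against an API capped at 50 per call.
batches = list(rebatch((f"user-{i}" for i in range(120)), max_batch_size=50))
print([len(b) for b in batches])  # [50, 50, 20]
```

On a finite input the last batch is simply whatever is left. On an unbounded stream this version would wait for a full batch, which is why the time-windowed variation described later matters in practice.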

Canonical instance — Instacart Storefront Pro marketing

From the 2026-05-14 source (sources/2026-05-14-instacart-scaling-personalized-marketing-for-multi-tenant-commerce-platforms):

*"Instacart's Campaigns Engine emits one event per user after audience evaluation and personalization setup. Left as-is, that would require the CRM Service to process users one at a time, creating unnecessary network overhead and placing avoidable pressure on downstream systems.

Those costs mattered because our third-party provider imposes strict API constraints. For example, requests are rate-limited per retailer, and individual send APIs support batches of up to 50 users per call. Processing users individually would have made large-scale campaign delivery both slower and more expensive.

To address that, we introduced a stream consumer that rebatches customer-level campaign events before handing them off to the CRM Service. Instead of processing one user per request, the system groups users into batches of up to 50 and sends them downstream together.

This change significantly improved throughput, reduced API pressure, and aligned our internal processing model with the capabilities of the provider."*

The pipeline:

   Campaigns Engine ──── 1 event per user ────▶ stream
                                                   │
                                                   ▼
                                            stream consumer
                                         (batches of up to 50)
                                                   │
                                                   ▼
                                              CRM Service
                                                   │
                                                   ▼
                                         third-party batch API
                                         (≤50 users per call)

What problems this solves

  1. Quota waste. Without rebatching, each per-record event becomes one API call; if the downstream accepts 50 records per call and rate-limits per call, the platform spends up to 50× as many calls as it needs to.
  2. Network overhead. M per-record HTTP round-trips become roughly M/50 (exactly ⌈M/50⌉ when every batch but the last is full).
  3. Per-record processing tax. Personalization / serialization / vendor-auth overhead amortizes across the batch.
  4. Concurrency control simplification. Rate-limit-aware throttling can operate at the batch grain rather than per-record.
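
The quota and round-trip savings are simple arithmetic. The sketch below assumes every batch except possibly the last is full; the record count of 1,000,000 is a made-up figure for illustration, not a number from the source.

```python
def calls_needed(records: int, max_batch_size: int) -> int:
    """Downstream calls required when batches are filled to max_batch_size."""
    if records < 0 or max_batch_size < 1:
        raise ValueError("records must be >= 0 and max_batch_size >= 1")
    # Integer ceiling division; avoids float rounding on large counts.
    return -(-records // max_batch_size)


records = 1_000_000  # hypothetical campaign size
unbatched = calls_needed(records, 1)
batched = calls_needed(records, 50)
print(unbatched, batched)  # 1000000 20000
```

With a max-wait timer or per-tenant batching, some batches go out partially full, so the real reduction is usually somewhat below the 50× best case.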

What problems it does NOT solve

  • Latency floors. A batch can't be sent until it's full (or a max-wait timer expires). Per-record latency rises by up to the batch fill time.
  • Idempotency. The downstream batch caller still needs to deduplicate against at-least-once delivery semantics — the batch is one logical event, not 50.
  • Per-tenant fairness. If the rebatcher mixes tenants in a batch, per-tenant quota tracking gets harder. The standard fix is per-tenant batches (one stream-consumer key per tenant, batch-by-tenant downstream), so each batch is scoped to one tenant's workspace and rate-limit budget. This is implicit in the Instacart shape since each batch is routed to one workspace.

Variations

  • Time-windowed batching: flush after N items or T ms, whichever comes first. This bounds the latency added by batching at T.
  • Size-windowed batching: flush on a bytes-per-batch threshold. This matters when the downstream API's payload-size cap is reached before its item-count cap.
  • Per-key batching: group by tenant / user / partition key first, then batch within each key. The default for multi-tenant systems with per-tenant rate limits.
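
The time-windowed and per-key variations combine naturally. The class below is a minimal single-threaded sketch under stated assumptions: the name `KeyedBatcher`, its methods, and the injectable `clock` are hypothetical, and a real consumer would also need offset commits, persistence, and a periodic call to `flush_expired`.

```python
import time
from typing import Any, Callable, Dict, Hashable, List, Optional, Tuple


class KeyedBatcher:
    """Buffers records per key and flushes a key on size or on age."""

    def __init__(self, max_items: int, max_wait_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_items = max_items      # downstream API's per-call cap
        self.max_wait_s = max_wait_s    # latency bound for partial batches
        self.clock = clock              # injectable so tests can fake time
        self._buffers: Dict[Hashable, List[Any]] = {}
        self._first_seen: Dict[Hashable, float] = {}

    def add(self, key: Hashable, record: Any) -> Optional[List[Any]]:
        """Buffer one record; return the key's batch if it just became full."""
        buf = self._buffers.setdefault(key, [])
        if not buf:
            # Start the wait timer when the first record of a batch arrives.
            self._first_seen[key] = self.clock()
        buf.append(record)
        if len(buf) >= self.max_items:
            return self._take(key)
        return None

    def flush_expired(self) -> List[Tuple[Hashable, List[Any]]]:
        """Return (key, batch) pairs whose oldest record waited max_wait_s."""
        now = self.clock()
        expired = [k for k, t in self._first_seen.items()
                   if now - t >= self.max_wait_s]
        return [(k, self._take(k)) for k in expired]

    def _take(self, key: Hashable) -> List[Any]:
        # Remove and return the buffer so the next add() starts a new batch.
        self._first_seen.pop(key)
        return self._buffers.pop(key)
```

The key can be any hashable value, so a tuple such as `(tenant_id, channel)` would keep both tenants and channels separate. A byte-size threshold could be added to `add` in the same way as the item-count check.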

Implementation notes

  • Use a streaming framework that supports keyed windowing or micro-batching (for example Kafka Streams, Flink, or the Kinesis Client Library with application-side aggregation), or write a custom consumer with an in-memory window.
  • Set the batch size to the downstream API's max. Never exceed it, and go below it only when the latency budget forces a smaller window.
  • Emit a batch-level idempotency key so the downstream caller can deduplicate redeliveries; per-record idempotency inside the batch is a separate question.
  • If the downstream API is rate-limited per tenant, tag each batch with its tenant so that it is routed to, and counted against, exactly one tenant's quota.
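
One way to produce a batch-level idempotency key is to hash the tenant together with the sorted record IDs. This is a sketch under assumptions, not something the source describes: the function name, the choice of SHA-256, and the existence of stable per-record IDs are all hypothetical. The key is only useful if a redelivery reproduces the same batch membership, which a timer-based flush does not guarantee.

```python
import hashlib
from typing import Iterable


def batch_idempotency_key(tenant_id: str, record_ids: Iterable[str]) -> str:
    """Derive a deterministic key from the tenant and the batch's record IDs.

    Assumes IDs never contain a NUL byte, which is used as the separator.
    """
    h = hashlib.sha256()
    h.update(tenant_id.encode("utf-8"))
    # Sort so the key does not depend on arrival order within the batch.
    for rid in sorted(record_ids):
        h.update(b"\x00")
        h.update(rid.encode("utf-8"))
    return h.hexdigest()


# Hypothetical IDs: the same members in a different order give the same key.
key_a = batch_idempotency_key("retailer-1", ["user-2", "user-1"])
key_b = batch_idempotency_key("retailer-1", ["user-1", "user-2"])
print(key_a == key_b)  # True
```

The downstream caller could store recently seen keys and skip a batch whose key it has already processed. Deduplicating individual records inside a batch would need a separate per-record check.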

Caveats

  • The 2026-05-14 source doesn't disclose the streaming substrate; the pattern itself is substrate-agnostic.
  • Without a max-wait timer, records for low-volume tenants can sit indefinitely in partially full batches that never reach the size threshold.
  • Mixing record types in a batch (e.g. email + push) defeats the pattern; the downstream typically batches per channel.
