

Async SQS → Lambda for interactive optimisation

Problem

Interactive optimisation workloads — e.g. a partner portal where users tweak parameters and re-run an expensive optimiser — don't fit neatly into either the batch model (too slow a feedback loop) or the synchronous-RPC model (the optimiser takes seconds to minutes; holding a request thread that long is wasteful and fragile). You need:

  • Backpressure control — if 100 partners submit what-ifs at the same time, you shouldn't exhaust the optimiser's memory or return 500s to partners.
  • Multi-tenant fairness — one partner flooding the system shouldn't block others.
  • Failure tolerance — if a compute node dies mid-run, the partner's request shouldn't vanish.
  • Serverless scaling — capacity should auto-adjust to the rate of partner interactivity.

Pattern

Insert a durable queue between the portal and the optimiser compute; fire optimiser jobs from a Lambda worker pool:

Partner portal
       │ user updates inventory settings
   [ AWS SQS ]  ← durable queue, one message per update
       │ poll
   [ AWS Lambda ]  ← worker pool, multi-threaded per invocation
       ├─ fetch SKU feature vectors from online feature store
       ├─ run optimiser with multi-threading parallelism
       ├─ store result in S3
       ├─ emit "result ready" notification to event stream
       └─ persist user's setting change to offline feature store
              (parity write-back)
Partner portal polls / subscribes and shows result
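
The flow above can be sketched as a single SQS-triggered Lambda handler. This is a minimal sketch, not ZEOS's actual code: the feature store, optimiser, S3 bucket, and event stream are hypothetical in-memory stand-ins, and all names (`handler`, `run_optimiser`, the message fields) are illustrative.

```python
import json

# Hypothetical in-memory stand-ins for the real dependencies:
# online feature store, S3 result bucket, event stream, offline store.
ONLINE_FEATURE_STORE = {"sku-1": [1, 2], "sku-2": [3, 4]}
RESULT_BUCKET = {}          # stand-in for S3
EVENT_STREAM = []           # stand-in for the notification stream
OFFLINE_FEATURE_STORE = {}  # stand-in for the offline feature store

def run_optimiser(features):
    # Placeholder for the expensive optimisation; here just a toy score.
    return {sku: sum(vec) for sku, vec in features.items()}

def handler(event, context=None):
    """SQS-triggered Lambda: each record is one inventory update."""
    for record in event["Records"]:
        update = json.loads(record["body"])
        # 1. Fetch feature vectors for the SKUs this update touches.
        features = {sku: ONLINE_FEATURE_STORE[sku] for sku in update["skus"]}
        # 2. Run the optimiser.
        result = run_optimiser(features)
        # 3. Store the result (S3 in the real system).
        key = f"results/{update['update_id']}.json"
        RESULT_BUCKET[key] = result
        # 4. Emit a "result ready" notification to the event stream.
        EVENT_STREAM.append({"type": "result_ready", "key": key})
        # 5. Parity write-back: persist the setting change offline too.
        OFFLINE_FEATURE_STORE[update["update_id"]] = update["settings"]
```

Because the message stays on the queue until the handler succeeds, a crash anywhere in steps 1–5 means SQS redelivers the update rather than losing it.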

Load-bearing properties:

  • SQS as durability layer. If Lambda concurrency caps out or a worker crashes, the message stays in the queue.
  • Lambda auto-scales with queue depth (SQS trigger → Lambda concurrency).
  • Multi-threading inside one invocation. For each inventory update, the optimiser scores multiple SKUs in parallel within one Lambda — amortising Lambda cold-start and feature-fetch cost.
  • Result delivery via S3 + notification, not RPC response. Frees the Lambda to terminate and the portal to render out-of-band.
  • Parity write-back. The user's setting change is written to the offline store too, so the next daily batch reflects it — see patterns/online-plus-offline-feature-store-parity.
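
In AWS terms, the "SQS trigger → Lambda concurrency" wiring is an event source mapping. A sketch of the parameters involved (the ARN and function name are hypothetical; the field names are real):

```python
# Hypothetical resource names; the field names are those of an
# SQS -> Lambda event source mapping. BatchSize controls how many
# messages one invocation receives; ScalingConfig's MaximumConcurrency
# caps how many concurrent workers the queue can fan out to -- the
# backpressure knob that stops 100 simultaneous what-ifs from
# overwhelming the optimiser.
event_source_mapping = {
    "EventSourceArn": "arn:aws:sqs:eu-central-1:123456789012:inventory-updates",
    "FunctionName": "optimiser-worker",
    "BatchSize": 1,  # one message per inventory update
    "ScalingConfig": {"MaximumConcurrency": 50},
}
```

With boto3, this dict would be passed to `lambda_client.create_event_source_mapping(**event_source_mapping)`; excess messages simply wait on the queue rather than being dropped.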

Canonical instance (Zalando ZEOS)

Zalando ZEOS's replenishment recommender uses exactly this shape for its online interactive path:

"When partners update their inventory settings, we trigger an orchestrated workflow that queues each update request on AWS SQS. We then use AWS Lambda to poll the queue for updates and serve each update request asynchronously. For each inventory update, we fetch the feature vector for relevant SKUs from the online feature store, and execute the optimisation algorithm with multi-threading parallelism. Once optimal predictions have been calculated, we store the results in s3 and alert the backend systems via a notification in the event stream. Lastly, in addition to serving the online request, we also persist the inventory setting update to the offline feature store, making future offline predictions consistent."

Seen in

  • Zalando ZEOS replenishment recommender (canonical instance above)