Cache in Lambda execution context¶
When to use¶
Your AWS Lambda function is called frequently enough that warm-start reuse is the common case, and its hot path repeatedly fetches the same (or predictably cacheable) data from a downstream service — DynamoDB, a remote HTTP API, a config service, etc.
The pattern¶
AWS Lambda reuses the same execution context (same container, same process) across warm invocations. Objects declared outside the handler function persist across invocations on the same execution context.
```python
# module scope — runs once per cold start, survives warm starts
_cache = {}

def handler(event, _ctx):
    key = event['key']
    if key in _cache:
        return _cache[key]         # ~0 ms hit
    value = dynamodb_get(key)      # ~10–15 ms cold fetch
    _cache[key] = value
    return value
```
Or, more typically, memoize with a TTL:

```python
import time

TTL_SECONDS = 60  # tune to your staleness tolerance

_cache = {}  # {key: (value, expires_at)}

def handler(event, _ctx):
    key = event['key']
    hit = _cache.get(key)
    if hit and hit[1] > time.time():
        return hit[0]
    value = dynamodb_get(key)
    _cache[key] = (value, time.time() + TTL_SECONDS)
    return value
```
Canonical datapoint¶
From the High Scalability Dec-2022 roundup, quoting @dfrasca80:
"I have a #lambda written in #rustlang with millions of requests per day that does worse-case scenarios 2 GetItem and 1 BatchGetItem to #dynamodb. It usually runs between 10ms & 15ms. I decided to cache the result in the lambda execution context and now < 3ms."
That is a 3–5× p50 latency reduction (10–15 ms → <3 ms) from a single trivial change on a hot Lambda path.
Trade-offs¶
- Cache is per-execution-context, not shared across concurrent Lambdas. If you have N concurrent Lambda containers, you have N independent caches with their own fill patterns. For caches that must be globally consistent across all Lambda instances, use ElastiCache / DynamoDB / KV downstream.
- Staleness bound = context lifetime — AWS may recycle a warm execution context after minutes to hours of no traffic, or at any time for capacity reasons. A value cached without a TTL can be arbitrarily stale within the context's lifetime.
- Memory cost: the cache shares the Lambda's memory allocation. On a 128 MB Lambda you don't get a lot of headroom; size the Lambda up if you intend to cache.
- Cold-start miss penalty: every cold start pays the full fetch latency; the pattern is only a win when warm-start reuse dominates.
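To keep the memory cost bounded, one common variant (a sketch, not from the source — the names `cache_get`/`cache_put` and the cap are illustrative) combines the TTL with an LRU size cap using `collections.OrderedDict`:

```python
import time
from collections import OrderedDict

MAX_ENTRIES = 1000  # illustrative cap; size to your Lambda's memory budget
TTL_SECONDS = 60

_cache = OrderedDict()  # {key: (value, expires_at)}, least recently used first

def cache_get(key):
    hit = _cache.get(key)
    if hit is None or hit[1] <= time.time():
        return None              # miss or expired
    _cache.move_to_end(key)      # mark as recently used
    return hit[0]

def cache_put(key, value):
    _cache[key] = (value, time.time() + TTL_SECONDS)
    _cache.move_to_end(key)
    while len(_cache) > MAX_ENTRIES:
        _cache.popitem(last=False)  # evict least recently used entry

cache_put('user:42', {'name': 'Ada'})
```

The handler then calls `cache_get` before the downstream fetch and `cache_put` after, exactly as in the TTL version above; the only change is that the cache can no longer grow without bound within a long-lived execution context.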
Why it's on this wiki¶
A high-leverage, near-zero-complexity Lambda optimization that routinely cuts p50 latency by 3–5× when hot data is reusable across invocations. The failure to apply it is one of the most common sources of "why is my Lambda slow" performance complaints.