PATTERN Cited by 1 source

Cache in Lambda execution context

When to use

Your AWS Lambda function is called frequently enough that warm-start reuse is the common case, and its hot path repeatedly fetches the same (or predictably-cacheable) data from a downstream service — DynamoDB, a remote HTTP API, a config service, etc.

The pattern

AWS Lambda reuses the same execution context (same container, same process) across warm invocations. Objects declared outside the handler function persist across invocations on the same execution context.

# module scope — runs once per cold start, survives warm starts
_cache = {}

def handler(event, _ctx):
    key = event['key']
    if key in _cache:
        return _cache[key]         # warm hit: ~0 ms
    value = dynamodb_get(key)      # placeholder for your downstream fetch: ~10-15 ms
    _cache[key] = value
    return value

More typically, memoize with a TTL so entries cannot go arbitrarily stale:

import time

TTL_SECONDS = 60  # tune to your staleness tolerance

_cache = {}  # {key: (value, expires_at)}

def handler(event, _ctx):
    key = event['key']
    hit = _cache.get(key)
    if hit and hit[1] > time.time():
        return hit[0]
    value = dynamodb_get(key)
    _cache[key] = (value, time.time() + TTL_SECONDS)
    return value
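A minimal runnable sketch of the TTL memoizer above, with a stubbed `dynamodb_get` (hypothetical; it stands in for the real DynamoDB call) and a counter to show that only the first invocation on a warm context hits the backend:

```python
import time

TTL_SECONDS = 60        # assumed TTL; tune to your staleness tolerance
_cache = {}             # {key: (value, expires_at)} — module scope, survives warm starts
calls = {"backend": 0}  # instrumentation for this sketch only

def dynamodb_get(key):
    # Stub standing in for the real downstream fetch (~10-15 ms in production).
    calls["backend"] += 1
    return f"value-for-{key}"

def handler(event, _ctx):
    key = event["key"]
    hit = _cache.get(key)
    if hit and hit[1] > time.time():
        return hit[0]                 # warm hit: no downstream call
    value = dynamodb_get(key)         # miss: pay the fetch once per TTL window
    _cache[key] = (value, time.time() + TTL_SECONDS)
    return value

# Two invocations on the same warm context: one backend call, then a cache hit.
handler({"key": "config"}, None)
handler({"key": "config"}, None)
print(calls["backend"])  # 1
```

In a real function the stub would be a `boto3` `get_item` call; the caching logic is unchanged.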

Canonical datapoint

From the High Scalability Dec-2022 roundup, quoting @dfrasca80:

"I have a #lambda written in #rustlang with millions of requests per day that does worse-case scenarios 2 GetItem and 1 BatchGetItem to #dynamodb. It usually runs between 10ms & 15ms. I decided to cache the result in the lambda execution context and now < 3ms."

A 3–5× p50 latency reduction (10–15 ms → <3 ms) from a single trivial change on a hot Lambda path.

Trade-offs

  • Cache is per-execution-context, not shared across concurrent invocations. With N concurrent Lambda containers you have N independent caches, each with its own fill pattern. If the cache must be consistent across all instances, use a shared store (ElastiCache, DynamoDB, or another downstream KV) instead.
  • Staleness bound = context lifetime — AWS may recycle a warm execution context after minutes to hours of no traffic, or at any time for capacity reasons. A value cached without a TTL can be arbitrarily stale within the context's lifetime.
  • Memory cost: the cache shares the Lambda's memory allocation. On a 128 MB Lambda you don't get a lot of headroom; size the Lambda up if you intend to cache.
  • Cold-start miss penalty: every cold start pays the full fetch latency; the pattern is only a win when warm-start reuse dominates.

Why it's on this wiki

A high-leverage, near-zero-complexity Lambda optimization that routinely cuts p50 latency by 3–5× when hot data is reusable across invocations. The failure to apply it is one of the most common sources of "why is my Lambda slow" performance complaints.
