SYSTEM Cited by 11 sources

AWS Lambda

AWS Lambda is AWS's function-as-a-service compute platform, launched in November 2014. It runs customer code in stateless invocations on AWS-managed infrastructure, scaling from one request per month to thousands per second with no capacity planning and no charge for idle time. This page distils the architectural commitments Lambda made at launch and notes which of them held up.

Design tenets (from the 2014 PR/FAQ)

  1. Security without complexity — sandboxed execution; up-to-date patches without user action.
  2. Simple and easy — "NoOps"; few sensible defaults; self-serve.
  3. Scales up and down (to zero) — same code path from 1 req/month to 1,000 req/s. See concepts/scale-to-zero.
  4. Cost effective at any scale — fine-grained pay-per-use; no idle charge. See concepts/fine-grained-billing.
  5. AWS integration — easy to call other AWS services from a function; make other services better by being an easy backing execution engine.
  6. Reliable — public latency/availability targets (99.99% at launch), higher internal bar.

(Source: sources/2024-11-15-allthingsdistributed-aws-lambda-prfaq-after-10-years)

Core model

  • Stateless functions. Persistent state lives in S3 / DynamoDB / etc. Local filesystem is scratch, deleted between invocations. See concepts/stateless-compute.
  • Event-driven invocation. Triggered by AWS events (S3 PUT/COPY, DynamoDB updates/streams, SNS, SQS), HTTP requests, SDK API calls, CLI, or scheduled cron. In the 2014 doc these were specific APIs (SetUpdateHandler, SetTableHandler); they shipped as unified event-source integrations.
  • Placement engine as a compaction strategy. Per-account, the scheduler minimises the number of EC2 instances needed for that account's workload while holding latency/throughput/availability targets — bursty, heterogeneous, and short-lived workloads all pack efficiently.
  • Sandbox isolation. At launch: single-tenant EC2 instances (no two customers on the same instance — expensive but non-negotiable for security). Today: Firecracker micro-VMs with many tenants per bare metal host. See systems/firecracker, concepts/micro-vm-isolation.
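The compaction idea in the placement-engine bullet can be illustrated with a toy bin-packing sketch. This is not Lambda's scheduler: first-fit-decreasing is a swapped-in textbook heuristic, and `pack_sandboxes`, `pack_per_account`, the account names, the memory sizes, and the per-instance capacity are all hypothetical. It packs each account separately, mirroring the launch-era rule that no two customers share an instance, and it ignores the latency, throughput, and availability targets a real placement engine has to hold.

```python
from typing import Dict, List


def pack_sandboxes(demands_mb: List[int], instance_capacity_mb: int) -> List[List[int]]:
    """Place sandbox memory demands on instances using first-fit-decreasing.

    Hypothetical helper for illustration; not Lambda's actual placement algorithm.
    """
    instances: List[List[int]] = []  # sandboxes placed on each instance
    free: List[int] = []             # remaining capacity per instance, in MB
    for demand in sorted(demands_mb, reverse=True):
        if demand > instance_capacity_mb:
            raise ValueError(f"sandbox of {demand} MB exceeds instance capacity")
        for i, remaining in enumerate(free):
            if demand <= remaining:
                # Reuse the first instance that still has room.
                instances[i].append(demand)
                free[i] -= demand
                break
        else:
            # No existing instance fits, so open a new one.
            instances.append([demand])
            free.append(instance_capacity_mb - demand)
    return instances


def pack_per_account(workloads: Dict[str, List[int]],
                     instance_capacity_mb: int) -> Dict[str, List[List[int]]]:
    """Pack each account on its own instances (single-tenant, as at launch)."""
    return {account: pack_sandboxes(demands, instance_capacity_mb)
            for account, demands in workloads.items()}


if __name__ == "__main__":
    # Made-up workloads: sandbox memory sizes in MB per account.
    plan = pack_per_account({"acct-a": [512, 128, 256, 1024, 128], "acct-b": [128]}, 1024)
    for account, instances in plan.items():
        print(account, len(instances), instances)
```

With these made-up inputs, `acct-a` fits on two instances and `acct-b` needs one of its own, which shows why single-tenant placement was the expensive choice: small accounts cannot share spare capacity.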

Billing & limits evolution

| Dimension | 2014 PR/FAQ → launch | Today (2024) |
| --- | --- | --- |
| Billing granularity | 250ms (doc) → 100ms (launch) | 1ms, no minimum |
| Memory per function | Up to 1 GB (doc) | Up to 10 GB |
| Package size | ZIP only | ZIP or container image up to 10 GB (2020) |
| Runtimes | Node only at launch (by design) | Java, Python, .NET, Go, Ruby, Node, + Custom Runtimes + Layers (2018) |
| Cold-start mitigation | "higher first time / when not used recently" | SnapStart (2022) — up to 90% cold-start reduction for Java, built on Firecracker snapshots |

(Source: sources/2024-11-15-allthingsdistributed-aws-lambda-prfaq-after-10-years)
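The effect of the billing-granularity change is easiest to see with a little arithmetic. The sketch below rounds an invocation's duration up to the billing increment. `billed_ms` is a hypothetical helper, the 3 ms and 101 ms durations are made-up inputs rather than figures from the source, and the sketch models neither prices nor any minimum charge.

```python
def billed_ms(duration_ms: int, granularity_ms: int) -> int:
    """Round a duration up to the next multiple of the billing granularity."""
    if duration_ms < 0 or granularity_ms <= 0:
        raise ValueError("duration must be >= 0 and granularity must be > 0")
    # Integer ceiling division, then scale back to milliseconds.
    increments = -(-duration_ms // granularity_ms)
    return increments * granularity_ms


if __name__ == "__main__":
    # Granularities from the table: 250 ms (doc), 100 ms (launch), 1 ms (today).
    for granularity in (250, 100, 1):
        print(granularity, billed_ms(3, granularity), billed_ms(101, granularity))
```

A 3 ms invocation is billed as 250, 100, and 3 milliseconds at the three granularities, and a 101 ms invocation as 250, 200, and 101. The shorter the function, the more the finer granularity matters.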

Deliberate launch scope choices

  • Node.js only at GA to validate the programming model and observe how customers actually used the service; other runtimes followed over 4 years of iteration. See patterns/launch-minimal-runtime.
  • ZIP-only packaging as a radical simplification vs. uploading files individually. Container images were deferred to 2020 once the on-demand loading problem was solved.
  • Native library support from day one, anticipating that dependency chains would pull in C extensions even if the user didn't write any — later generalised to Lambda Layers (2018).

Operational commitments

  • Latency: published from a canary EC2 client invoking an echo function; internal measurement includes server-side latency, code caching efficacy, invoke-to-customer-code latency.
  • Throughput: per-host paging rate, CPU utilisation, network bandwidth; sustained high = add EC2 capacity or shift the fleet mix.
  • Availability: control-plane availability reported like other AWS services; per-application invoke-plane availability reported via CloudWatch.

(Source: sources/2024-11-15-allthingsdistributed-aws-lambda-prfaq-after-10-years)
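The latency commitment above describes a canary client that invokes an echo function and publishes what it observes. A minimal sketch of that measurement loop follows, under stated assumptions: `run_canary` and `percentile` are hypothetical names, the echo function is injected as a plain callable (a real canary would call the service's invoke API instead), and nearest-rank percentiles are one reasonable choice, not necessarily what AWS publishes.

```python
import math
import time
from typing import Callable, List


def run_canary(invoke: Callable[[bytes], bytes], samples: int,
               clock: Callable[[], float] = time.perf_counter) -> List[float]:
    """Invoke an echo function repeatedly; return client-side latencies in ms."""
    payload = b"ping"
    latencies_ms: List[float] = []
    for _ in range(samples):
        start = clock()
        reply = invoke(payload)
        elapsed_ms = (clock() - start) * 1000.0
        if reply != payload:
            # An echo function must return exactly what it was sent.
            raise RuntimeError("echo function returned an unexpected payload")
        latencies_ms.append(elapsed_ms)
    return latencies_ms


def percentile(values: List[float], pct: float) -> float:
    """Nearest-rank percentile; pct is in the range (0, 100]."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]


if __name__ == "__main__":
    # Stand-in echo function; replace with a real invocation in practice.
    samples = run_canary(lambda body: body, samples=100)
    print("p50 ms:", percentile(samples, 50), "p99 ms:", percentile(samples, 99))
```

This only captures the client-observed number. The internal measurements listed in the same bullet (server-side latency, code caching efficacy, invoke-to-customer-code latency) need instrumentation inside the service and are out of scope for the sketch.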

Seen in

  • sources/2026-05-13-aws-streaming-cloudwatch-metrics-to-vpc-based-opentelemetry-collectors-using-lambda — canonical wiki naming of Lambda's Firehose-data-transform invocation contract as a VPC-private-endpoint bridge. AWS Architecture Blog (2026-05-13) describes a customer architecture where Amazon Data Firehose's HTTP endpoint destination — which "must be public — they cannot be private endpoints inside a VPC" — is bridged into a customer VPC by configuring a Lambda function as Firehose's data-transformation hook. The Lambda runs with VPC attachments, is invoked synchronously per buffered record batch ("Amazon Data Firehose buffers incoming data before synchronously invoking the Lambda function"), and pushes CloudWatch Metric Streams records onto an internal NLB → OpenTelemetry collector fleet on EC2. Canonicalises Lambda in a third structurally distinct event-source role (after the Kinesis async-event-source shape and the synchronous Function URL / API Gateway shape): the Firehose synchronous transform invocation, where the Lambda's return value is the delivery payload (not just success/fail), and the function's HTTP-egress on the synchronous path is what crosses the public/private network boundary. See patterns/firehose-lambda-transform-as-vpc-bridge.

  • Lambda as benchmarking harness, not application target. Liz van Dijk (PlanetScale, 2022-11-01) uses Lambda's horizontal-concurrency scaling as the load generator for a 1,000,000-concurrent-connection benchmark against a PlanetScale database. Test shape: 1,000 Lambda worker functions × 1,000 connections each = 1M aggregate, chosen specifically "to stay within the Lambda runtime limits" — namely the default open_files=1024 hard limit per function container and the default function_concurrency=1000 per-function cap. Each worker opens its MySQL connections using go-sql-driver/mysql, sends a handshake-probe SELECT, waits on a barrier for peers, then holds for a few minutes. Aggregate ramp-to-plateau: under 2 minutes. Canonical new concept concepts/lambda-fanout-benchmark — Lambda's horizontal scaling + per-container resource constraints make it a natural fan-out load generator. First wiki framing of Lambda in the test-harness role rather than the application-runtime role. Connects to patterns/custom-benchmarking-harness and to the downstream patterns/two-tier-connection-pooling pattern that the benchmark validates on the PlanetScale side.

  • sources/2024-11-15-allthingsdistributed-aws-lambda-prfaq-after-10-years — the internal PR/FAQ that launched Lambda, with 10-year retrospective annotations.

  • sources/2026-01-30-aws-sovereign-failover-design-digital-sovereignty — named as the rotation-engine substrate for Secrets Manager-stored IAM-user credentials in the cross-partition auth IAM-user fallback pattern.
  • sources/2026-04-21-figma-enforcing-device-trust-on-code-changes — Figma runs its commit-signature verifier as a stateless Lambda behind a Function URL webhook endpoint from GitHub; canonical instance of patterns/webhook-triggered-verifier-lambda backing systems/figma-commit-signature-verification. The Lambda loads GitHub-App credentials from Secrets Manager and posts a required commit status check gating release-branch merges.
  • sources/2026-02-05-aws-convera-verified-permissions-fine-grained-authorization — Lambda serves two stateless roles on Convera's authorization hot path: (1) the Lambda authorizer in front of API Gateway that validates JWTs and calls AVP for fine-grained policy evaluation; (2) the pre-token-generation hook that enriches the Cognito access token with authorization-relevant attributes (role, tenant_id, etc.) from RDS / DynamoDB at login time. Same shape reused across customer, internal Okta-federated, machine-to-machine, and multi-tenant flows.
  • sources/2026-02-25-aws-6000-accounts-three-people-one-platform — the most extreme production instance of Lambda's scale-to-zero + per-invocation billing tenets paying off: ProGlove runs ~1,000,000 Lambda functions in production across ~6,000 tenant accounts (full account-per-tenant architecture). The function-count is economically viable only because idle functions cost nothing — the post explicitly cites Lambda as the exemplar scale-to-zero service that makes account-per-tenant feasible, contrasting with provisioned-per-resource services like EC2 ($3/mo → $3,000/mo at 1,000-account multiplier). Per-account Lambda concurrent-executions quota is called out as the canonical distributed-quota-management problem (concepts/per-account-quotas) — heavy-load tenants self-throttle their account; a central "single pane of glass" quota dashboard is essential.

  • sources/2026-04-21-figma-server-side-sandboxing-virtual-machines — Figma's security team names Lambda as the VM-grade sandbox of choice for stateless, potentially-exploitable workloads: link-preview metadata fetch (FigJam) and canvas image fetch, both of which run third-party URLs + ImageMagick. Architectural shape: Lambda → Firecracker micro-VM, placed outside the production VPC, no IAM permissions into internal services — canonical patterns/minimize-vm-permissions. The post enumerates Lambda-specific gotchas Figma hardened against: the localhost runtime API (SSRF pivot — application code must never reach localhost), over-privileged configuration ("not raw compute" — default-deny IAM + VPC), and the account-level concurrent-execution quota as a shared resource across all Lambdas in the account/region. Disclosed latency reality: first un-warmed call up to 10 seconds; warmed average lower, but still required "direct engineering efforts into ensuring that we were minimizing startup and processing costs as much as possible." Routing-within-tenant property explicitly noted as a Figma-customer constraint: AWS reuses Lambda VMs across requests from the same tenant because Firecracker boot times are still too expensive per synchronous request — Figma has "minimal control over routing at this level," and accepts the trade-off.

  • sources/2026-04-01-aws-automate-safety-monitoring-with-computer-vision-and-generative-ai — Lambda plays multiple stateless roles across the CV-safety pipeline: post-processing + annotation format conversion, model promotion via code-review PR, per-minute risk-aggregation, scheduled risk-resolution + SLA-exhaustion checks, tape-labeling preparation. AWS service-team collaboration required to raise concurrent-execution quotas for thousands of concurrent workers across accounts + per-Lambda memory-allocation + multithreading optimisations + SQS batch-size tuning — explicit reminder that Lambda at hundreds-of-sites scale is not purely declarative, the substrate must be tuned.

  • sources/2026-04-08-aws-build-a-multi-tenant-configuration-system-with-tagged-storage-patterns — Lambda as the invalidator compute in event-driven config refresh. Triggered by EventBridge rule on Parameter Store changes, extracts tenantId from the path, queries Cloud Map for healthy Config Service instances, and makes parallel gRPC refresh RPCs to each — the in-place cache update is what makes "zero downtime" real (no service restarts, no connection drops). Canonical shape for stateless-reactive-invalidator compute.

  • sources/2025-12-11-aws-architecting-conversational-observability-for-cloud-applications — Lambda as telemetry-normalization + embedding-generation compute in the RAG variant of AWS's EKS conversational-observability blueprint. Consumes records from Kinesis Data Streams, normalizes per-source telemetry, and calls Bedrock's Titan Embeddings v2 on batches of records before writing to OpenSearch Serverless. Canonical wiki instance of Lambda in the telemetry-to-RAG ingest tier; AWS's explicit "Pro tip" is that Kinesis-event-source batching is the primary cost lever at this layer.

  • sources/2025-04-08-flyio-our-best-customers-are-now-robots — Lambda named as the canonical start-latency-positional comparator for Fly Machines, with the first wiki disclosure that Fly Machines run on Lambda's hypervisor. Canonical wiki quote: "Not coincidentally, our underlying hypervisor engine is the same as Lambda's. […] Like a Lambda invocation, a Fly Machine can start like it's spring-loaded, in double-digit millis. But unlike Lambda, it can stick around as long as you want it to: you can run a server, or a 36-hour batch job, just as easily in a Fly Machine as in an EC2 VM." Fly.io's start vs create lifecycle — the subject of the 2025-04-08 post — borrows Lambda's fast-start but adds a stateful-filesystem stop between invocations, letting Fly Machines stay in stopped state for hours without billing and resume at invocation latency. The Fly-vs-Lambda positioning is Lambda as one axis of the Lambda–EC2 hybrid: Fly Machines are Lambda-like on start latency and EC2-like on runtime duration + state persistence. First wiki confirmation that non-GPU Fly Machines share the Firecracker substrate with Lambda (the 2025-02-14 GPU retrospective disclosed the GPU-Machines-on-Cloud-Hypervisor split; this 2025-04-08 post anchors down non-GPU = Firecracker = Lambda's hypervisor).

  • sources/2026-04-22-allthingsdistributed-invisible-engineering-behind-lambdas-network — canonical wiki disclosure of Lambda's decade-long networking retrofit — the "invisible engineering" that unified Lambda's traditional + SnapStart topologies onto one worker and scaled snapshot-network density 200 → 4,000 per worker (20×) while the platform kept serving customer traffic at full scale. Names the specific kernel + eBPF techniques that made it work: (1) eBPF-based Geneve header rewrite on egress/ingress (concepts/geneve-tunnel-vni, patterns/ebpf-header-rewrite-on-egress) cut tunnel latency from 150 ms → 200 μs (~750×) — Lambda pre-creates tunnels with dummy VNIs during pooling and rewrites to the real VNI once function init produces it; (2) stateless NAT via eBPF replaced the dual-stage stateful iptables + conntrack (concepts/double-nat) at 100× lower setup latency; (3) pre-create all 4,000 networks at worker init (~3 minutes of boot cost) — the canonical instance of Colm MacCárthaigh's constant-work principle; (4) per-slot iptables in namespace compressed root-namespace rules from 125,000+ to 144 static slot-agnostic rules; (5) RTNL-lock-friendly ordering (pool namespaces first, create veth inside namespace, batch eBPF attachments) removed the "seconds → minutes" parallel-create stall. Combined result: fleet-wide −1% CPU usage and full VPC cold-start unlock (Geneve portion; DHCP is still open, "a multi-phase effort the team is currently working through"). Architectural leverage: latency optimization relaxed density + cross-AZ evacuation as side effects. Build/rejected decisions disclosed: custom kernel driver rejected to avoid upstream-patch-maintenance burden; eBPF chosen over DPDK on less-overhead + more-control axis, with Cilium cited as the at-scale eBPF existence proof that de-risked the bet. Productization arc: the full networking stack was encapsulated as a service that Aurora DSQL now consumes — DSQL requests/uses/releases network slots via a Lambda-owned service; Lambda vends new versions and every optimization flows to DSQL automatically (patterns/encapsulate-optimization-as-internal-service). First wiki disclosure that DSQL consumes Lambda's networking substrate as an internal managed service, not as a copy of the stack. Werner's framing: Marc Olson's "converting a propeller aircraft to a jet while it's in flight. One mistake and the plane goes down. But get it right… and no one notices."

  • Lambda as PHP-runtime host for Laravel via Bref, benchmarked against PlanetScale. Canonical wiki demonstration that the persistent-process-within-a-Lambda-execution-context trick (patterns/persistent-process-for-serverless-php-db-connections) recovers ~5× p50 latency without giving up autoscale: classic shares-nothing Bref holds p50 75 ms / p95 130 ms at 3,800 req/min burst from cold start (query cost itself is 0.3 ms — the remainder is TLS handshake + MySQL auth + Laravel bootstrap re-paid on every invocation). Switching to Laravel Octane with OCTANE_PERSIST_DATABASE_SESSIONS=1 + BREF_LOOP_MAX=250 pulls p50 to 14 ms (5.4×) and p95 to 35 ms (3.7×) by keeping the PHP process + DB connection warm across up to 250 invocations per Lambda execution context. Also canonicalises Lambda + Bref cold-start tail datum: ~1 s cold starts, but <1% of requests in the first minute under 50-concurrent fan-out (first ~50 invocations out of 3,800+). Matthieu Napoli (Bref creator), 2023-05-03. First wiki framing of Lambda as a host for non-Node, non-Python, non-Java language runtimes via third-party community runtimes, and first wiki instance of the persistent-request-handler-inside-serverless-invocation pattern that generalises across languages.

  • sources/2026-04-23-aws-modernizing-kyc-with-aws-serverless-solutions-and-agentic-ai — Lambda as the MSK → AgentCore integration shim in IBM + AWS's KYC architecture. "Lambda functions serve as the integration layer, consuming events from MSK, invoking AgentCore asynchronously, and publishing results back to Kafka topics for downstream system consumption." Also the implementation of Action Groups' OpenAPI-declared tool targets behind AgentCore Gateway. Canonical AgentCore-integration shape; see patterns/async-agent-invocation-over-kafka.
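The Firehose synchronous transform role listed above, where the function's return value is the delivery payload, can be sketched as a handler. The event and response shapes (`records`, `recordId`, base64 `data`, and a `result` of `Ok`, `Dropped`, or `ProcessingFailed`) follow the Firehose data-transformation contract. Everything else is an assumption: `make_handler` and `forward` are hypothetical names, `forward` stands in for the HTTP push to the in-VPC collector, and passing the original data back unchanged on success is an illustrative choice, not something the source post specifies.

```python
import base64
from typing import Any, Callable, Dict, List


def make_handler(forward: Callable[[bytes], None]):
    """Build a Firehose transform handler that forwards each record via `forward`.

    `forward` is a hypothetical stand-in for the push to a private endpoint.
    """
    def handler(event: Dict[str, Any], context: Any) -> Dict[str, List[Dict[str, str]]]:
        results: List[Dict[str, str]] = []
        for record in event["records"]:
            payload = base64.b64decode(record["data"])
            try:
                forward(payload)
                result = "Ok"
            except Exception:
                # Report the failure per record so Firehose can handle it.
                result = "ProcessingFailed"
            results.append({
                "recordId": record["recordId"],  # must echo the incoming id
                "result": result,
                "data": record["data"],          # assumption: pass data through
            })
        return {"records": results}
    return handler


if __name__ == "__main__":
    sent: List[bytes] = []
    handler = make_handler(sent.append)  # collects payloads instead of sending
    event = {"records": [{"recordId": "1",
                          "data": base64.b64encode(b"metric").decode("ascii")}]}
    print(handler(event, None), sent)
```

Injecting `forward` keeps the sketch free of network calls and makes the per-record success and failure paths easy to exercise in isolation.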

Last updated · 542 distilled / 1,571 read