SYSTEM Cited by 8 sources
AWS Lambda¶
AWS Lambda is AWS's function-as-a-service compute platform, launched in November 2014. It runs customer code in stateless invocations on AWS-managed infrastructure, scaling from one request per month to thousands per second, with no capacity planning and no charge for idle time. This page distils the architectural commitments Lambda made at launch and examines which of them held up.
Design tenets (from the 2014 PR/FAQ)¶
- Security without complexity — sandboxed execution; up-to-date patches without user action.
- Simple and easy — "NoOps"; a few sensible defaults; self-serve.
- Scales up and down (to zero) — same code path from 1 req/month to 1,000 req/s. See concepts/scale-to-zero.
- Cost effective at any scale — fine-grained pay-per-use; no idle charge. See concepts/fine-grained-billing.
- AWS integration — easy to call other AWS services from a function; make other services better by being an easy backing execution engine.
- Reliable — public latency/availability targets (99.99% at launch), higher internal bar.
(Source: sources/2024-11-15-allthingsdistributed-aws-lambda-prfaq-after-10-years)
Core model¶
- Stateless functions. Persistent state lives in S3 / DynamoDB / etc. Local filesystem is scratch, deleted between invocations. See concepts/stateless-compute.
- Event-driven invocation. Triggered by AWS events (S3 PUT/COPY, DynamoDB updates/streams, SNS, SQS), HTTP requests, SDK API calls, the CLI, or a schedule (cron-style). In the 2014 doc these were specific APIs (`SetUpdateHandler`, `SetTableHandler`); they shipped as unified event-source integrations.
- Placement engine as a compaction strategy. Per-account, the scheduler minimises the number of EC2 instances needed for that account's workload while holding latency/throughput/availability targets — bursty, heterogeneous, and short-lived workloads all pack efficiently.
- Sandbox isolation. At launch: single-tenant EC2 instances (no two customers on the same instance — expensive but non-negotiable for security). Today: Firecracker micro-VMs with many tenants per bare metal host. See systems/firecracker, concepts/micro-vm-isolation.
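A minimal sketch of the stateless, event-driven shape above. An in-memory dict stands in for the external store (DynamoDB/S3), and the event is a simplified S3-notification payload — names and shapes here are illustrative, not exact SDK structures:

```python
import json
import tempfile

# Stand-in for DynamoDB/S3: durable state lives OUTSIDE the function.
EXTERNAL_STORE = {}

def handler(event, context=None):
    """Triggered by an S3-PUT-shaped event; persists its result externally."""
    record = event["Records"][0]
    key = record["s3"]["object"]["key"]
    # Local filesystem is scratch only: valid for this invocation, then gone.
    with tempfile.NamedTemporaryFile("w+", suffix=".json") as scratch:
        json.dump({"key": key}, scratch)
        scratch.flush()
    # Anything that must survive the invocation goes to the external store.
    EXTERNAL_STORE[key] = "processed"
    return {"statusCode": 200, "key": key}

# Simulated S3 PUT event, matching the launch-era trigger list above.
event = {"Records": [{"s3": {"object": {"key": "uploads/report.csv"}}}]}
result = handler(event)
```

The handler assumes nothing from previous invocations — exactly the contract the placement engine exploits when packing workloads.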
Billing & limits evolution¶
| Dimension | 2014 PR/FAQ → launch | Today (2024) |
|---|---|---|
| Billing granularity | 250ms (doc) → 100ms (launch) | 1ms, no minimum |
| Memory per function | Up to 1 GB (doc) | Up to 10 GB |
| Package size | ZIP only | ZIP or container image up to 10 GB (2020) |
| Runtimes | Node only at launch (by design) | Java, Python, .NET, Go, Ruby, Node, + Custom Runtimes + Layers (2018) |
| Cold-start mitigation | "higher first time / when not used recently" | SnapStart (2022) — up to 90% cold-start reduction for Java, built on Firecracker snapshots |
(Source: sources/2024-11-15-allthingsdistributed-aws-lambda-prfaq-after-10-years)
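The granularity shift in the table is easy to quantify. A hedged sketch — the per-GB-second price is approximate (roughly the published x86 rate); the rounding rule, not the dollar figure, is the point:

```python
import math

PRICE_PER_GB_SECOND = 0.0000166667  # approximate published rate

def billed_cost(duration_ms, memory_gb, granularity_ms):
    # Duration is rounded UP to the billing granularity before pricing.
    billed_ms = math.ceil(duration_ms / granularity_ms) * granularity_ms
    return (billed_ms / 1000) * memory_gb * PRICE_PER_GB_SECOND

# A 12 ms invocation at 128 MB:
at_100ms = billed_cost(12, 0.125, 100)  # launch-era 100 ms granularity
at_1ms   = billed_cost(12, 0.125, 1)    # today's 1 ms granularity
# 100 ms granularity bills 100 ms of duration for a 12 ms invoke —
# roughly 8.3x the billed duration of the 1 ms case.
```

For short invocations the move from 100 ms to 1 ms granularity is the dominant cost change; long invocations see almost none.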
Deliberate launch scope choices¶
- Node.js only at GA to validate the programming model and observe how customers actually used the service; other runtimes followed over 4 years of iteration. See patterns/launch-minimal-runtime.
- ZIP-only packaging as a radical simplification vs. uploading files individually. Container images were deferred to 2020 once the on-demand loading problem was solved.
- Native library support from day one, anticipating that dependency chains would pull in C extensions even if the user didn't write any — later generalised to Lambda Layers (2018).
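The ZIP-only simplification amounts to "one archive is the deployment unit." A sketch of building a launch-style artifact (file names hypothetical; a real package would hold the actual handler and its dependency tree):

```python
import io
import zipfile

def build_package(files):
    """files: dict of archive path -> source text; returns ZIP bytes."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for path, text in files.items():
            zf.writestr(path, text)
    return buf.getvalue()

# Handler plus a vendored dependency, exactly as they would sit in the ZIP.
package = build_package({
    "index.js": "exports.handler = async () => ({ statusCode: 200 });",
    "node_modules/dep/index.js": "module.exports = {};",
})
names = zipfile.ZipFile(io.BytesIO(package)).namelist()
```

Native libraries ride along as just more files in the archive — which is why day-one native support cost little and generalised cleanly into Layers.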
Operational commitments¶
- Latency: published from a canary EC2 client invoking an echo function; internal measurement includes server-side latency, code caching efficacy, invoke-to-customer-code latency.
- Throughput: per-host paging rate, CPU utilisation, network bandwidth; sustained high readings mean adding EC2 capacity or shifting the fleet mix.
- Availability: control-plane availability reported like other AWS services; per-application invoke-plane availability reported via CloudWatch.
(Source: sources/2024-11-15-allthingsdistributed-aws-lambda-prfaq-after-10-years)
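The canary-style latency measurement can be sketched as follows. The remote SDK Invoke against a deployed echo function is replaced here by a local echo call, so only the measurement shape — repeated round-trips, sorted samples, percentiles — is shown:

```python
import statistics
import time

def echo(payload):
    # Local stand-in for an SDK Invoke of a deployed echo function.
    return payload

def measure(invoke, payload, n=200):
    samples = []
    for _ in range(n):
        start = time.perf_counter()
        assert invoke(payload) == payload  # echo round-trip sanity check
        samples.append((time.perf_counter() - start) * 1000)
    samples.sort()
    return {
        "p50_ms": statistics.median(samples),
        "p99_ms": samples[int(0.99 * (len(samples) - 1))],
    }

stats = measure(echo, {"ping": 1})
```

The client-side number published this way deliberately includes everything between the caller and the function — which is why the internal bar also tracks code caching and invoke-to-customer-code latency separately.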
Seen in¶
- sources/2024-11-15-allthingsdistributed-aws-lambda-prfaq-after-10-years — the internal PR/FAQ that launched Lambda, with 10-year retrospective annotations.
- sources/2026-01-30-aws-sovereign-failover-design-digital-sovereignty — named as the rotation-engine substrate for Secrets Manager-stored IAM-user credentials in the cross-partition auth IAM-user fallback pattern.
- sources/2026-04-21-figma-enforcing-device-trust-on-code-changes — Figma runs its commit-signature verifier as a stateless Lambda behind a Function URL webhook endpoint from GitHub; canonical instance of patterns/webhook-triggered-verifier-lambda backing systems/figma-commit-signature-verification. The Lambda loads GitHub-App credentials from Secrets Manager and posts a required commit status check gating release-branch merges.
- sources/2026-02-05-aws-convera-verified-permissions-fine-grained-authorization — Lambda serves two stateless roles on Convera's authorization hot path: (1) the Lambda authorizer in front of API Gateway that validates JWTs and calls AVP for fine-grained policy evaluation; (2) the pre-token-generation hook that enriches the Cognito access token with authorization-relevant attributes (role, tenant_id, etc.) from RDS / DynamoDB at login time. Same shape reused across customer, internal Okta-federated, machine-to-machine, and multi-tenant flows.
- sources/2026-02-25-aws-6000-accounts-three-people-one-platform — the most extreme production instance of Lambda's scale-to-zero + per-invocation billing tenets paying off: ProGlove runs ~1,000,000 Lambda functions in production across ~6,000 tenant accounts (full account-per-tenant architecture). The function count is economically viable only because idle functions cost nothing — the post explicitly cites Lambda as the exemplar scale-to-zero service that makes account-per-tenant feasible, contrasting with provisioned-per-resource services like EC2 ($3/mo → $3,000/mo at a 1,000-account multiplier). The per-account Lambda concurrent-executions quota is called out as the canonical distributed-quota-management problem (concepts/per-account-quotas) — heavy-load tenants self-throttle their own account; a central "single pane of glass" quota dashboard is essential.
- sources/2026-04-21-figma-server-side-sandboxing-virtual-machines — Figma's security team names Lambda as the VM-grade sandbox of choice for stateless, potentially-exploitable workloads: link-preview metadata fetch (FigJam) and canvas image fetch, both of which run third-party URLs + ImageMagick. Architectural shape: Lambda → Firecracker micro-VM, placed outside the production VPC, no IAM permissions into internal services — canonical patterns/minimize-vm-permissions. The post enumerates Lambda-specific gotchas Figma hardened against: the localhost runtime API (SSRF pivot — application code must never reach localhost), over-privileged configuration ("not raw compute" — default-deny IAM + VPC), and the account-level concurrent-execution quota as a shared resource across all Lambdas in the account/region. Disclosed latency reality: first un-warmed call up to 10 seconds; warmed average lower, but still required "direct engineering efforts into ensuring that we were minimizing startup and processing costs as much as possible." The routing-within-tenant property is explicitly noted as a Figma-customer constraint: AWS reuses Lambda VMs across requests from the same tenant because Firecracker boot times are still too expensive per synchronous request — Figma has "minimal control over routing at this level," and accepts the trade-off.
- sources/2026-04-01-aws-automate-safety-monitoring-with-computer-vision-and-generative-ai — Lambda plays multiple stateless roles across the CV-safety pipeline: post-processing + annotation format conversion, model promotion via code-review PR, per-minute risk aggregation, scheduled risk-resolution + SLA-exhaustion checks, and tape-labeling preparation. AWS service-team collaboration was required to raise concurrent-execution quotas for thousands of concurrent workers across accounts, plus per-Lambda memory-allocation + multithreading optimisations and SQS batch-size tuning — an explicit reminder that Lambda at hundreds-of-sites scale is not purely declarative; the substrate must be tuned.
- sources/2026-04-08-aws-build-a-multi-tenant-configuration-system-with-tagged-storage-patterns — Lambda as the invalidator compute in event-driven config refresh. Triggered by an EventBridge rule on Parameter Store changes, it extracts the `tenantId` from the parameter path, queries Cloud Map for healthy Config Service instances, and makes parallel gRPC refresh RPCs to each — the in-place cache update is what makes "zero downtime" real (no service restarts, no connection drops). Canonical shape for stateless-reactive-invalidator compute.
- sources/2025-12-11-aws-architecting-conversational-observability-for-cloud-applications — Lambda as telemetry-normalization + embedding-generation compute in the RAG variant of AWS's EKS conversational-observability blueprint. Consumes records from Kinesis Data Streams, normalizes per-source telemetry, and calls Bedrock's Titan Embeddings v2 on batches of records before writing to OpenSearch Serverless. Canonical wiki instance of Lambda in the telemetry-to-RAG ingest tier; AWS's explicit "Pro tip" is that Kinesis-event-source batching is the primary cost lever at this layer.
- sources/2025-04-08-flyio-our-best-customers-are-now-robots — Lambda named as the canonical start-latency comparator for Fly Machines, with the first wiki disclosure that Fly Machines run on Lambda's hypervisor. Canonical wiki quote: "Not coincidentally, our underlying hypervisor engine is the same as Lambda's. […] Like a Lambda invocation, a Fly Machine can start like it's spring-loaded, in double-digit millis. But unlike Lambda, it can stick around as long as you want it to: you can run a server, or a 36-hour batch job, just as easily in a Fly Machine as in an EC2 VM." Fly.io's `start` vs `create` lifecycle — the subject of the 2025-04-08 post — borrows Lambda's fast start but adds a stateful-filesystem `stop` between invocations, letting Fly Machines stay in stopped state for hours without billing and resume at invocation latency. The positioning frames Fly Machines as a Lambda–EC2 hybrid: Lambda-like on start latency, EC2-like on runtime duration + state persistence. First wiki confirmation that non-GPU Fly Machines share the Firecracker substrate with Lambda (the 2025-02-14 GPU retrospective disclosed the GPU-Machines-on-Cloud-Hypervisor split; this 2025-04-08 post pins down non-GPU = Firecracker = Lambda's hypervisor).
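The authorizer role in the Convera entry above can be sketched as a TOKEN-authorizer handler returning the policy document shape API Gateway expects. JWT validation and the AVP call are stubbed, and the claim names (role, tenant_id) are illustrative, not Convera's actual schema:

```python
def validate_jwt(token):
    # Stub: a real authorizer verifies signature/expiry against the issuer's
    # JWKS, then calls Verified Permissions for the policy decision.
    if token == "valid-token":  # placeholder credential for the sketch
        return {"sub": "user-1", "role": "payments:read", "tenant_id": "t-42"}
    return None

def handler(event, context=None):
    """Lambda-authorizer shape for an API Gateway TOKEN authorizer."""
    claims = validate_jwt(event.get("authorizationToken", ""))
    effect = "Allow" if claims else "Deny"
    return {
        "principalId": claims["sub"] if claims else "anonymous",
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [{
                "Action": "execute-api:Invoke",
                "Effect": effect,
                "Resource": event["methodArn"],
            }],
        },
        # Context keys flow through to the backend integration.
        "context": claims or {},
    }

event = {"authorizationToken": "valid-token",
         "methodArn": "arn:aws:execute-api:eu-west-1:123456789012:api/*/GET/payments"}
decision = handler(event)
```

The pre-token-generation hook is the mirror image: instead of evaluating a policy per request, it stamps the same attributes into the Cognito token once at login.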
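The invalidator's first step in the multi-tenant-configuration entry above — pulling the tenant id out of the changed Parameter Store path carried in the EventBridge event — can be sketched like this. The /config/&lt;tenantId&gt;/… path convention is an assumption for illustration, not the post's actual layout:

```python
def tenant_from_event(event):
    """Extract the tenant id from an SSM Parameter Store Change event."""
    name = event["detail"]["name"]  # e.g. "/config/t-42/feature-flags"
    parts = name.strip("/").split("/")
    if len(parts) < 2 or parts[0] != "config":
        raise ValueError(f"unexpected parameter path: {name}")
    return parts[1]

# Simplified EventBridge event as delivered to the invalidator Lambda.
event = {
    "detail-type": "Parameter Store Change",
    "detail": {"name": "/config/t-42/feature-flags", "operation": "Update"},
}
tenant_id = tenant_from_event(event)
# Next steps (not shown): query Cloud Map for healthy Config Service
# instances, then fan out parallel gRPC refresh RPCs to each.
```

Keeping this step pure parsing makes the invalidator trivially stateless — all the interesting state lives in Cloud Map and the target services' in-memory caches.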