SYSTEM Cited by 1 source

Meta BLOB Storage

Definition

Meta BLOB Storage is Meta's global, infinitely scalable object-storage service built on top of Tectonic. It exposes policies that let users make trade-offs between durability and availability, and serves all of Meta's external and internal products requiring object storage.

Architecture (Post-2026 Re-architecture)

The 2026 re-architecture for AI workloads introduced three major changes:

  1. Unified metadata schema — collapsed the multi-layer metadata (namelayer, volumeslayer, containerlayer) into a single flat schema on ZippyDB, enabling O(1) path-to-address resolution
  2. Fat client SDK (no dataplane proxy) — eliminated the intermediate proxy; the SDK embeds a Tectonic BlockClient and streams data directly from storage servers
  3. Regional deployment — lean stack deployed as a regional service colocated with GPUs in every AI region, though also deployable as a global service

Request Flow (New)

Client SDK
  → getReadPlan("/bucket/path") to API server
  → O(1) metadata lookup in ZippyDB → (blockId, offset, size) tuples
  → SDK uses embedded BlockClient → streams directly from Tectonic
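
The flow above can be sketched in a few lines. This is a minimal illustration, not Meta's SDK: the `Extent` type, the in-memory `METADATA` and `BLOCKS` dictionaries (standing in for ZippyDB and Tectonic), `FakeBlockClient`, and `read_object` are all hypothetical names introduced here. Only the `getReadPlan` call and the (blockId, offset, size) tuple shape come from the description.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Extent:
    """One (blockId, offset, size) tuple from a read plan."""
    block_id: str
    offset: int
    size: int


# Hypothetical stand-in for the flat metadata schema: one key per object path.
METADATA: Dict[str, List[Extent]] = {
    "/bucket/path": [Extent("blk-1", 0, 5), Extent("blk-2", 2, 6)],
}

# Hypothetical stand-in for block contents held by storage servers.
BLOCKS: Dict[str, bytes] = {
    "blk-1": b"hello-unused",
    "blk-2": b"xx world!yy",
}


def get_read_plan(path: str) -> List[Extent]:
    """API-server side: a single key lookup resolves a path to its extents."""
    return METADATA[path]


class FakeBlockClient:
    """Stands in for the block client embedded in the SDK."""

    def read(self, extent: Extent) -> bytes:
        data = BLOCKS[extent.block_id]
        return data[extent.offset:extent.offset + extent.size]


def read_object(path: str, client: FakeBlockClient) -> bytes:
    """Client side: fetch the plan once, then read each extent directly."""
    plan = get_read_plan(path)
    return b"".join(client.read(extent) for extent in plan)
```

The point of the sketch is the shape of the path: one metadata round trip, after which the client talks to storage with no proxy in between.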

Spike Handling

  • Distributed data cache on GPU host spare memory (Owl peers integrated into SDK) — 80% cache hit rate
  • ReadPlan metadata cache in distributed memory (memcache-like) — 1–2 ms access
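
A read-through cache is one common way to build a layer like the ReadPlan metadata cache. The sketch below is a single-process approximation under stated assumptions: the `ReadThroughCache` class, its TTL policy, and the `loader` callback are hypothetical, and a real deployment would use distributed memory rather than a local dictionary.

```python
import time
from typing import Any, Callable, Dict, Tuple


class ReadThroughCache:
    """Serve from memory when possible; otherwise load and remember."""

    def __init__(
        self,
        loader: Callable[[str], Any],
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader          # called on a miss (e.g. metadata lookup)
        self._ttl = ttl_seconds        # assumed expiry policy, not from the source
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}  # key -> (expiry, value)
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            self.hits += 1
            return entry[1]
        # Miss or expired entry: go to the backing store and cache the result.
        self.misses += 1
        value = self._loader(key)
        self._entries[key] = (now + self._ttl, value)
        return value

    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

During a spike, many readers request the same plans at once, so most lookups are absorbed by the cache and only the first request per key reaches the metadata store.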

Protocol Optimizations

  • Hedged reads for tail-latency mitigation
  • Dynamic concurrency control for egress spike management during checkpoint events
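
Both optimizations can be illustrated generically. The source does not describe the actual algorithms, so the following is a sketch under assumptions: `hedged_read` issues a backup request when the first replica is slow, and `AimdLimiter` uses additive-increase/multiplicative-decrease, a standard technique swapped in here as one plausible form of dynamic concurrency control. All names and parameters are hypothetical.

```python
import asyncio
from typing import Any, Awaitable, Callable, List


async def hedged_read(
    replicas: List[Callable[[], Awaitable[Any]]], hedge_delay: float
) -> Any:
    """Try replicas in order, starting the next one whenever the requests
    in flight have not finished within hedge_delay seconds. Returns the
    first result to complete (an exception from that request propagates)."""
    if not replicas:
        raise ValueError("at least one replica is required")
    tasks: List[asyncio.Future] = []
    try:
        for read in replicas:
            tasks.append(asyncio.ensure_future(read()))
            done, _ = await asyncio.wait(
                tasks, timeout=hedge_delay, return_when=asyncio.FIRST_COMPLETED
            )
            if done:
                return done.pop().result()
        # Every replica has been tried; wait for whichever finishes first.
        done, _ = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
        return done.pop().result()
    finally:
        for task in tasks:
            task.cancel()  # no-op for finished tasks; stops the losers


class AimdLimiter:
    """Concurrency limit that grows slowly and shrinks quickly."""

    def __init__(self, initial: int = 8, minimum: int = 1,
                 maximum: int = 64, backoff: float = 0.5) -> None:
        self.limit = initial
        self.minimum = minimum
        self.maximum = maximum
        self.backoff = backoff

    def on_success(self) -> None:
        # Additive increase: probe for one more in-flight request.
        self.limit = min(self.maximum, self.limit + 1)

    def on_throttle(self) -> None:
        # Multiplicative decrease: back off sharply under egress pressure.
        self.limit = max(self.minimum, int(self.limit * self.backoff))
```

Hedging trades a small amount of duplicate work for a shorter tail, so the hedge delay is usually set near a high latency percentile. The limiter would be driven by throttling or timeout signals observed during checkpoint bursts.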

Research Velocity Mode

A tiered-cache architecture with on-demand hydration:

  • L1: GPU host memory; L2: GPU host flash; L3: regional BLOB-storage flash
  • Global HDD-backed BLOB storage as source of truth
  • Dataloader prefetch + deep prefetch API + automatic lifecycle (TTL/LRU)
  • Eliminated hours of data ingestion; researchers ingest once, access anywhere
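
On-demand hydration through a tier hierarchy can be sketched as follows. This is an illustrative model only: `LruTier` and `TieredReader` are hypothetical names, capacities are counted in items rather than bytes, eviction is plain LRU with no TTL, and a dictionary stands in for the global source of truth.

```python
from collections import OrderedDict
from typing import Dict, List, Optional, Tuple


class LruTier:
    """A fixed-capacity cache tier with least-recently-used eviction."""

    def __init__(self, name: str, capacity: int) -> None:
        self.name = name
        self.capacity = capacity
        self._items: "OrderedDict[str, bytes]" = OrderedDict()

    def get(self, key: str) -> Optional[bytes]:
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key: str, value: bytes) -> None:
        self._items[key] = value
        self._items.move_to_end(key)
        while len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict least recently used


class TieredReader:
    """Check tiers fastest-first; on a hit or a full miss, copy the object
    into every faster tier so the next read is served closer to the GPU."""

    def __init__(self, tiers: List[LruTier], source: Dict[str, bytes]) -> None:
        self.tiers = tiers      # ordered L1, L2, L3
        self.source = source    # stand-in for the global source of truth

    def read(self, key: str) -> Tuple[bytes, str]:
        for index, tier in enumerate(self.tiers):
            value = tier.get(key)
            if value is not None:
                self._hydrate(key, value, upto=index)
                return value, tier.name
        value = self.source[key]
        self._hydrate(key, value, upto=len(self.tiers))
        return value, "global"

    def _hydrate(self, key: str, value: bytes, upto: int) -> None:
        for tier in self.tiers[:upto]:
            tier.put(key, value)
```

A prefetch API fits the same model: it would call `read` ahead of the dataloader so that the first training access already lands in a fast tier.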

(Source: sources/2026-07-01-meta-ai-storage-blueprint-at-scale)
