CONCEPT Cited by 1 source
Central-first sharded architecture
Central-first sharded architecture pairs a global coordinator with regional shards: the coordinator presents a single interface to clients (engineers, agents, APIs), while the shards keep sensitive or regulated data local. Requests are routed through the central layer and fan out to the region that owns the data; responses carry back only non-sensitive projections.
This topology lets a centralized agent or control plane reason across a multi-region, multi-cloud, multi-regulatory fleet without aggregating data outside its jurisdiction.
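As a rough illustration of the request path described above, here is a minimal Python sketch. All names in it (`Shard`, `Coordinator`, `SENSITIVE_FIELDS`, the region and database ids) are hypothetical and not drawn from any real system. It assumes the projection is applied inside the shard, so sensitive fields never reach the coordinator.

```python
from __future__ import annotations
from dataclasses import dataclass, field

# Hypothetical field names; a real deployment would derive these from policy.
SENSITIVE_FIELDS = {"customer_email", "query_text"}


@dataclass
class Shard:
    """A regional shard that holds its databases' records locally."""
    region: str
    records: dict[str, dict] = field(default_factory=dict)

    def describe(self, db_id: str) -> dict:
        # Project inside the shard so sensitive fields never leave the region.
        record = self.records[db_id]
        return {k: v for k, v in record.items() if k not in SENSITIVE_FIELDS}


@dataclass
class Coordinator:
    """Global entrypoint: callers name a database, not a region."""
    shards: dict[str, Shard]
    owner_of: dict[str, str]  # db_id -> owning region

    def describe(self, db_id: str) -> dict:
        region = self.owner_of[db_id]           # locate the owning shard
        result = self.shards[region].describe(db_id)
        return {"region": region, **result}


eu = Shard("eu-west", {"db-1": {"status": "degraded", "customer_email": "a@example.com"}})
us = Shard("us-east", {"db-2": {"status": "healthy", "query_text": "SELECT 1"}})
coordinator = Coordinator(
    shards={"eu-west": eu, "us-east": us},
    owner_of={"db-1": "eu-west", "db-2": "us-east"},
)
print(coordinator.describe("db-1"))  # {'region': 'eu-west', 'status': 'degraded'}
```

The caller never mentions a region. The ownership map and the projection are the only places where topology and sensitivity are encoded.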
Why not fully centralized or fully decentralized
- Fully centralized (replicate everything into one store) violates data-residency and regulatory constraints, and risks a separate compliance failure in each regulated domain.
- Fully decentralized (no global view) forces every caller to know the topology — for an AI agent, that means region-specific logic, per-cloud code paths, and context fragmentation. Iteration loops slow to a crawl because the agent has nowhere consistent to run.
- Central-first sharded gives callers one entrypoint and one authorization model, while keeping the data local to where it's legal and where latency is best.
Requirements
- Uniform abstractions at the global layer. Regional API differences are hidden — the agent doesn't write per-region code paths.
- Fine-grained access control. Enforced at team / resource / RPC levels, uniformly across humans and automated callers. Without this, the global front becomes either over-permissive (unsafe) or over-restrictive (useless). See concepts/least-privileged-access.
- Routing with locality-aware policy. Requests land in the correct shard; cross-shard fanout is explicit and observable.
- Consistent telemetry back. Metrics + logs visible globally for debugging, even when underlying data stays sharded.
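The access-control, routing, and telemetry requirements can be sketched together in a few lines of Python. This is an illustrative model only: `Grant`, `GlobalFront`, the team names, and the `GetMetrics` RPC are assumptions made for the example, and the deny-by-default rule is one possible reading of "fine-grained access control".

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Grant:
    """One allow rule scoped by team, resource, and RPC (hypothetical model)."""
    team: str
    resource: str  # a database id, or "*" for any resource
    rpc: str


@dataclass
class GlobalFront:
    grants: set[Grant]
    owner_of: dict[str, str]  # db_id -> owning region
    audit_log: list[dict] = field(default_factory=list)

    def is_allowed(self, team: str, resource: str, rpc: str) -> bool:
        # Same check for humans and automated callers: deny unless granted.
        return (Grant(team, resource, rpc) in self.grants
                or Grant(team, "*", rpc) in self.grants)

    def call(self, team: str, rpc: str, db_ids: list[str]) -> dict[str, list[str]]:
        """Authorize each target, then group the permitted ones by region."""
        plan: dict[str, list[str]] = {}
        for db_id in db_ids:
            allowed = self.is_allowed(team, db_id, rpc)
            region = self.owner_of[db_id]
            # Telemetry is recorded centrally even though the data stays sharded.
            self.audit_log.append(
                {"team": team, "rpc": rpc, "db": db_id, "region": region, "allowed": allowed}
            )
            if allowed:
                plan.setdefault(region, []).append(db_id)
        return plan  # one entry per shard the request fans out to


front = GlobalFront(
    grants={Grant("storage-oncall", "*", "GetMetrics"), Grant("agent", "db-1", "GetMetrics")},
    owner_of={"db-1": "eu-west", "db-2": "us-east"},
)
print(front.call("agent", "GetMetrics", ["db-1", "db-2"]))  # {'eu-west': ['db-1']}
```

Returning the per-region plan makes cross-shard fanout explicit, and the audit log records denied targets as well as allowed ones, so both are observable from the global layer.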
Contrast
- Federated identity and federated query systems (e.g., Presto across clouds) solve a similar problem but tend to assume a read-only cross-region pattern and do not centralize write-side governance.
- Classic control-plane / data-plane split (concepts/control-plane-data-plane-separation) is the functional split (decide vs. deliver). Central-first sharded is the topological split (coordinator vs. regional shards) and typically composes with it — the coordinator is itself a control-plane tier.
Seen in
- sources/2025-12-03-databricks-ai-agent-debug-databases — Databricks' systems/storex uses central-first sharded as the AI-integration precondition: without one interface + one auth model + one abstraction layer, an agent reasoning across "thousands of databases, hundreds of regions, three clouds, eight regulatory domains" would face context fragmentation, governance ambiguity, and slow iteration.