
CONCEPT Cited by 3 sources

Monolith vs microservices pendulum

Definition

The observed oscillation of engineering orgs between monolith and microservices architectures as they grow, stall, or reorganize — with neither pole being a destination. A team that decomposed a monolith into services often finds itself years later introducing "macroservices" (fewer, bigger services) or a unified GraphQL data-access layer that effectively re-consolidates the logical API surface.

2026 Meta SilverTorch datapoint — pendulum applied to recsys retrieval

The wiki's first canonical instance of the pendulum applied inside an ML pipeline, not at the service-API layer (Source: sources/2026-05-26-meta-silvertorch-index-as-model-a-new-retrieval-paradigm-for-recommendation-systems):

  • Pre-SilverTorch: Recsys retrieval as a microservice mesh — orchestrator + user-tower + ANN + filter + scoring services, each in a different codebase, often a different language, with its own deployment lifecycle. Per-service optimisations like Faiss-GPU help individual hops but "don't resolve the underlying structural limits."
  • SilverTorch (Index as Model): All five components collapse into a single PyTorch model. "One artifact to deploy, one forward pass to run and one source of truth for what's in the system." The ANN search, eligibility filter, scoring layer, and user tower all become nn.Module regions of one neural network.

Outcome: 23.7× more requests per second + 20.9× TCO efficiency on the same model architecture, plus capabilities (neural reranking, multi-task scoring inside retrieval) that the prior microservice mesh structurally could not run inside the sub-100 ms retrieval budget.

Structural argument for why the pendulum swings to monolith here: the gains require cross-module GPU co-design ("pick the most promising clusters first, filter only inside those clusters, then score only the survivors"), and "this level of co-design requires modules to share memory, an execution graph, and a compilation step." Three failure modes of the prior mesh drove the swing, with version skew the most architecturally distinctive:

  • Latency lost to data movement between services.
  • Version inconsistency across independently-deployed user-tower / item-index / filter-rules artifacts.
  • Siloed ML and infra development, with ideas having to be translated between PyTorch and serving C++.
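
The "clusters first, filter inside them, score the survivors" ordering quoted above can be sketched as a single in-process function. This is a plain-Python toy, not SilverTorch code: the source describes these stages as nn.Module regions of one PyTorch model running on GPU, whereas the names below (Item, retrieve, n_clusters, top_k) and the dot-product scoring are assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    # Hypothetical item record; field names are illustrative only.
    item_id: int
    cluster: int
    embedding: tuple
    eligible: bool


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def retrieve(user_vec, centroids, items, n_clusters, top_k):
    """Cluster-first retrieval as one call: probe, filter, then score."""
    # 1. Rank clusters by centroid similarity and keep the most promising ones.
    ranked = sorted(centroids, key=lambda c: dot(user_vec, centroids[c]), reverse=True)
    probed = set(ranked[:n_clusters])
    # 2. Apply the eligibility filter only inside the probed clusters.
    survivors = [it for it in items if it.cluster in probed and it.eligible]
    # 3. Score only the survivors and return the best item ids.
    scored = sorted(survivors, key=lambda it: dot(user_vec, it.embedding), reverse=True)
    return [it.item_id for it in scored[:top_k]]


# Toy data: three clusters, five items, one of them ineligible.
demo_centroids = {0: (1.0, 0.0), 1: (0.0, 1.0), 2: (-1.0, 0.0)}
demo_items = [
    Item(1, 0, (0.9, 0.1), True),
    Item(2, 0, (1.0, 0.0), False),
    Item(3, 1, (0.1, 0.9), True),
    Item(4, 2, (-0.9, 0.0), True),
    Item(5, 0, (0.7, 0.2), True),
]
demo_result = retrieve((1.0, 0.2), demo_centroids, demo_items, n_clusters=1, top_k=2)
```

Because all three stages share one process and one set of data structures, there is no inter-service data movement and no way for the filter rules and the item index to come from different deployed versions, which is the structural point the failure-mode list makes.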

Generalisation: The pendulum swings toward monolith when cross-module co-design wins dominate independent-service deployability wins. In recsys retrieval-on-GPU, that condition holds because GPU hardware rewards dense parallel work + fused kernels + shared memory. In other contexts (independent feature pipelines feeding unrelated downstream consumers, multi-team product domains with genuinely independent lifecycles), the condition fails and the pendulum stays at services. The patterns/unified-pytorch-model-as-retrieval-system pattern captures the structural conditions for the recsys-retrieval instance.

This is distinct in altitude from the canonical wiki instances (Airbnb macroservices, Uber Project Ark, Stack Overflow's enduring monolith, Wave's Python-monolith-on-Postgres) — those re-consolidate the API surface across many domains; SilverTorch re-consolidates a single ML pipeline's services into one model graph. Same pendulum, different scope.

Canonical 2022 datapoints

  • Airbnb (Source: sources/2022-07-11-highscalability-stuff-the-internet-says-on-scalability-for-july-11th-2022):
    • 2008–2017 monolith + monorepo, ~$2.6B revenue.
    • 2017–2020 microservices (split by dedicated migration team).
    • 2020–present "micro + macroservices": unified APIs, central data aggregator, service-block facade APIs. Thrift → GraphQL as the unified data access layer.
  • Shopify — still runs a Ruby monolith, decomposing internally for developer productivity rather than splitting into services (Shopify engineering).
  • Stack Overflow — 14-year .NET monolith, 1.3B views/month on 9 on-prem servers; removed fragment cache years ago with no measurable latency impact (systems/stack-overflow-architecture).
  • Wave — $1.7B company, 70 engineers, "a Python monolith on top of Postgres."
  • Steven Lemon: "rather than separate our monolith into separate services, we started to break our solution into separate projects within the existing monolith."

2024 Uber datapoint — the 2011→2014→2018→2020 pendulum

The textbook real-world instance on this wiki. Per Josh Clemm's 2024-03-14 retrospective (sources/2024-03-14-highscalability-brief-history-of-scaling-uber):

  • 2009 → 2011: LAMP monolith → PHP/MySQL.
  • 2011: two-monolith — dispatch (Node.js) + API (Python).
  • 2013–14: SOA decomposition of the API monolith into ~100 Python/Tornado microservices, with the whole service-oriented architecture platform built to support the split (TChannel, Hyperbahn, Thrift, Clay, Flipr, M3, Jaeger).
  • 2018: The pendulum swings back toward consolidation. Project Ark is Uber's response to the distributed monolith of "thousands of microservices, 12,000 repos, 5-6 systems doing 75% the same thing": languages are consolidated to Java + Go, and 12,000 repos collapse into 5 per-language monorepos.
  • 2020: Partial re-swing toward structured services with the four-layer Edge Gateway — explicit Edge / Presentation / Product / Domain service tiers. Not a return to monolith; a right-sized-services structure with clear contracts.
  • 2021+: The Fulfillment Platform rewrite re-consolidates the matching stack into a single NewSQL-backed platform, explicitly because the 2014-era microservice sprawl couldn't cleanly support new flow types (reservations, batching, airport queues).

The Uber arc shows the pendulum doesn't swing just twice — over 15 years, it oscillates continuously, each swing at a different level of the stack.

Core tension

Microservices sell velocity + team independence and buy distributed-systems tax — network hops, partial failure, cross-cutting feature complexity, org-chart coupling. Monoliths sell simplicity, low latency, single-deploy and buy team contention and deploy-granularity coupling. Neither tool is universally right; the pendulum swings because the trade-offs are real in both directions.
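
The network-hop part of that tax can be made concrete with a back-of-the-envelope model. The sketch below assumes a strictly sequential request path and a fixed per-hop overhead; every number in it is hypothetical and chosen only to show the arithmetic, not taken from any of the cited sources.

```python
def end_to_end_ms(stage_compute_ms, per_hop_ms, in_process):
    """Latency of a sequential pipeline: compute time plus hop overhead.

    in_process=True models a monolith (no network hops between stages);
    in_process=False models one service per stage (one hop between each pair).
    """
    hops = 0 if in_process else len(stage_compute_ms) - 1
    return sum(stage_compute_ms) + hops * per_hop_ms


# Hypothetical per-stage compute times in milliseconds (illustrative only).
stages = [2.0, 5.0, 8.0, 3.0, 6.0]

mesh_ms = end_to_end_ms(stages, per_hop_ms=4.0, in_process=False)
monolith_ms = end_to_end_ms(stages, per_hop_ms=4.0, in_process=True)
```

The model deliberately ignores what microservices buy in return (independent deploys, team autonomy), so it shows only one side of the trade-off.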

The characteristic failure mode in either direction: "you weren't doing it right", a self-fulfilling prophecy that prevents an honest retrospective (Source: sources/2022-07-11-highscalability-stuff-the-internet-says-on-scalability-for-july-11th-2022).

Heuristics from practitioners

  • jedberg (HN): "every startup I advise I tell them don't do microservices at the start. Build your monolith with clean hard edges between modules and functions so that it will be easier later, but build a monolith until you get big enough that microservices is actually a win."
  • @hkarthik: "The longer an engineer works in a monolith, the more they yearn to decompose it into micro-services. The longer an engineer works across a ton of micro-services, the more they yearn for the days of working in a monolith."
  • @jpetazzo (OH): "we have over 170 microservices, because our principal engineer is very knowledgeable about distributed systems."

Key insight

Right-sizing services is a continuous problem, not a destination. A monolith can be properly service-based on the inside without being distributed on the outside. The count of repos means nothing; what matters is internal code organization and API boundaries.
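
One way to picture "service-based on the inside without being distributed on the outside" is a module boundary expressed as an interface, with an in-process implementation behind it. The following is a minimal sketch under that assumption; BillingService, InProcessBilling, and Checkout are hypothetical names, not drawn from any of the cited systems.

```python
from typing import Protocol


class BillingService(Protocol):
    """The boundary other modules depend on (hypothetical interface)."""

    def charge(self, user_id: int, cents: int) -> str: ...


class InProcessBilling:
    """Monolith implementation: a plain function call, no network hop."""

    def __init__(self) -> None:
        self._ledger = {}  # user_id -> total cents charged

    def charge(self, user_id: int, cents: int) -> str:
        if cents <= 0:
            raise ValueError("cents must be positive")
        self._ledger[user_id] = self._ledger.get(user_id, 0) + cents
        return f"charged:{user_id}:{cents}"


class Checkout:
    """Depends only on the BillingService boundary, never on its internals."""

    def __init__(self, billing: BillingService) -> None:
        self._billing = billing

    def place_order(self, user_id: int, cents: int) -> str:
        return self._billing.charge(user_id, cents)


checkout = Checkout(InProcessBilling())
receipt = checkout.place_order(7, 500)
```

If billing ever needs its own deployment lifecycle, a remote client that satisfies the same interface could be passed to Checkout without changing it, which is roughly the "clean hard edges" advice in the heuristics above.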
