PATTERN Cited by 1 source
SRE team per Product Cluster
SRE team per Product Cluster is the organisational shape that assigns one SRE team to each Product Cluster (a grouping of 5–20 delivery teams working on a related product domain), rather than running one central SRE team for the whole company or embedding one SRE in each delivery team.
The three alternatives
Zalando's retrospective on its 2016 SRE debate names the three options it considered:
- One central SRE team: rejected. Zalando already had 1,000+ engineers, and no central team could cover that surface. This option stops scaling once an org grows past a few hundred engineers.
- One SRE per delivery team (embed): rejected. "The scope would be too large for the lone SREs. Not to mention that, over time, they'd likely become the Ops engineer for the team they were in." (Source: sources/2021-09-12-zalando-tracing-sres-journey-in-zalando-part-i). Fails because a lone SRE tends to be absorbed into the host team's day-to-day Ops work.
- One SRE team per Product Cluster: chosen. Gives SREs end-to-end responsibility over a domain without an unmanageably wide scope.
Why the middle ground
The Product Cluster granularity is the point at which:
- SREs have enough context to be effective (one domain, not the whole company).
- SREs are numerous enough (a team, not a lone embed) to avoid absorption into a single delivery team's ops work.
- The number of SRE teams scales with product growth rather than company headcount.
- Cross-cluster patterns emerge that a dedicated SRE department can later own (see Phase 3 of concepts/sre-organizational-evolution).
Reporting chain
This pattern pairs with the Google SRE workbook's guidance that reliability work is a specialised role: each Product Cluster SRE team reports into the SRE chain, not into product delivery, which guards against the regression-to-Ops failure mode seen with lone embedded SREs.
When to use
- Org is large enough that a central SRE team cannot cover the surface (roughly a few hundred engineers or more).
- Product domains are cohesive enough that "one team per cluster" is a meaningful granularity.
- There's organisational appetite to staff multiple SRE teams (not a single Head of SRE + contractors).
Seen in
- sources/2021-09-12-zalando-tracing-sres-journey-in-zalando-part-i — canonical debate and choice narrated directly.