In-place vs new-DC database upgrade¶
Definition¶
When upgrading a distributed datastore to a new major version, two architectural shapes are available:
- In-place rolling upgrade — the existing cluster is upgraded node by node, one version to the next. No new hardware is provisioned; the cluster passes through a mixed-version state during the upgrade window.
- New-DC upgrade — a fresh data center is provisioned on the old version, all data is streamed to it, its nodes are upgraded to the new version, and production traffic is redirected to the new DC. The old DC is then decommissioned.
For Cassandra specifically, DataStax's upgrade guide recommends rolling restart (in-place) over new-DC.
Trade-offs¶
| Axis | In-place | New-DC |
|---|---|---|
| Hardware cost during upgrade | 1× fleet | ~2× fleet (for duration) |
| Time | Hours–days (per cluster) | Weeks (streaming) |
| EBS / volume right-sizing | No, stuck with current | Yes, fresh provisioning |
| Config standardisation | Limited | Yes — all nodes provisioned to the new standard |
| Rollback shape | Reverse rolling upgrade | Redirect traffic back to old DC |
| Consistency during transition | EACH_QUORUM maintained | Downgraded during streaming — eventual-consistency window |
| Simultaneous version | Mixed in one cluster | Two separate clusters |
| Ops surface | One cluster in mixed state | Two clusters + streaming pipeline + traffic switch |
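The consistency row is worth unpacking. In Cassandra, a quorum at replication factor RF is floor(RF/2) + 1 acks, and EACH_QUORUM requires a quorum in every DC, which a still-streaming new DC cannot reliably provide; LOCAL_QUORUM consults only the coordinator's DC. A minimal sketch of the arithmetic (plain Python, not the driver API; RF values are illustrative):

```python
# Sketch of quorum arithmetic under Cassandra's rules:
# quorum(RF) = floor(RF/2) + 1 replica acks.

def quorum(rf: int) -> int:
    """Replica acks needed for a quorum at replication factor rf."""
    return rf // 2 + 1

def each_quorum_acks(rf_by_dc: dict) -> int:
    """Total acks EACH_QUORUM needs: a quorum in every DC."""
    return sum(quorum(rf) for rf in rf_by_dc.values())

def local_quorum_acks(rf_by_dc: dict, local_dc: str) -> int:
    """Acks LOCAL_QUORUM needs: a quorum in the coordinator's DC only."""
    return quorum(rf_by_dc[local_dc])

# Two DCs with RF=3 each (illustrative, not Yelp's topology):
rfs = {"dc-old": 3, "dc-new": 3}
print(each_quorum_acks(rfs))             # -> 4 (2 acks in each DC)
print(local_quorum_acks(rfs, "dc-old"))  # -> 2
```

The jump from 2 required acks to 4, with half of them in a DC that is still streaming, is the "downgraded during streaming" cell in the table.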
Why Yelp chose in-place¶
Yelp considered new-DC for EBS right-sizing + DC-specific config standardisation + easier rollback-via-traffic-redirect — and rejected it on three load-bearing grounds:
- Time: "would have involved streaming all data to the new DC and could have taken weeks to complete the upgrade."
- Consistency: "we would have had to account for eventual consistency due to downgrading from the EACH_QUORUM consistency level" during dual-DC operation.
- Cost: "Running twice the number of nodes per DC would also have significantly increased costs."
"We opted for an in-place upgrade to reduce time and cost."
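The time objection can be made concrete with back-of-envelope arithmetic. All figures below — per-node data, rebuild concurrency, per-stream throughput — are invented for illustration, not Yelp's numbers; only the ~1,000-node scale comes from the source:

```python
# Back-of-envelope: time to stream a full DC's worth of data, assuming
# rebuilds run a few nodes at a time so the source DC isn't overloaded.
# Every figure here is an illustrative assumption, not Yelp's data.

def streaming_days(nodes: int, tb_per_node: float,
                   concurrent_rebuilds: int, mb_per_s_each: float) -> float:
    """Days to rebuild `nodes` new nodes holding tb_per_node TB each,
    running `concurrent_rebuilds` rebuilds in parallel at
    mb_per_s_each MB/s sustained per rebuild."""
    total_bytes = nodes * tb_per_node * 1e12
    aggregate_rate = concurrent_rebuilds * mb_per_s_each * 1e6  # bytes/s
    return total_bytes / aggregate_rate / 86_400

# 1,000 nodes x 2 TB, 20 rebuilds at a time, 50 MB/s each:
print(round(streaming_days(1000, 2.0, 20, 50.0), 1))  # -> 23.1 days
```

At that rate the window is measured in weeks, and the old fleet keeps running the whole time — the ~2× cost row in the trade-offs table.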
When new-DC is still the right answer¶
In-place is Yelp's answer at their fleet size (more than 1,000 nodes, where the 2× cost multiplier is substantial), but new-DC is still the right answer when:
- The fleet is small enough that 2× cost is manageable.
- There is a real EBS right-sizing or config-standardisation win that an in-place upgrade can't deliver.
- The cluster is small enough to stream in a day rather than weeks.
- Regulatory or SLA constraints make the mixed-version state risky in ways that separate physical clusters avoid.
- The rollback story needs to be atomic traffic switch rather than reverse-rolling-upgrade.
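The checklist above can be sketched as a predicate. The gating structure — any decisive win counts, but only if cost and streaming time are bearable — is one reading of the bullets, not a prescribed policy, and the parameter names are hypothetical:

```python
# Hypothetical decision sketch: new-DC needs both a decisive win and
# bearable cost/time. The structure is an interpretation of the bullet
# list above, not an established rule.

def prefer_new_dc(*, fleet_small: bool, rightsizing_or_config_win: bool,
                  streams_in_a_day: bool, mixed_version_too_risky: bool,
                  need_atomic_traffic_rollback: bool) -> bool:
    affordable = fleet_small            # 2x fleet cost is manageable
    fast_enough = streams_in_a_day      # streaming won't take weeks
    decisive_win = (rightsizing_or_config_win
                    or mixed_version_too_risky
                    or need_atomic_traffic_rollback)
    return affordable and fast_enough and decisive_win

# A small cluster that wants fresh, right-sized volumes:
print(prefer_new_dc(fleet_small=True, rightsizing_or_config_win=True,
                    streams_in_a_day=True, mixed_version_too_risky=False,
                    need_atomic_traffic_rollback=False))  # -> True
```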
Contrast with blue/green¶
New-DC upgrade is the database-tier blue/green — blue/green deployment applied to the datastore. The comparison table on rolling upgrade applies transitively.
Seen in¶
- sources/2026-04-07-yelp-zero-downtime-cassandra-4x-upgrade — canonical wiki Seen-in. Yelp's explicit reasoning for choosing in-place over new-DC for a > 1,000-node Cassandra fleet upgrade from 3.11 to 4.1.
Related¶
- concepts/rolling-upgrade — the in-place upgrade idiom.
- concepts/mixed-version-cluster — the cluster state in-place traverses.
- concepts/blue-green-deployment — the architectural analogue of new-DC upgrade at the datastore tier.
- concepts/cross-region-bandwidth-cost — the stream-the-data cost axis that makes new-DC expensive at scale.
- systems/apache-cassandra — canonical datastore for this trade-off.