Skip to content

CONCEPT Cited by 1 source

LLM provider commitment lock-in

Definition

LLM provider commitment lock-in is the model-upgrade friction introduced by multi-month dedicated-capacity contracts on managed LLM serving substrates (Amazon Bedrock Provisioned Throughput is the canonical example). Customers who reserve capacity for a model version become economically committed to that version through the contract term; superior models released during the term sit unused until the existing contract expires.

The wiki canonical framing comes from Slack's 2026-05-28 multi-cloud retrospective (Source: sources/2026-05-28-slack-slack-ai-the-path-to-multi-cloud):

"The Commitment Lock-in: Provisioned Throughput often required commitments of one to six months. In the fast-moving world of LLMs, where a state-of-the-art model can be superseded in weeks, these commitments effectively slowed down our ability to upgrade. Even when a superior model was released, we often chose to wait for our existing commitments to expire before migrating."

Why it matters

Three load-bearing properties combine into the lock-in:

  1. Model upgrade pace exceeds commitment term. Slack's verbatim "a state-of-the-art model can be superseded in weeks" means the optimal model at the start of a 6-month contract is rarely still the optimal model at the end.
  2. Sunk-cost economic bias. Even when a clearly-better model exists, the contractually-paid-for capacity makes migration look expensive — "we often chose to wait for our existing commitments to expire before migrating."
  3. Capacity is per-model, not pooled. PT contracts on Bedrock are bound to a specific model SKU. Switching models doesn't re-use the existing PT capacity automatically.
Concept Relationship
concepts/llm-model-feature-lag Feature lag delays availability on the secondary substrate. Commitment lock-in delays upgrades on the primary substrate. Different time-axes.
concepts/llm-over-provisioning-cycle Sibling PT failure mode: over-provisioning is the off-peak cost waste; commitment lock-in is the upgrade-cadence cost.
concepts/vendor-lock-in (general) LLM-serving-specialised case of the broader vendor-lock-in concept; specifically the sub-pattern where contract terms slow internal-product evolution.
concepts/provisioned-throughput-vs-on-demand-llm The PT/OD axis where commitment lock-in is the PT-side cost.
concepts/multi-cloud-llm-serving The cross-cloud architectural resolution that breaks the lock-in path.

Why On-Demand resolves it

OD pricing is per-token / per-request with no commitment. Migration to a superior model is a routing-layer change — zero contract reconciliation, zero stranded capacity, zero sunk-cost bias. Slack's Phase 3 verbatim:

"more importantly, it removed the technical bottleneck we faced in Phase 2: because we were no longer locked into multi-month commitments, we regained the freedom to migrate features to different models. This meant that as soon as a more performant model dropped and passed our internal quality and metrics bars, we could pivot our infrastructure to support it within a day, rather than waiting months for a contract to expire."

The disclosed migration-cadence improvement: months → 1 day for a model swap when no PT commitment is in the way.

Trade-offs

OD breaks the lock-in but introduces its own failure modes (see concepts/provisioned-throughput-vs-on-demand-llm):

  • Service-level variability (shared-resource model).
  • Regional capacity orchestration dependency on the provider.
  • Concentration risk — heavy OD usage on one provider creates a different reliability problem.

The hybrid resolution (patterns/provisioned-throughput-with-on-demand-spillover) balances commitment-lock-in mitigation (OD for bursty workloads that change models often) against the latency consistency needs (PT for stable workloads that don't change models often).

Quantitative signature (disclosed)

Property Value
PT contract term (Bedrock) 1–6 months
Model state-of-the-art turnover "weeks"
Migration-cadence with PT months (wait for contract expiry)
Migration-cadence without PT 1 day (per Slack Phase 3)

Seen in

Last updated · 542 distilled / 1,571 read