CONCEPT

Cost tracking per team

Definition

Cost tracking per team is the platform-engineering discipline of attributing every inference call (and its dollar cost) back to the consuming internal team, enabling chargeback accounting, budget allocation, capacity planning, and runaway-cost detection. For LLM platforms specifically, it's typically implemented at the AI-Gateway layer — every LLM call passes through a single proxy that logs team + job + model + token count + cost.
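A minimal sketch of what that gateway-layer logging might look like, assuming hypothetical per-1K-token prices and field names (the real Gateway's schema and price table are not described in the source):

```python
import time
from dataclasses import dataclass, asdict

# Hypothetical per-model pricing (USD per 1K tokens); a real gateway
# would load this from the provider's published price list.
PRICE_PER_1K_TOKENS = {"gpt-4o": 0.005, "gpt-4o-mini": 0.00015}

@dataclass
class UsageRecord:
    """One attributed inference call: team + job + model + tokens + cost."""
    team: str
    job: str
    model: str
    tokens: int
    cost_usd: float
    timestamp: float

def record_call(team: str, job: str, model: str, tokens: int) -> UsageRecord:
    """Attribute one call's token usage (and dollar cost) to the consuming team."""
    cost = tokens / 1000 * PRICE_PER_1K_TOKENS[model]
    return UsageRecord(team, job, model, tokens, round(cost, 6), time.time())

rec = record_call("catalog", "extraction-pipeline", "gpt-4o", 12_000)
print(asdict(rec)["cost_usd"])  # 0.06
```

Because every call passes through the proxy, one function like this is the only place cost attribution needs to live.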

Why per-team, not per-call

Per-call metrics are necessary but not sufficient. Per-team attribution is what lets a platform org answer questions like:

  • Which team is driving our 10× month-over-month cost growth?
  • Is Catalog's new extraction pipeline eating the quarterly LLM budget?
  • What's the cost-per-customer-transaction for each LLM-powered feature?
  • Are we giving each team a rate limit commensurate with their budget?

Without per-team tags, the answer is always "let me grep through logs for three days." With per-team tags, it's a dashboard.
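The "dashboard" half of that claim is just a group-by over the gateway's usage log. A sketch, with made-up log rows (team, model, cost) standing in for whatever schema the gateway actually emits:

```python
from collections import defaultdict

# Hypothetical per-call usage rows as a gateway might emit them:
# (team, model, cost_usd)
log = [
    ("catalog", "gpt-4o", 0.50),
    ("search", "gpt-4o-mini", 0.25),
    ("catalog", "gpt-4o", 0.25),
]

def cost_by_team(rows):
    """Roll per-call cost records up into a per-team total."""
    totals = defaultdict(float)
    for team, _model, cost in rows:
        totals[team] += cost
    return dict(totals)

print(cost_by_team(log))  # {'catalog': 0.75, 'search': 0.25}
```

The same roll-up keyed on (team, month) is what answers the "who drove the 10× growth" question.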

Canonical wiki instance

Instacart's Cost Tracker integrates with the AI Gateway; every LLM call from any internal platform (including Maple's batch jobs) is logged with per-team attribution. Maple's own feature list names "tracks detailed cost usage for each team" as one of its outputs — the tracking surface is exposed upward from the Gateway-layer facility. (Source: sources/2025-08-27-instacart-simplifying-large-scale-llm-processing-with-maple)

Sibling instances on the wiki:

  • systems/cloudflare-ai-gateway — external equivalent; same per-team / per-application attribution.
  • systems/unity-ai-gateway (Databricks) — three-pillar framing (audit + single-bill + Lakehouse observability) explicitly names cost tracking alongside audit as the governance pillars.

Place in the governance stack

Per-team cost tracking is one leg of concepts/centralized-ai-governance alongside audit, policy enforcement, and rate limiting. All four concerns tend to co-locate at the AI Gateway layer because they all require every LLM call to pass through one point — the gateway is the only place in a system topology where that's true by construction.

Generalisation

The pattern is reusable anywhere teams share a metered external resource:

  • Cloud provider costs — AWS Cost & Usage Reports with team tagging.
  • Database queries — per-application query accounting on a shared query gateway.
  • Third-party API calls — per-team attribution on a shared API proxy.

The operational preconditions:

  1. Single choke point — all calls pass through one tier.
  2. Team identity in requests — the caller's team is either authenticated or stamped onto the request by the gateway itself.
  3. Attribution storage — durable record with enough cardinality to query by team × time × model × job.
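The three preconditions can be sketched together: a gateway that rejects calls carrying no team identity, and a durable store queryable along the team × time × model × job dimensions. The header names and schema below are assumptions for illustration, not the source's actual design:

```python
import sqlite3
import time

def init_store(conn: sqlite3.Connection) -> None:
    """Attribution storage: one row per call, with enough cardinality
    to query by team x time x model x job."""
    conn.execute("""CREATE TABLE IF NOT EXISTS llm_usage (
        ts REAL, team TEXT, job TEXT, model TEXT,
        tokens INTEGER, cost_usd REAL)""")

def attribute(conn, headers: dict, model: str, tokens: int, cost_usd: float):
    """The single choke point: every call lands here, and calls without
    a team identity (hypothetical X-Team header) are refused."""
    team = headers.get("X-Team")
    if not team:
        raise PermissionError("gateway rejects calls without a team identity")
    job = headers.get("X-Job", "unknown")
    conn.execute("INSERT INTO llm_usage VALUES (?,?,?,?,?,?)",
                 (time.time(), team, job, model, tokens, cost_usd))

conn = sqlite3.connect(":memory:")
init_store(conn)
attribute(conn, {"X-Team": "catalog", "X-Job": "extraction"}, "gpt-4o", 9000, 0.045)
row = conn.execute(
    "SELECT team, SUM(cost_usd) FROM llm_usage GROUP BY team").fetchone()
print(row)  # ('catalog', 0.045)
```

Rejecting unattributed calls at the choke point is what makes the stored data trustworthy: there is no path by which cost can accrue without a team on the record.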
