CONCEPT
Cost tracking per team¶
Definition¶
Cost tracking per team is the platform-engineering discipline of attributing every inference call (and its dollar cost) back to the consuming internal team, enabling chargeback accounting, budget allocation, capacity planning, and runaway-cost detection. For LLM platforms specifically, it is typically implemented at the AI-Gateway layer — every LLM call passes through a single proxy that logs team, job, model, token count, and cost.
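The per-call record the gateway emits can be sketched as follows. This is an illustrative shape only — the pricing table, field names, and `record_call` helper are assumptions, not Instacart's actual schema, and real per-token rates vary by provider:

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical pricing table (USD per 1K tokens); real rates differ per provider.
PRICE_PER_1K = {"gpt-4o": {"input": 0.0025, "output": 0.01}}

@dataclass
class CallRecord:
    """One gateway log line: team + job + model + token counts + cost."""
    team: str
    job: str
    model: str
    input_tokens: int
    output_tokens: int
    timestamp: str
    cost_usd: float

def record_call(team, job, model, input_tokens, output_tokens):
    """Price one LLM call and return a record ready to write to the cost store."""
    p = PRICE_PER_1K[model]
    cost = input_tokens / 1000 * p["input"] + output_tokens / 1000 * p["output"]
    return CallRecord(team, job, model, input_tokens, output_tokens,
                      datetime.now(timezone.utc).isoformat(), round(cost, 6))

rec = record_call("catalog", "extraction", "gpt-4o", 1200, 300)
print(rec.cost_usd)  # 1200/1000 * 0.0025 + 300/1000 * 0.01 = 0.006
```

Because the proxy is the only path to the provider, every call gets a record like this by construction — no team can opt out of attribution.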
Why per-team, not per-call¶
Per-call metrics are necessary but not sufficient. Per-team attribution is what lets a platform org answer questions like:
- Which team is driving our 10× month-over-month cost growth?
- Is Catalog's new extraction pipeline eating the quarterly LLM budget?
- What's the cost-per-customer-transaction for each LLM-powered feature?
- Are we giving each team a rate limit commensurate with their budget?
Without per-team tags, the answer is always "let me grep through logs for three days." With per-team tags, it's a dashboard.
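The "it's a dashboard" claim reduces to a trivial group-by once every record carries a team tag. A minimal sketch, with made-up log records standing in for the gateway's cost store:

```python
from collections import defaultdict

# Hypothetical gateway log records; fields mirror what the gateway attaches.
calls = [
    {"team": "catalog", "model": "gpt-4o",      "cost_usd": 0.42},
    {"team": "catalog", "model": "gpt-4o-mini", "cost_usd": 0.03},
    {"team": "ads",     "model": "gpt-4o",      "cost_usd": 0.11},
]

def cost_by_team(records):
    """Roll per-call costs up into a per-team total -- the dashboard view."""
    totals = defaultdict(float)
    for r in records:
        totals[r["team"]] += r["cost_usd"]
    return dict(totals)

print(cost_by_team(calls))  # catalog ~0.45, ads ~0.11
```

Without the `team` field on each record, the same answer requires joining logs back to callers after the fact — the three-day grep.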
Canonical wiki instance¶
Instacart's Cost Tracker integrates with the AI Gateway; every LLM call from any internal platform (including Maple's batch jobs) is logged with per-team attribution. Maple's own feature list names "tracks detailed cost usage for each team" as one of its outputs — the tracking surface is exposed upward from the Gateway-layer facility. (Source: sources/2025-08-27-instacart-simplifying-large-scale-llm-processing-with-maple)
Sibling instances on the wiki:
- systems/cloudflare-ai-gateway — external equivalent; same per-team / per-application attribution.
- systems/unity-ai-gateway (Databricks) — three-pillar framing (audit + single-bill + Lakehouse observability) explicitly names cost tracking alongside audit as the governance pillars.
Place in the governance stack¶
Per-team cost tracking is one leg of concepts/centralized-ai-governance alongside audit, policy enforcement, and rate limiting. All four concerns tend to co-locate at the AI Gateway layer because they all require every LLM call to pass through one point — the gateway is the only place in a system topology where that's true by construction.
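The co-location argument can be made concrete: because every call flows through one function, the four governance concerns become four checks in one pass. An illustrative sketch only — the names, rate limit, and pricing are assumptions, not Instacart's implementation, and the provider call is stubbed out:

```python
import time
from collections import defaultdict

ALLOWED_MODELS = {"gpt-4o"}     # policy: approved models only (illustrative)
RATE_LIMIT = 100                # calls per window, per team (illustrative)
spend = defaultdict(float)      # cost tracking: team -> dollars
window_calls = defaultdict(int) # rate limiting: team -> calls this window
audit_log = []                  # audit trail: every call, accepted or not

def gateway_call(team: str, model: str, prompt: str) -> str:
    """One choke point, four governance checks -- why they all live here."""
    if model not in ALLOWED_MODELS:               # 1. policy enforcement
        raise PermissionError(f"model {model} not approved")
    if window_calls[team] >= RATE_LIMIT:          # 2. rate limiting
        raise RuntimeError(f"team {team} over rate limit")
    window_calls[team] += 1
    cost = 0.001 * len(prompt)                    # placeholder pricing
    spend[team] += cost                           # 3. per-team cost tracking
    audit_log.append((time.time(), team, model, cost))  # 4. audit
    return f"<stub response for {team}>"          # no real provider call here

gateway_call("catalog", "gpt-4o", "extract attributes from this listing")
```

Move any one of these checks off the gateway and it needs its own mechanism for seeing every call — which is exactly the property the gateway already has for free.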
Generalisation¶
The pattern is reusable anywhere teams share a metered external resource:
- Cloud provider costs — AWS Cost & Usage Reports with team tagging.
- Database queries — per-application query accounting on a shared query gateway.
- Third-party API calls — per-team attribution on a shared API proxy.
The operational preconditions:
- Single choke point — all calls pass through one tier.
- Team identity in requests — the gateway either authenticates the caller or stamps a verified team identifier onto every request.
- Attribution storage — durable record with enough cardinality to query by team × time × model × job.
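The third precondition — enough cardinality to query by team × time × model × job — can be sketched with an in-memory SQLite table. The schema and sample rows are illustrative assumptions, not any real system's layout:

```python
import sqlite3

# Illustrative attribution store: one row per call, sliceable on every axis.
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE llm_calls (
    ts TEXT, team TEXT, job TEXT, model TEXT,
    tokens INTEGER, cost_usd REAL)""")
db.executemany(
    "INSERT INTO llm_calls VALUES (?, ?, ?, ?, ?, ?)",
    [("2025-08-01T10:00Z", "catalog", "extraction", "gpt-4o",      1500, 0.02),
     ("2025-08-01T10:05Z", "catalog", "extraction", "gpt-4o",       900, 0.01),
     ("2025-08-01T11:00Z", "ads",     "ranking",    "gpt-4o-mini",  400, 0.001)])

# The dashboard query: spend per team x model, biggest spender first.
rows = list(db.execute("""
    SELECT team, model, ROUND(SUM(cost_usd), 4) AS total
    FROM llm_calls GROUP BY team, model ORDER BY total DESC"""))
for row in rows:
    print(row)
```

Collapse any dimension out of the stored record (say, keep only a per-team daily total) and the corresponding question — "which job is driving catalog's spend?" — becomes unanswerable after the fact.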
Seen in¶
- sources/2025-08-27-instacart-simplifying-large-scale-llm-processing-with-maple — canonical wiki disclosure. Instacart's Cost Tracker integrated into the AI Gateway; per-team cost attribution exposed upward as a Maple feature.
Related¶
- patterns/chargeback-cost-attribution — the general pattern.
- patterns/ai-gateway-provider-abstraction — the architectural tier that hosts the tracking.
- concepts/centralized-ai-governance — the umbrella framing.
- systems/instacart-cost-tracker — Instacart's specific system.
- systems/instacart-ai-gateway — the tier it lives in.
- systems/maple-instacart — upstream consumer.
- systems/cloudflare-ai-gateway — external sibling.
- systems/unity-ai-gateway — Databricks sibling.