# Expedia generative-AI proxy
## Definition
Expedia's internal generative-AI proxy is a company-level LLM choke point: a central service that brokers LLM invocations for internal consumers, handles authentication + authorization, and exposes multiple models that the platform team "constantly evaluate[s] for quality of results, cost, and performance implications" (Source: sources/2026-04-28-expedia-expedias-service-telemetry-analyzer).
First seen on the wiki as the model-layer substrate for STAR.
## Load-bearing properties
- Authn / authz. Consumer services (like STAR) do not hold third-party LLM credentials; the proxy enforces Expedia identity and access control on every call.
- Model abstraction. The proxy exposes multiple models; callers choose among them or accept proxy defaults. Expedia is "also exploring using different models for the various tasks in STAR" — the proxy makes per-step model selection a configuration change, not an integration rewrite.
- Rate limiting. The proxy itself rate-limits; STAR accommodates this with "common resiliency patterns ... asynchronous operations and batch processing".
- Centralised evaluation surface. Quality, cost, and performance are measured at the proxy tier rather than per application, which makes the proxy the natural place to A/B test new models before rolling them out.
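A minimal sketch of the caller-side shape these properties imply (every name below is hypothetical, not from the Expedia post): each workflow step names a model in configuration, and the caller hands the choice to the proxy, so swapping models is a config edit rather than an integration rewrite.

```python
# Hypothetical sketch -- none of these names come from the Expedia post.
# It shows why a central proxy makes per-step model selection a
# configuration change rather than an integration rewrite.
from dataclasses import dataclass

# Per-step routing lives in configuration, not in caller code; swapping
# the model behind a step touches only this table.
STEP_MODELS = {
    "summarize_telemetry": "model-a",
    "hypothesize_root_cause": "model-b",
    "draft_rca_report": "model-a",
}

@dataclass
class ProxyResponse:
    model: str
    text: str

class LLMProxyClient:
    """Stand-in for the internal proxy: one entry point; proxy-side
    authn/authz and rate limiting are implied, not shown."""

    def complete(self, step: str, prompt: str) -> ProxyResponse:
        model = STEP_MODELS.get(step, "default-model")  # proxy default
        # A real client would make an authenticated HTTP call here.
        return ProxyResponse(model=model, text=f"[{model}] {prompt}")

client = LLMProxyClient()
resp = client.complete("hypothesize_root_cause", "CPU spike on checkout-svc")
```

Because consumers never hold third-party credentials, the proxy is also the single place where identity checks and model-eligibility rules can be enforced.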
## Consumers
- STAR — Expedia's Service Telemetry Analyzer. Uses the proxy for every prompt in its multi-step RCA workflow.
## Stub page
This page is a stub built from STAR's description. Architecture details (deployment shape, queueing / batching model, pricing attribution, model-eligibility governance, SSO integration) are not disclosed in the STAR post.
## Seen in
- sources/2026-04-28-expedia-expedias-service-telemetry-analyzer — STAR's named upstream; handles authn/authz, multi-model access, and is rate-limited.
## Related
- systems/expedia-star — canonical wiki consumer.
- concepts/token-heavy-system — the class of workload STAR fits in; the proxy is the natural vantage point for measuring token spend.
- concepts/prompt-chaining — STAR's orchestration technique, issued as N proxy calls per workflow.
- systems/langfuse — evaluation + tracing substrate upstream of the proxy.
- companies/expedia