# Expedia generative-AI proxy
## Definition
Expedia's internal generative-AI proxy is a company-level LLM choke point: a central service that brokers LLM invocations for internal consumers, handles authentication + authorization, and exposes multiple models that the platform team "constantly evaluate[s] for quality of results, cost, and performance implications" (Source: sources/2026-04-28-expedia-expedias-service-telemetry-analyzer).
First seen on the wiki as the model-layer substrate for STAR.
## Load-bearing properties
- Authn / authz. Consumer services (like STAR) do not hold third-party LLM credentials; the proxy enforces Expedia identity and access control on every call.
- Model abstraction. The proxy exposes multiple models; callers choose among them or accept proxy defaults. Expedia is "also exploring using different models for the various tasks in STAR" — the proxy makes per-step model selection a configuration change, not an integration rewrite.
- Rate limiting. The proxy itself rate-limits; STAR accommodates this with "common resiliency patterns ... asynchronous operations and batch processing".
- Centralised evaluation surface. Quality, cost, and performance are measured at the proxy tier rather than per application, which makes the proxy the natural place to A/B test new models before rolling them out.
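A minimal sketch of the caller-side shape these properties imply (every name below is hypothetical, not from the Expedia post): each workflow step names a model in configuration, and the caller hands the choice to the proxy, so swapping models is a config edit rather than an integration rewrite.

```python
# Hypothetical sketch -- none of these names come from the Expedia post.
# It shows why a central proxy makes per-step model selection a
# configuration change rather than an integration rewrite.
from dataclasses import dataclass

# Per-step routing lives in configuration, not in caller code; swapping
# the model behind a step touches only this table.
STEP_MODELS = {
    "summarize_telemetry": "model-a",
    "hypothesize_root_cause": "model-b",
    "draft_rca_report": "model-a",
}

@dataclass
class ProxyResponse:
    model: str
    text: str

class LLMProxyClient:
    """Stand-in for the internal proxy: one entry point; proxy-side
    authn/authz and rate limiting are implied, not shown."""

    def complete(self, step: str, prompt: str) -> ProxyResponse:
        model = STEP_MODELS.get(step, "default-model")  # proxy default
        # A real client would make an authenticated HTTP call here.
        return ProxyResponse(model=model, text=f"[{model}] {prompt}")

client = LLMProxyClient()
resp = client.complete("hypothesize_root_cause", "CPU spike on checkout-svc")
```

Because consumers never hold third-party credentials, the proxy is also the single place where identity checks and model-eligibility rules can be enforced.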
## Consumers
- STAR — Expedia's Service Telemetry Analyzer. Uses the proxy for every prompt in its multi-step RCA workflow.
## Stub page
This page is a stub built from STAR's description. Architecture details (deployment shape, queueing / batching model, pricing attribution, model-eligibility governance, SSO integration) are not disclosed in the STAR post.
## Seen in
- sources/2026-04-28-expedia-expedias-service-telemetry-analyzer — STAR's named upstream; handles authn/authz, multi-model access, and is rate-limited.
## Related
- systems/expedia-star — canonical wiki consumer.
- concepts/token-heavy-system — the class of workload STAR fits in; the proxy is the natural vantage point for measuring token spend.
- concepts/prompt-chaining — STAR's orchestration technique, issued as N proxy calls per workflow.
- systems/langfuse — evaluation + tracing substrate upstream of the proxy.
- companies/expedia