
Expedia generative-AI proxy

Definition

Expedia's internal generative-AI proxy is a company-level LLM choke point: a central service that brokers LLM invocations for internal consumers, handles authentication + authorization, and exposes multiple models that the platform team "constantly evaluate[s] for quality of results, cost, and performance implications" (Source: sources/2026-04-28-expedia-expedias-service-telemetry-analyzer).

First seen on the wiki via STAR: the proxy is the model-layer substrate for STAR.

Load-bearing properties

  • Authn / authz. Consumer services (like STAR) do not hold third-party LLM credentials; the proxy enforces Expedia identity and access control on every call.
  • Model abstraction. The proxy exposes multiple models; callers choose among them or accept proxy defaults. Expedia is "also exploring using different models for the various tasks in STAR" — the proxy makes per-step model selection a configuration change, not an integration rewrite.
  • Rate limiting. The proxy itself rate-limits; STAR accommodates this with "common resiliency patterns ... asynchronous operations and batch processing".
  • Centralised evaluation surface. Quality, cost, and performance are measured at the proxy tier, not per application — the proxy is the natural place to run A/B tests of new models before rolling them out.
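Two of the properties above can be sketched together: per-step model selection as pure configuration, and backoff-based retry against the proxy's rate limiting. Everything below is hypothetical — the step names, the exception type, and the backoff parameters are illustrative, not Expedia's implementation; the post says only that STAR uses "common resiliency patterns ... asynchronous operations and batch processing".

```python
import random
import time

class RateLimitedError(Exception):
    """Raised when the proxy rejects a call for rate limiting (hypothetical)."""

# Per-step model routing for a multi-step workflow like STAR's:
# swapping the model for one step is a config edit, not an integration rewrite.
# Step and model names are invented for illustration.
STEP_MODELS = {
    "summarize_alerts": "small-fast-model",
    "correlate_signals": "large-reasoning-model",
    "draft_rca": "large-reasoning-model",
}

def call_with_backoff(invoke, max_retries=5, sleep=time.sleep):
    """Retry a proxy call on rate limiting with capped exponential
    backoff plus jitter -- one common resiliency pattern."""
    for attempt in range(max_retries):
        try:
            return invoke()
        except RateLimitedError:
            sleep(min(2 ** attempt, 30) + random.random())
    raise RuntimeError("proxy still rate-limiting after retries")
```

A caller would pick `STEP_MODELS[step]` when building each request, so evaluating a new model for one step touches only the routing table.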

Consumers

  • STAR — Expedia's Service Telemetry Analyzer. Uses the proxy for every prompt in its multi-step RCA workflow.

Stub page

This page is a stub built from STAR's description. Architecture details (deployment shape, queueing / batching model, pricing attribution, model-eligibility governance, SSO integration) are not disclosed in the STAR post.
