
Lyft Feature Store¶

Lyft's Feature Store is the shared ML substrate every model and consuming service at Lyft retrieves features from. In Rohan Varshney's (2026-01-06) characterization, it is a "platform of platforms" — not a single pipeline but three complementary ingest/serve lanes (batch, streaming, direct-CRUD) composed on top of one unified online-serving layer, dsfeatures.

What it is concretely¶

Batch lane: a Spark SQL query plus a JSON config. A Python cron service auto-generates one Astronomer-hosted Airflow DAG per config; the DAG executes the query, writes to both the offline (Hive) and online (dsfeatures) paths, runs data-quality checks, and tags Amundsen for discoverability. Typical cadence: daily.
Streaming lane: customer Flink applications read events from Kafka (sometimes Kinesis), transform them, and emit feature payloads to the central spfeaturesingest Flink app, which owns (de)serialization and all interaction with the dsfeatures WRITE API.
Direct-CRUD lane: the go-lyft-features (Go) and lyft-dsp-features (Python) SDKs expose full CRUD on dsfeatures, so internal DAGs and customer services can read and write features ad hoc without going through the ingestion pipelines.
Online serving layer: dsfeatures, an optimized wrapper over DynamoDB (backing store), ValKey (write-through LRU cache), and OpenSearch (embeddings).
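
The write-through arrangement in the online serving layer can be sketched in a few lines. This is a minimal illustration under stated assumptions: a plain dict stands in for the DynamoDB backing store, an OrderedDict stands in for the ValKey LRU cache, and the class and method names (FeatureServingSketch, put, get) are hypothetical, not the dsfeatures API.

```python
from collections import OrderedDict
from typing import Any, Dict, Hashable, Optional


class FeatureServingSketch:
    """Write-through LRU cache in front of a backing store (illustrative only)."""

    def __init__(self, cache_capacity: int = 2) -> None:
        # Stand-in for the durable backing store (assumption: a plain dict).
        self.backing: Dict[Hashable, Any] = {}
        # Stand-in for the LRU cache; insertion order tracks recency.
        self.cache: "OrderedDict[Hashable, Any]" = OrderedDict()
        self.cache_capacity = cache_capacity
        self.backing_reads = 0  # number of reads that missed the cache

    def _cache_put(self, key: Hashable, value: Any) -> None:
        self.cache[key] = value
        self.cache.move_to_end(key)  # mark as most recently used
        while len(self.cache) > self.cache_capacity:
            self.cache.popitem(last=False)  # evict least recently used

    def put(self, key: Hashable, value: Any) -> None:
        # Write-through: the backing store and the cache are updated together.
        self.backing[key] = value
        self._cache_put(key, value)

    def get(self, key: Hashable) -> Optional[Any]:
        if key in self.cache:
            self.cache.move_to_end(key)
            return self.cache[key]
        # Cache miss: fall back to the backing store and repopulate the cache.
        self.backing_reads += 1
        value = self.backing.get(key)
        if value is not None:
            self._cache_put(key, value)
        return value


store = FeatureServingSketch(cache_capacity=2)
store.put(("rider", 1), {"trips_7d": 4})
store.put(("rider", 2), {"trips_7d": 9})
store.put(("rider", 3), {"trips_7d": 1})  # evicts ("rider", 1) from the cache
print(store.get(("rider", 3)))  # served from the cache
print(store.get(("rider", 1)))  # cache miss, served from the backing store
print(store.backing_reads)
```

Because every write lands in both places, a read that hits the cache never observes a value older than the last write made through this wrapper, which is one simple way a wrapper like this can keep cached reads in step with the backing store.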

Design invariants¶

Uniform metadata + strongly-consistent reads across all ingestion paths. "Regardless of the ingestion method (batch, streaming, or on-demand), the Feature Store maintains uniform metadata and strongly consistent reads." The streaming-lane choke-point (spfeaturesingest) is the enforcement mechanism: you can't have uniform metadata if every producer writes its own way.
Feature definition = Spark SQL query + JSON config. No DSL. The platform absorbs the boilerplate (DAG generation, data-quality checks, offline/online double-write, metadata tagging); customers own only the feature-specific SQL and metadata.
Metadata-driven governance. Each JSON config carries ownership, urgency tier, carryover/rollup logic, explicit naming and data typing, versioning semantics, and lineage. The versioning rule is stated explicitly: "If the SQL or expected feature behavior undergoes business logic changes, a version bump is expected."
Amundsen-first discoverability. Generated DAGs automatically tag feature metadata in Amundsen, so finding an existing feature is a search, not a tribal-knowledge exercise.
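
To make the "SQL + JSON config" contract concrete, the sketch below parses a hypothetical config, derives the ordered steps a generated DAG would run, and flags a SQL edit that arrives without a version bump. Every field name (name, owner, tier, dtype, version, sql) and both helper functions are assumptions for illustration; the source does not publish the real schema. The fingerprint comparison is only a rough proxy for the stated rule, since it treats any textual SQL change as a business-logic change.

```python
import hashlib
import json
from typing import Any, Dict, List

# Hypothetical feature config; the field names are illustrative assumptions.
CONFIG_V1: Dict[str, Any] = json.loads("""
{
  "name": "rider_trips_7d",
  "owner": "team-example",
  "tier": "high",
  "dtype": "int",
  "version": 1,
  "sql": "SELECT rider_id, COUNT(*) AS trips_7d FROM trips GROUP BY rider_id"
}
""")


def plan_steps(config: Dict[str, Any]) -> List[str]:
    """Ordered steps a generated batch DAG would run for one config."""
    feature = f'{config["name"]}_v{config["version"]}'
    return [
        f"run_spark_sql:{feature}",
        f"write_offline:{feature}",
        f"write_online:{feature}",
        f"data_quality_checks:{feature}",
        f"tag_metadata:{feature}",
    ]


def sql_fingerprint(sql: str) -> str:
    """Hash the SQL after collapsing whitespace, so reformatting is ignored."""
    normalized = " ".join(sql.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def missing_version_bump(old: Dict[str, Any], new: Dict[str, Any]) -> bool:
    """True when the SQL text changed but the version did not increase."""
    sql_changed = sql_fingerprint(old["sql"]) != sql_fingerprint(new["sql"])
    return sql_changed and new["version"] <= old["version"]


# Same config with edited SQL and an unchanged version number.
edited = dict(CONFIG_V1, sql=CONFIG_V1["sql"].replace("trips_7d", "trips_last_week"))
# The edited config after the owner bumps the version.
bumped = dict(edited, version=2)

print(plan_steps(CONFIG_V1))
print(missing_version_bump(CONFIG_V1, edited))
print(missing_version_bump(CONFIG_V1, bumped))
```

A check like this could run when a config is submitted, before any DAG is generated. Behavior changes that leave the SQL text untouched would still need human judgment.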

Related patterns on this wiki¶

Feature store — the concept; Lyft is the second major instance on the wiki alongside Dropbox Dash.
Hybrid batch + streaming ingestion — canonical instance: the three-lanes shape maps one-to-one to Dropbox Dash's batch + streaming + direct-write lanes.
Config-driven DAG generation — canonical instance.
patterns/wrapper-over-heterogeneous-stores-as-serving-layer — canonical instance via dsfeatures.
patterns/batch-plus-streaming-plus-ondemand-feature-serving — the three-lanes-with-unified-online-surface shape.

Relationship to LyftLearn¶

The Feature Store sits adjacent to LyftLearn / systems/lyftlearn-serving / systems/lyftlearn-compute — Lyft's ML training + serving platform. The Feature Store is the data substrate; LyftLearn is the compute substrate. Models train on feature-store output (Hive offline tables) and serve against dsfeatures online reads.

Seen in¶

sources/2026-01-06-lyft-feature-store-architecture-optimization-and-evolution — canonical wiki introduction; Rohan Varshney's Lyft Engineering post describing the three-lane architecture, dsfeatures online layer, governance model, and Amundsen integration.
