CONCEPT Cited by 1 source

Feature discoverability¶

Definition¶

Feature discoverability is the property of a feature store (or adjacent ML data platform) that engineers and modelers can find an existing feature before creating a new one — usually through a search-first catalog UI indexed on feature name, owner, tag, lineage, data type, and natural-language description.

Without it, the default failure mode of a large ML org is duplicate feature definitions across teams, each with slightly different semantics and drift. The cost is triple: wasted engineering work, duplicated compute pipelines, and model-quality risk when two teams build rankers against features they believed were the same but weren't.

What makes discoverability work¶

Rich feature metadata — ownership, urgency tier, transformation logic, value semantics, data type, versioning, lineage. The feature definition must declare what it is, not just compute a number.
Automatic catalog population — the pipeline that produces the feature should also publish the metadata to the catalog. Human-maintained catalogs rot.
Search UX — free-text search over feature names, descriptions, and tags; plus structured filters over owner, tier, and lineage.
Accessible to both engineering personas — the catalog should be findable for ML modelers (who design features) and software engineers (who integrate them into services).

Seen in¶

sources/2026-01-06-lyft-feature-store-architecture-optimization-and-evolution — Lyft's Feature Store outsources discoverability to Amundsen (Lyft's own open-source data-discovery platform). The auto-generated Airflow DAGs tag feature metadata in Amundsen as a pipeline side-effect. Rohan Varshney frames it plainly: "Once features are generating data, discoverability is the next crucial step. [...] This integration allows users to easily search for existing features, a critical step in preventing the duplication of efforts and reducing wasted engineering work."

concepts/feature-store
concepts/feature-freshness
systems/amundsen
systems/lyft-feature-store
patterns/config-driven-dag-generation — the mechanism that makes automatic catalog population work at Lyft: metadata tagging happens inside the generated DAG itself, not as a separate step.

Feature discoverability¶

Definition¶

What makes discoverability work¶

Seen in¶

Related¶