SYSTEM Cited by 1 source
Amundsen
Amundsen is an open-source data discovery and metadata platform, originally built by the Lyft Data Platform team in 2018 and contributed to the Linux Foundation's LF AI & Data Foundation in 2020. Architecturally, it is a search-first catalog:
- Metadata is ingested from source systems (data warehouses, streaming platforms, ML feature stores, notebooks, dashboards) into Amundsen's graph backend.
- A frontend UI lets engineers search for tables, features, dashboards, and users by name, tag, owner, or free text.
- Backing stores are pluggable: commonly Neo4j or JanusGraph for the metadata graph, and Elasticsearch for the search index.
It is one of the best-known members of the "internal data discovery platform" category (alongside LinkedIn's DataHub, Google's Data Catalog, Apple's Apollo, and Airbnb's Dataportal).
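The ingest-then-search flow described above can be sketched with a toy in-memory model. This is only an illustration under stated assumptions: `Resource`, `ToyCatalog`, and every field name are hypothetical and are not Amundsen's API or schema. A plain dict stands in for the metadata graph and a token-to-keys map stands in for the search index.

```python
import re
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """One catalog entry. Fields are illustrative, not Amundsen's schema."""
    key: str              # unique id, e.g. "warehouse://rides.trips"
    kind: str             # "table", "feature", "dashboard", or "user"
    name: str
    description: str = ""
    owner: str = ""
    tags: tuple = ()


def tokenize(text):
    """Lowercase the text and split it on non-alphanumeric characters."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


class ToyCatalog:
    """Hypothetical stand-in for a graph store plus a search index."""

    def __init__(self):
        self.graph = {}                  # key -> Resource (metadata store)
        self.index = defaultdict(set)    # token -> set of keys (search index)

    def ingest(self, resource):
        # Drop stale index postings if this key was ingested before.
        if resource.key in self.graph:
            for keys in self.index.values():
                keys.discard(resource.key)
        self.graph[resource.key] = resource
        text = " ".join(
            [resource.name, resource.description, resource.owner, *resource.tags]
        )
        for token in tokenize(text):
            self.index[token].add(resource.key)

    def search(self, query, kind=None):
        """Return resources matching every query token, optionally by kind."""
        tokens = tokenize(query)
        if not tokens:
            return []
        keys = set.intersection(*[self.index.get(t, set()) for t in tokens])
        hits = [self.graph[k] for k in keys]
        if kind is not None:
            hits = [r for r in hits if r.kind == kind]
        return sorted(hits, key=lambda r: r.key)


catalog = ToyCatalog()
catalog.ingest(Resource("warehouse://rides.trips", "table", "rides.trips",
                        "One row per completed ride", "alice", ("core",)))
catalog.ingest(Resource("feature://rider_trip_count_7d", "feature",
                        "rider_trip_count_7d",
                        "Trips per rider over the last 7 days", "bob", ("rider",)))
print([r.key for r in catalog.search("rider 7d")])
```

Keeping the record store and the index as separate structures mirrors the pluggable-backend split: either one could be swapped out without changing the `ingest` and `search` call sites.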
Typical role in this wiki
Amundsen appears as the discoverability layer attached to data and feature platforms: the search front end that saves engineers from reinventing features, recomputing datasets, or emailing the original owner to ask what a column means.
Seen in
- sources/2026-01-06-lyft-feature-store-architecture-optimization-and-evolution: the Lyft Feature Store integrates with Amundsen as its feature discoverability layer. The auto-generated Airflow DAGs tag feature metadata in Amundsen as a pipeline side effect, so discoverability requires no separate ceremony. Users searching for "riders in city X last 7d" find existing features before writing new ones, which directly prevents duplicated feature-engineering effort across teams.
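The "metadata tagging as a pipeline side effect" pattern can be sketched as follows. This is a minimal illustration, not Lyft's implementation: plain Python functions stand in for Airflow tasks, and `FeatureSpec`, `InMemoryCatalog`, `build_pipeline`, and `run` are hypothetical names introduced here. The point is only that the tagging step is generated alongside the compute step, so a feature that runs is a feature that is registered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeatureSpec:
    """Hypothetical feature definition that a pipeline is generated from."""
    name: str
    description: str
    owner: str
    entity: str
    window: str


class InMemoryCatalog:
    """Stand-in for a catalog client; not Amundsen's API."""

    def __init__(self):
        self.entries = {}

    def upsert(self, key, metadata):
        self.entries[key] = dict(metadata)

    def search(self, term):
        """Return keys whose metadata values contain the term (case-insensitive)."""
        term = term.lower()
        return sorted(
            key for key, meta in self.entries.items()
            if any(term in str(value).lower() for value in meta.values())
        )


def build_pipeline(spec, compute, catalog):
    """Generate the steps for one feature: compute it, then tag its metadata."""

    def compute_step(ctx):
        ctx["rows"] = compute()

    def tag_step(ctx):
        # Side effect of the pipeline itself: no separate registration ceremony.
        catalog.upsert(f"feature://{spec.name}", {
            "name": spec.name,
            "description": spec.description,
            "owner": spec.owner,
            "entity": spec.entity,
            "window": spec.window,
            "row_count": len(ctx["rows"]),
        })

    return [compute_step, tag_step]


def run(steps):
    """Run steps in order; an exception stops the pipeline before tagging."""
    ctx = {}
    for step in steps:
        step(ctx)
    return ctx


spec = FeatureSpec("rider_trip_count_7d", "Trips per rider over the last 7 days",
                   "bob", "rider", "7d")
catalog = InMemoryCatalog()
run(build_pipeline(spec, lambda: [("r1", 3), ("r2", 5)], catalog))
print(catalog.search("rider"))
```

Because the tagging step runs only after the compute step succeeds, the catalog in this sketch never advertises a feature whose pipeline has not produced data.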