

LSIF (Language Server Index Format)

LSIF (Language Server Index Format) is an open format in the Language Server Protocol (LSP) ecosystem for storing precomputed code-navigation data (definitions, references, hovers) so that IDEs and code browsers don't have to re-analyse the source on every query.

This wiki treats LSIF as a stub — introduced here because Meta's 2024-12-19 Glean post cites it explicitly as the "well-established format… used by IDEs that caches information about code navigation" against which Glean's design is contrasted.

Origin

LSIF was developed by Microsoft as part of the Language Server Protocol family. It is designed as a serialisable, vendor-neutral representation of what a language server would compute dynamically: a graph of code entities and the navigation edges between them (definition, reference, hover, implementation, type-definition).

Shape

  • Graph of vertices and edges. Documents, ranges, result-sets, hover-results, definition-results, reference-results are vertices; edges connect them.
  • Standard LSP-shaped queries — the same operations an LSP server would answer at runtime, but served from a precomputed index.
  • Per-project indexes. LSIF indexes are typically produced per project / per repo by a language-specific indexer.
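
The vertex/edge shape described above can be sketched as a tiny reader for a line-delimited JSON dump. This is a minimal illustration, not a complete LSIF consumer: the sample dump is hand-written and heavily simplified, the labels and field names (`range`, `resultSet`, `next`, `textDocument/definition`, `item`, `outV`, `inV`, `inVs`) follow the LSIF specification as commonly documented, and the helper names `load_dump`, `targets` and `find_definitions` are hypothetical.

```python
import json

# Hand-written, simplified sample (an assumption for illustration).
# Real dumps also carry metaData/project vertices and extra fields.
SAMPLE_DUMP = """\
{"id": 1, "type": "vertex", "label": "document", "uri": "file:///src/a.py"}
{"id": 2, "type": "vertex", "label": "range", "start": {"line": 0, "character": 4}, "end": {"line": 0, "character": 7}}
{"id": 3, "type": "vertex", "label": "range", "start": {"line": 5, "character": 0}, "end": {"line": 5, "character": 3}}
{"id": 4, "type": "edge", "label": "contains", "outV": 1, "inVs": [2, 3]}
{"id": 5, "type": "vertex", "label": "resultSet"}
{"id": 6, "type": "edge", "label": "next", "outV": 2, "inV": 5}
{"id": 7, "type": "edge", "label": "next", "outV": 3, "inV": 5}
{"id": 8, "type": "vertex", "label": "definitionResult"}
{"id": 9, "type": "edge", "label": "textDocument/definition", "outV": 5, "inV": 8}
{"id": 10, "type": "edge", "label": "item", "outV": 8, "inVs": [2]}
"""


def load_dump(text):
    """Split a dump into a vertex table (by id) and a list of edges."""
    vertices, edges = {}, []
    for line in text.splitlines():
        if not line.strip():
            continue
        element = json.loads(line)
        if element["type"] == "vertex":
            vertices[element["id"]] = element
        else:
            edges.append(element)
    return vertices, edges


def targets(edges, out_v, label):
    """Ids reached from out_v over edges with the given label."""
    found = []
    for edge in edges:
        if edge["outV"] == out_v and edge["label"] == label:
            if "inVs" in edge:          # one-to-many edge
                found.extend(edge["inVs"])
            else:                       # one-to-one edge
                found.append(edge["inV"])
    return found


def find_definitions(edges, range_id):
    """Answer a 'go to definition' query from the precomputed graph."""
    # A range may carry results itself or delegate to a resultSet via 'next'.
    holders = [range_id] + targets(edges, range_id, "next")
    definitions = []
    for holder in holders:
        for result in targets(edges, holder, "textDocument/definition"):
            definitions.extend(targets(edges, result, "item"))
    return definitions


vertices, edges = load_dump(SAMPLE_DUMP)
definition_ids = find_definitions(edges, 3)   # range 3 is a use site
definition_starts = [vertices[i]["start"] for i in definition_ids]
```

The lookup never touches source code: it only walks precomputed edges, which is the point of serving LSP-shaped queries from an index.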

Why Glean chose a different shape

Meta's post names LSIF as the ecosystem's incumbent and explicitly notes that Glean "wasn't tied either to particular programming languages or to any particular use case" — a deliberate generality step beyond LSIF's LSP-centric operation set. Specifically:

  • LSIF's query surface is fixed by LSP: definition, references, hover, etc. Glean's is Angle, a general declarative query language.
  • LSIF is a file format for serialising an indexer's output. Glean is a service with replicated storage, network query, and cross-revision stacking (see concepts/stacked-immutable-databases).
  • LSIF's data model is the LSP feature-set. Glean stores "arbitrary non-programming-language data too" — which is what enables Meta's dead-code / API-migration / test-selection / RAG use cases that are out of scope for LSIF. See the full list on systems/glean.
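
The difference between a fixed operation set and a general query surface can be illustrated with a toy fact store. Everything below is an assumption made for illustration: the `(predicate, subject, object)` tuples, the `query` helper and the predicate names are hypothetical, and none of it is Angle syntax or Glean's actual schema or API.

```python
# Toy facts as (predicate, subject, object) tuples. Hypothetical data,
# not Glean's schema.
FACTS = [
    ("defines", "a.py", "parse"),
    ("defines", "a.py", "unused_helper"),
    ("references", "b.py", "parse"),
    ("owner", "a.py", "team-search"),  # a non-code fact stored alongside code facts
]


def query(facts, predicate=None, subject=None, obj=None):
    """Generic pattern match: None acts as a wildcard for that position."""
    return [
        fact for fact in facts
        if (predicate is None or fact[0] == predicate)
        and (subject is None or fact[1] == subject)
        and (obj is None or fact[2] == obj)
    ]


def references_to(facts, symbol):
    """An LSP-style fixed operation: who references this symbol?"""
    return [subject for _, subject, _ in query(facts, "references", obj=symbol)]


def unreferenced_definitions(facts):
    """A dead-code style question that no fixed LSP operation answers directly."""
    referenced = {obj for _, _, obj in query(facts, "references")}
    return sorted(
        obj for _, _, obj in query(facts, "defines") if obj not in referenced
    )


dead = unreferenced_definitions(FACTS)
owners = [obj for _, _, obj in query(FACTS, "owner", subject="a.py")]
```

In this sketch `references_to` stands in for the LSP-shaped surface, while `unreferenced_definitions` and the `owner` lookup are the kind of ad hoc questions a general query language over arbitrary facts makes possible.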

This is not a contradiction in the wiki sense — LSIF and Glean solve overlapping problems from different starting points. LSIF optimises for easy integration with the LSP ecosystem; Glean optimises for serving a single hyperscale monorepo at Meta.

