
Apache Lucene¶

Apache Lucene is the Java full-text search library that underpins Elasticsearch and OpenSearch, the Elasticsearch fork that AWS offers as the managed Amazon OpenSearch Service. Lucene's on-disk unit of persistence is the segment: an immutable, self-contained inverted index (stored as a small set of files) produced by flushing an in-memory buffer. Background merges periodically combine smaller segments into larger ones.
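The flush-then-merge lifecycle described above can be sketched with a toy model. This is a minimal illustration, not Lucene's API: the class `ToyIndex` and its methods are hypothetical names, each "segment" is just a read-only term-to-doc-ID mapping held in memory, and the model assumes documents become searchable only after a flush.

```python
from collections import defaultdict
from types import MappingProxyType


class ToyIndex:
    """Toy model of a segment-based index (hypothetical; not Lucene's API)."""

    def __init__(self):
        self.buffer = []        # documents added but not yet flushed
        self.segments = []      # immutable segments, oldest first
        self._next_doc_id = 0

    def add(self, text):
        """Buffer a document in memory; it is not part of any segment yet."""
        self.buffer.append((self._next_doc_id, text))
        self._next_doc_id += 1

    def flush(self):
        """Turn the in-memory buffer into one new immutable segment."""
        if not self.buffer:
            return
        postings = defaultdict(list)
        for doc_id, text in self.buffer:
            for term in set(text.lower().split()):
                postings[term].append(doc_id)
        # Freeze the segment: tuples inside a read-only mapping.
        frozen = MappingProxyType({t: tuple(ids) for t, ids in postings.items()})
        self.segments.append(frozen)
        self.buffer = []

    def merge(self):
        """Write one larger segment from all existing ones; inputs are never modified."""
        if len(self.segments) < 2:
            return
        merged = defaultdict(list)
        for seg in self.segments:
            for term, ids in seg.items():
                merged[term].extend(ids)
        big = MappingProxyType({t: tuple(sorted(ids)) for t, ids in merged.items()})
        self.segments = [big]

    def search(self, term):
        """Look the term up in every segment (assumption: the buffer is not searchable)."""
        hits = []
        for seg in self.segments:
            hits.extend(seg.get(term.lower(), ()))
        return sorted(hits)


idx = ToyIndex()
idx.add("immutable segment")
idx.flush()                      # first segment
idx.add("segment merge")
idx.flush()                      # second segment
idx.merge()                      # both rewritten into a single segment
print(idx.search("segment"))     # [0, 1]
```

The point of the sketch is that `flush` and `merge` only ever create new segments; an existing segment is never edited in place, which is what makes segments a convenient unit for caching and copying.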

This page is a stub; most wiki references to Lucene come through Elasticsearch / OpenSearch.

Why segments are load-bearing in the wiki¶

Segment-level replication boundary for CCR¶

Elasticsearch's Cross Cluster Replication (CCR; see concepts/cross-cluster-replication) replicates data once it has been persisted to Lucene segments. This gives CCR a durable, immutable replication unit: the follower cluster never sees documents still sitting in the leader's in-memory buffer; it only sees whole, persisted segments. GitHub's 2026 GHES search rewrite exploits this: the leader cluster's segments are the durable source of truth, and the follower cluster replays them. (Source: sources/2026-03-03-github-how-we-rebuilt-the-search-architecture-for-high-availability)

Segment-level replication is coarser than per-document replication but finer than per-index snapshots, which makes it a good fit for streaming, near-real-time replication with durability guarantees.
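The visibility boundary can be illustrated with a small sketch. It is not CCR's actual protocol: `Leader`, `Follower`, and `pull` are hypothetical names, segments are modeled as tuples of documents, and the model assumes the leader's segment list is append-only (merges are ignored).

```python
class Leader:
    """Toy leader: buffers documents and persists them as immutable segments."""

    def __init__(self):
        self.buffer = []      # unflushed documents, invisible to followers
        self.segments = []    # persisted segments, each an immutable tuple of docs

    def index(self, doc):
        self.buffer.append(doc)

    def flush(self):
        """Persist the buffer as one new segment."""
        if self.buffer:
            self.segments.append(tuple(self.buffer))
            self.buffer = []


class Follower:
    """Toy follower: copies whole segments and never reads the leader's buffer."""

    def __init__(self):
        self.segments = []

    def pull(self, leader):
        """Copy segments not seen yet; return how many were copied."""
        new = leader.segments[len(self.segments):]
        self.segments.extend(new)
        return len(new)

    def docs(self):
        return [doc for seg in self.segments for doc in seg]


leader, follower = Leader(), Follower()
leader.index("doc-a")
follower.pull(leader)
print(follower.docs())    # [] because doc-a is still in the leader's buffer
leader.flush()
follower.pull(leader)
print(follower.docs())    # ['doc-a']
```

Because the follower only ever copies complete tuples, it cannot observe a half-written batch, which mirrors the durability argument made above.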

Stub caveats¶

Not covered here: Lucene's merge policy (segment compaction), scoring-function internals (BM25, DFR, ...), codecs, index-time vs. query-time analyzer chains, the IndexWriter / IndexReader APIs, and NRT (near-real-time) search.

Seen in¶

sources/2026-03-03-github-how-we-rebuilt-the-search-architecture-for-high-availability — Lucene segments as the durable replication unit underpinning Elasticsearch CCR in GHES 3.19.1.

Related¶

systems/elasticsearch — distributed search engine over Lucene.
systems/amazon-opensearch-service — AWS's managed service for OpenSearch, the Elasticsearch fork; same Lucene core.
concepts/cross-cluster-replication — replicates data at the Lucene-segment granularity.
