SYSTEM Cited by 7 sources
Elasticsearch¶
Elasticsearch is the Apache-Lucene-based distributed search
engine that powers a large share of production full-text + filtered
search at scale. It exposes a JSON-based Query DSL; the query
type most relevant to structured search products is the
bool query, which takes nested must / should / must_not /
filter clauses. must, should, and must_not correspond naturally to
AND / OR / NOT logic; filter is an AND that does not affect scoring.
Within the wiki this page is a stub created for cross-referencing from sources/2025-05-13-github-github-issues-search-now-supports-nested-queries-and-boolean; Elasticsearch is a large product with many capabilities not covered here (search relevance, k-NN vector search, ILM / snapshotting) or covered only as far as the ingested sources touch them (aggregations, cross-cluster replication).
AWS's managed fork is Amazon OpenSearch Service, which uses the same bool-query shape.
Role in the wiki¶
Backing store for GitHub Issues search (2025-05-13)¶
GitHub Issues search is backed by Elasticsearch. The 2025 rewrite's Query pipeline stage compiles an AST from user search input into a nested Elasticsearch bool query:
| AST node | Elasticsearch bool clause |
|---|---|
| AND | must |
| OR | should |
| NOT | must_not (the source post writes should_not; the bool query's negation clause is must_not) |
| leaf filter-term (author:monalisa) | term / terms / prefix |
The recursive bool-query tree is the natural codomain for an
AST-driven search DSL, and patterns/ast-based-query-generation is
the structural fit. A worked before/after example is in the source
page. Same-field OR-of-values subtrees get compacted into a single
terms clause as an intra-leaf optimization.
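
The mapping in the table above can be sketched as a small recursive compiler. This is a minimal illustration, not GitHub's implementation: the node classes (Term, And, Or, Not) and the function name compile_node are hypothetical, and every leaf is emitted as a plain term clause for simplicity.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Term:
    field: str
    value: str


@dataclass
class And:
    children: list


@dataclass
class Or:
    children: list


@dataclass
class Not:
    child: "Node"


Node = Union[Term, And, Or, Not]


def compile_node(node):
    """Recursively emit one Elasticsearch query clause per AST node."""
    if isinstance(node, Term):
        return {"term": {node.field: node.value}}
    if isinstance(node, And):
        return {"bool": {"must": [compile_node(c) for c in node.children]}}
    if isinstance(node, Or):
        kids = node.children
        same_field_terms = (
            bool(kids)
            and all(isinstance(c, Term) for c in kids)
            and len({c.field for c in kids}) == 1
        )
        if same_field_terms:
            # Intra-leaf compaction: OR of values on one field -> one terms clause.
            return {"terms": {kids[0].field: [c.value for c in kids]}}
        return {
            "bool": {
                "should": [compile_node(c) for c in kids],
                "minimum_should_match": 1,
            }
        }
    if isinstance(node, Not):
        return {"bool": {"must_not": [compile_node(node.child)]}}
    raise TypeError(f"unknown AST node: {node!r}")


# Hypothetical input: author:monalisa AND (label:bug OR label:docs) AND NOT state:closed
ast = And([
    Term("author", "monalisa"),
    Or([Term("label", "bug"), Term("label", "docs")]),
    Not(Term("state", "closed")),
])
query = {"query": compile_node(ast)}
```

Setting minimum_should_match explicitly on the OR branch keeps its meaning stable if the emitted clause is later wrapped in a bool that also carries must or filter clauses.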
Scale: GitHub Issues search runs at ~2,000 QPS (≈160 M queries/day) on this substrate. (Source: sources/2025-05-13-github-github-issues-search-now-supports-nested-queries-and-boolean)
Bool query shape (the relevant API surface)¶
The canonical nested shape Elasticsearch exposes:
    {
      "query": {
        "bool": {
          "must":     [ ... ],  // AND: every clause must match; contributes to _score
          "should":   [ ... ],  // OR-like: at least one must match when there is no must or filter; otherwise optional and only boosts _score
          "must_not": [ ... ],  // NOT: no clause may match
          "filter":   [ ... ]   // AND with no scoring contribution
        }
      }
    }
bool clauses nest inside each other, which is why any
boolean-algebra AST can be emitted mechanically as a tree of bool
objects with leaf clauses at the bottom.
The filter vs must distinction matters for relevance scoring:
clauses under filter must match but contribute nothing to _score,
and Elasticsearch can cache their results, which makes filter the
right choice for structured-equality predicates (state:open,
author_id:X). Full-text queries typically go in must so that they
participate in scoring.
See the upstream reference: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-bool-query.html
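
As a minimal sketch of the filter vs must split described above: the function name build_issue_query and the title field are hypothetical; state and author_id come from the examples in the text.

```python
def build_issue_query(text, state, author_id):
    """Full-text clause goes in must (scored); equality predicates go in filter (unscored)."""
    return {
        "query": {
            "bool": {
                # Scored: how well the text matches affects ranking.
                "must": [{"match": {"title": text}}],
                # Unscored: a document either satisfies these or is excluded.
                "filter": [
                    {"term": {"state": state}},
                    {"term": {"author_id": author_id}},
                ],
            }
        }
    }


body = build_issue_query("crash on save", "open", 42)
```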
Cross Cluster Replication (CCR) — replication between clusters¶
Elasticsearch supports two distinct replication mechanisms:
- Intra-cluster primary/replica shard replication — primary and replica shards of the same index live in one cluster; ES rebalances shards across the cluster's nodes as a health action.
- Cross Cluster Replication (CCR) — one-way leader→follower replication between otherwise-independent ES clusters, at the Lucene segment granularity. Covered fully at concepts/cross-cluster-replication.
CCR's structural win is that it lets you align the storage topology to the application's primary/replica topology (see concepts/primary-replica-topology-alignment). The canonical wiki instance is the 2026-03-03 GHES search rewrite — GHES collapsed a multi-node ES cluster spanning its HA pair into per-node single-node clusters and joined them with CCR, removing ES's freedom to rebalance primary shards onto the read-only replica host (the old failure mode that caused mutual-dependency deadlocks). See patterns/single-node-cluster-per-app-replica.
CCR's auto-follow policy is new-only — it matches indexes created after the policy is installed and doesn't retroactively attach pre-existing indexes. Applying CCR to a long-lived deployment therefore requires an imperative bootstrap step for pre-existing indexes followed by the declarative auto-follow policy for future ones.
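
The bootstrap-then-auto-follow sequence can be sketched as a plan of REST requests. The two endpoints (the per-index follow API and the auto-follow pattern API) are Elasticsearch's CCR APIs; the function name, the remote-cluster alias, the index names, and the policy name are all hypothetical, and the sketch only builds the requests rather than sending them.

```python
def plan_ccr_setup(remote_cluster, existing_indexes, leader_pattern, policy_name):
    """Return (method, path, body) tuples: imperative bootstrap, then declarative policy."""
    requests = []
    # Step 1 (imperative): attach each pre-existing index explicitly,
    # because auto-follow only matches indexes created after the policy exists.
    for index in existing_indexes:
        requests.append((
            "PUT",
            f"/{index}/_ccr/follow",
            {"remote_cluster": remote_cluster, "leader_index": index},
        ))
    # Step 2 (declarative): install the auto-follow pattern for future indexes.
    requests.append((
        "PUT",
        f"/_ccr/auto_follow/{policy_name}",
        {
            "remote_cluster": remote_cluster,
            "leader_index_patterns": [leader_pattern],
            # Follower keeps the leader's name (an assumption for this sketch).
            "follow_index_pattern": "{{leader_index}}",
        },
    ))
    return requests


plan = plan_ccr_setup(
    remote_cluster="primary-node",           # hypothetical alias
    existing_indexes=["issues-1", "code-1"],  # hypothetical index names
    leader_pattern="issues-*",
    policy_name="follow-search-indexes",
)
```

Failover, index deletion, and upgrades are outside this plan; as the text notes, those remain the consumer's responsibility.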
CCR only covers document replication. Failover, index deletion, and upgrades are the consumer's responsibility — "Elasticsearch only handles the document replication, and we're responsible for the rest of the index's lifecycle" (GitHub, 2026-03-03).
Shard-allocation awareness + drain livelock (2024-06-20)¶
The 2024-06-20 Zalando source canonicalises a tricky interaction between three Elasticsearch primitives:
- Shard-allocation awareness — Elasticsearch is configured with an awareness.attributes value (AZ, rack, host) and refuses to place two copies of a shard on nodes sharing the attribute value.
- cluster.routing.allocation.exclude._ip — the runtime-mutable exclusion list used to tell Elasticsearch "don't place shards on these node IPs"; used by operators as the drain primitive.
- Zone-spread invariant — the combination of the above refuses to relocate a node's shards when the node is the only one in its awareness group (e.g. only pod in an AZ).
The drain-stuck-on-last-pod-in-zone failure mode (concepts/zone-aware-shard-allocation-stuck-drain) is inherent: Elasticsearch is correct to refuse the violation. The consequence is that an operator's drain loop can livelock, which makes drain handling a correctness-critical operator-pattern concern. Combined with the Zalando-disclosed zombie-exclusion-list partial-failure bug in es-operator, this produced three consecutive morning scale-out failures at Zalando Lounge and the canonical wiki posture of "read the source code of the orchestrating operator when the abstract model doesn't explain the symptom."
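
The interaction can be modelled in a few lines. This is a deliberately simplified sketch, not es-operator's logic: the helper names and the node-to-zone map are hypothetical, and the choice of the persistent settings scope is an assumption. The only Elasticsearch-specific name is the cluster.routing.allocation.exclude._ip setting, sent in the body of a cluster-settings update.

```python
EXCLUDE_IP = "cluster.routing.allocation.exclude._ip"


def can_drain(ip, zone_of, excluded):
    """A node is drainable only if another non-excluded node remains in its zone.

    Simplified model of the zone-spread invariant: with no peer in the same
    awareness group, Elasticsearch has nowhere legal to move the shards.
    """
    zone = zone_of[ip]
    peers = [
        node for node, z in zone_of.items()
        if z == zone and node != ip and node not in excluded
    ]
    return len(peers) > 0


def exclusion_settings(excluded):
    """Body for a cluster-settings update; null clears the list when it is empty."""
    value = ",".join(sorted(excluded)) or None
    return {"persistent": {EXCLUDE_IP: value}}


zone_of = {"10.0.0.1": "a", "10.0.0.2": "a", "10.0.0.3": "b"}  # hypothetical cluster
```

Checking can_drain before adding an IP to the exclusion list, and always writing the list back (including the empty case) after an interrupted drain, is one way to avoid both the livelock and a stale exclusion entry.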
Stub caveats¶
- This page covers only what the ingested sources touch. Not covered: relevance/_score tuning, ILM, Elasticsearch SQL, snapshot lifecycle, or operational runbooks (shard sizing, circuit breakers, mapping explosion). Aggregations, k-NN vector search, and cross-cluster replication appear only where a source touches them.
- The open-source / licensing split between Elasticsearch (Elastic) and OpenSearch (AWS fork) is not modelled here; the bool-query DSL is common to both.
Seen in¶
- sources/2025-05-13-github-github-issues-search-now-supports-nested-queries-and-boolean — backing engine for GitHub Issues search at ~2 kQPS; bool-query shape as codomain for the AST-based query builder.
- sources/2026-04-21-figma-the-search-for-speed-in-figma — Figma ran on Elasticsearch until late 2023 then migrated to managed OpenSearch. The post notes that while the two remain "mostly compatible, small differences have accumulated over the last three years, making the migration more challenging than expected" — a concrete data point on the cost of post-fork drift, which the post cites without enumerating the differences.
- sources/2026-03-03-github-how-we-rebuilt-the-search-architecture-for-high-availability — GHES 3.19.1 uses Elasticsearch CCR to replicate Lucene segments between per-node single-node clusters, replacing a failure-prone cross-node-cluster topology. Canonical wiki instance of the cross-cluster-replication primitive and of patterns/single-node-cluster-per-app-replica. Also the wiki's first real engagement with CCR's auto-follow new-only limitation and the patterns/bootstrap-then-auto-follow workaround.
- sources/2025-11-04-datadog-replication-redefined-multi-tenant-cdc-platform — Elasticsearch is the original destination system of Datadog's managed multi-tenant CDC replication platform. Datadog's Metrics Summary page was joining 82K × 817K rows on a shared Postgres, hitting p90 ~7 s. Rerouting search + faceted filtering to Elasticsearch, populated by a Debezium → Kafka → sink-connector pipeline with dynamic denormalisation at replication time, dropped page load ~30 s → ~1 s (up to 97%) at ~500 ms replication lag. Canonical wiki instance of the split direction of the database-and-search problem (Elasticsearch beside Postgres fed by CDC), as distinct from Cars24's Atlas-Search consolidation.
Related¶
- systems/lucene — the underlying storage/indexing engine; CCR replicates at the Lucene-segment grain.
- systems/github-enterprise-server — canonical CCR consumer.
- systems/mongodb-atlas — competing consolidated primary-plus-embedded-search substrate (Atlas + Atlas Search); Cars24's target of the bolt-on-search elimination.
- systems/atlas-hybrid-search — MongoDB's BM25-on-Lucene + vector peer; the consolidation destination Cars24 chose to replace an Elasticsearch-class bolt-on.
- systems/amazon-opensearch-service — AWS's managed fork, same bool-query shape.
- concepts/boolean-query-dsl — the AND/OR/NOT nested-clause API shape Elasticsearch popularised.
- concepts/query-shape — flat vs nested vs recursive as distinct backend load profiles; Elasticsearch exposes all three.
- patterns/ast-based-query-generation — recursive-AST-to-bool query emission.
- concepts/cross-cluster-replication — CCR concept page.
- concepts/primary-replica-topology-alignment — structural principle the GHES rewrite exemplifies on top of CCR.
- patterns/single-node-cluster-per-app-replica — deployment pattern that collapses a multi-node ES cluster into per-host single-node clusters joined by CCR.
- patterns/bootstrap-then-auto-follow — imperative-then-declarative pattern for new-only CCR auto-follow policies.
- concepts/synchronization-tax — cost class Elasticsearch-as-bolt-on-search beside a primary RDBMS exemplifies; Cars24 (2025-10-12) is the canonical wiki instance of leaving that shape.
- patterns/consolidate-database-and-search — the remediation pattern that obsoletes the Elasticsearch-beside-Postgres shape.
Seen in (migration off Elasticsearch)¶
- sources/2025-05-08-yelp-nrtsearch-100-incremental-backups-lucene-10 — Yelp has migrated >90% of former Elasticsearch traffic to Nrtsearch, their in-house Lucene-based search engine (open source at github.com/Yelp/nrtsearch). Nrtsearch 1.0.0 (2025-05) consolidates the architectural moves that differentiate it from Elasticsearch: incremental backup on commit to S3 as source of truth; ephemeral local SSD over EBS on the primary; external Coordinator over per-shard clusters (virtual sharding, to be replaced by Lucene 10's intra-single-segment parallel search); immutable index state with hot reload on replicas. One of the wiki's two canonical first-party Elasticsearch-migration instances at Tier-3-blog scale (pair: Yelp → Nrtsearch).
Seen in (legacy / archetypal side)¶
- sources/2025-10-12-mongodb-cars24-improves-search-for-300-million-users-with-atlas — post names "bolt-on search engine (such as Elasticsearch)" as the canonical example of the legacy search shape Cars24 left to consolidate on Atlas + Atlas Search. Cars24 had multiple engineering teams piping data into a single search index with race-logic + real-time-dashboard-update inefficiencies. The class is archetypal, not Cars24-specific; the wiki treats this as one instance of the synchronization-tax shape.
Seen in (operator drain failures)¶
- Zalando's 2024-06-20 es-operator drain write-up — canonical wiki instance of shard-allocation awareness as a drain-livelock producer. Zalando Lounge runs a 3-AZ Elasticsearch cluster on Kubernetes via es-operator; zone-aware shard allocation refused to relocate shards from the last pod in one AZ, stalling the operator's 999-retry drain loop overnight. Two es-operator bugs were uncovered by the trace-through-source-code session: (1) ctx-cancellation ignored in one retry loop (PR #405), (2) zombie exclusion-list state when drain is interrupted between mark and cleanup (WIP PR #423). Closing lesson: "Read the code. For solving difficult problems, understanding the related processes in abstract terms might not be enough."
Seen in (multimodal video search)¶
- sources/2026-04-04-netflix-powering-multimodal-intelligence-for-video-search — canonical wiki instance of nested-document indexing for cross-modality query. Netflix's multimodal video-search index stores each temporal bucket as a root Elasticsearch document (associated_ids, time_bucket_start_ns, time_bucket_end_ns) with source_annotations typed nested carrying heterogeneous per-modality child docs (CHARACTER_SEARCH with label; SCENE_SEARCH with label + embedding_vector). Document _id is the composite (asset_id, time_bucket), making model re-runs idempotent via composite-key upsert. The nested shape preserves cross-annotation-within-same-bucket semantics — "find buckets where a character with label Joey co-occurs with a scene annotation with label kitchen" — that a flat document model can't express. Netflix's framing: "this hierarchical data model is precisely what empowers users to execute highly efficient, cross-annotation queries at scale." See patterns/nested-elasticsearch-for-multimodal-query and patterns/three-stage-ingest-fusion-index.
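
The Joey-in-a-kitchen query can be sketched with two nested clauses against the same root document. The nested query shape and the source_annotations path and label field come from the text; the annotation_type field name, the helper names, and the colon separator in the composite _id are assumptions, since the source does not spell them out here.

```python
def bucket_doc_id(asset_id, time_bucket):
    """Composite _id: re-indexing the same (asset, bucket) overwrites instead of duplicating."""
    return f"{asset_id}:{time_bucket}"


def nested_annotation(annotation_type, label):
    """Match one child annotation; both conditions must hold on the SAME child doc."""
    return {
        "nested": {
            "path": "source_annotations",
            "query": {
                "bool": {
                    "filter": [
                        # annotation_type is a hypothetical discriminator field.
                        {"term": {"source_annotations.annotation_type": annotation_type}},
                        {"term": {"source_annotations.label": label}},
                    ]
                }
            },
        }
    }


def co_occurrence_query(character, scene):
    """Buckets containing both a matching character child and a matching scene child."""
    return {
        "query": {
            "bool": {
                "filter": [
                    nested_annotation("CHARACTER_SEARCH", character),
                    nested_annotation("SCENE_SEARCH", scene),
                ]
            }
        }
    }


body = co_occurrence_query("Joey", "kitchen")
```

In a flat document the type and label values of different annotations would be merged into the same arrays, so a query could not tell a Joey character annotation from a Joey scene label; the per-child nested clause is what keeps them apart.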
Seen in (self-inflicted DoS via high-cardinality faceting)¶
- sources/2025-12-16-zalando-the-day-our-own-queries-dosed-us-inside-zalando-search — canonical wiki instance of self-inflicted DoS via high-cardinality terms aggregation overload on Elasticsearch. An internal Zalando application, triggered by an automated maintenance workload + a processing-logic bug, issued 20–100 req/s of faceting queries whose terms aggregation was on the SKU field (unique product ID, cardinality ~100M). The per-query cost on the search thread pool was high enough that this low-volume stream (against a cluster serving "thousands of req/s") starved the pool and pinned coordinator CPU across the cluster serving two of Zalando's largest markets. The scatter-gather mechanics, the coordinator-node tier, and Adaptive Replica Selection are described in detail in the post; Zalando explicitly notes that ES 8.12+ search_worker pool parallel collectors do not accelerate high-cardinality terms — those run single-threaded per shard on search. Mitigation path: a market split via node.attr.market allocation filters to isolate the problem to one market, plus a 5-lever app-side load-shed (reduce replicas, throttle ingestion, disable non-critical calls, down-sample ML / promotions enrichment). Root cause identified via a Lightstep trace-exploration notebook that spotted the caller at 50× baseline fan-out. Follow-up program: per-client slow-query attribution via X-Opaque-Id (concepts/x-opaque-id-client-attribution / patterns/per-client-slow-query-dashboard), app-side query cost limits with dynamic thresholds (patterns/application-side-query-limit-with-dynamic-threshold), cluster-wide search.max_buckets guardrail (patterns/cluster-wide-aggregation-guardrail), and per-client rate limiting. The wiki's composite-identity page for the Zalando search substrate is systems/zalando-catalog-search; the Elasticsearch tier within it is systems/zalando-base-search. The closing lesson is the clinical-diagnostics aphorism inverted: concepts/zebra-not-horse-heuristic — "when you hear hoofbeats, think horses, not zebras. Because horses are common, and zebras are rare. But in our case, it happened to be a zebra."
Seen in (catalog discovery on a metadata graph)¶
- sources/2026-05-04-netflix-democratizing-machine-learning-building-the-model-lifecycle-graph — canonical wiki instance of single unified entities index with entityType discriminator for an ML metadata catalog. Netflix MDS indexes models, features, pipelines, datasets, and A/B tests in one entities index (differentiated by an entityType field) plus a separate owners index. Free-text search over a model name becomes one query against one index; faceted filters on entityType, ownership, tags, and domain-specific attributes (stored as key-value tag pairs like team::personalization, env::production, model.state::released) compose post-search. Relevance boosting ensures exact name matches score significantly higher than fuzzy/related-metadata matches — the canonical search-quality lever for catalog UX. ES is the discovery surface in MDS's dual-store pattern; Datomic is the navigation surface for multi-hop graph traversal queries. Re-indexing is triggered both synchronously on the ingest path (after the Datomic write) and asynchronously on enrichment (background jobs that derive new edges re-index the affected entities so the relationship-metadata related field stays current).
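
A single-index catalog query with an exact-match boost might be shaped as follows. This is a sketch under assumptions: entityType and the key::value tag format come from the text, while the name, name.keyword, and tags field names, the boost value, and the function name are hypothetical. The source does not say how facet filters are applied, so placing them in the bool filter is also an assumption.

```python
def catalog_search(text, entity_type=None, tags=()):
    """One query against the unified entities index; facets narrow without affecting score."""
    filters = []
    if entity_type is not None:
        filters.append({"term": {"entityType": entity_type}})
    for tag in tags:
        filters.append({"term": {"tags": tag}})
    return {
        "query": {
            "bool": {
                "should": [
                    # Exact name match, boosted well above the fuzzy match.
                    {"term": {"name.keyword": {"value": text, "boost": 10.0}}},
                    {"match": {"name": {"query": text, "fuzziness": "AUTO"}}},
                ],
                "minimum_should_match": 1,
                "filter": filters,
            }
        }
    }


body = catalog_search(
    "ranker-v2",  # hypothetical model name
    entity_type="model",
    tags=["team::personalization", "env::production"],
)
```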