CONCEPT Cited by 1 source
Boolean query DSL
A boolean query DSL is a structured query-document shape in which clauses combine via explicit AND / OR / NOT operators — usually rendered as nested tree objects — so that an upstream parser/compiler can emit arbitrarily-deep logical combinations of filter terms.
The canonical example is Elasticsearch's
bool query, but the shape is generic and appears across most
production search and query engines (OpenSearch, Solr, Vespa,
MongoDB's $and / $or, SQL WHERE ASTs internally, etc.).
The canonical Elasticsearch shape

```json
{
  "query": {
    "bool": {
      "must":     [ ... ],  // AND (participates in scoring)
      "should":   [ ... ],  // OR (or scoring boost)
      "must_not": [ ... ],  // NOT (eliminates matches)
      "filter":   [ ... ]   // AND without scoring contribution
    }
  }
}
```
Nested bool clauses inside any of must/should/must_not/filter
give you recursive boolean combinations of any depth. This is
the property that makes the DSL a natural emission target for an
AST traversal.
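As a rough illustration of that recursion, the sketch below writes a two-level query as a plain Python dict and serialises it. The field names (`state`, `label`, `author`) and their values are made up for the example and are not taken from any real index mapping.

```python
import json

# Hypothetical query:
#   state:open AND (label:bug OR label:regression) AND NOT author:bot
nested_query = {
    "query": {
        "bool": {
            "filter": [
                {"term": {"state": "open"}},
                {
                    # A bool clause nested inside the outer filter list.
                    "bool": {
                        "should": [
                            {"term": {"label": "bug"}},
                            {"term": {"label": "regression"}},
                        ],
                        "minimum_should_match": 1,
                    }
                },
            ],
            "must_not": [{"term": {"author": "bot"}}],
        }
    }
}

# The dict serialises directly into a request body.
request_body = json.dumps(nested_query, indent=2)
print(request_body)
```

Any element of the inner `should` list could itself be another `bool` object, which is all that "any depth" means in practice.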
must vs filter
Both AND-join their clauses. The difference is relevance scoring:
- `must` clauses contribute to `_score` (used to rank results).
- `filter` clauses run in filter context: a document has to match them, but they contribute nothing to `_score`, and Elasticsearch can cache their results. Use them for structured-equality predicates (`state:open`, `author_id:X`) where a ranking contribution is not wanted.
For UI-driven search like GitHub Issues, filter predicates
dominate, so the AST-to-bool-query compiler can default to
filter / must_not and only use must when a full-text term
is present in the input.
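A minimal sketch of that default, assuming the compiler has already sorted its AND-ed leaf clauses into structured predicates and full-text terms. The function name `and_join` and the example field names are hypothetical.

```python
from typing import Any

Clause = dict[str, Any]


def and_join(structured: list[Clause], full_text: list[Clause]) -> Clause:
    """AND-join clauses, scoring only the full-text ones.

    Structured predicates go under `filter` (no score contribution);
    full-text clauses go under `must` so they can influence ranking.
    """
    body: dict[str, list[Clause]] = {}
    if full_text:
        body["must"] = list(full_text)
    if structured:
        body["filter"] = list(structured)
    return {"bool": body}


# Hypothetical input: state:open author_id:42 plus free text "crash on start".
query = and_join(
    structured=[{"term": {"state": "open"}}, {"term": {"author_id": 42}}],
    full_text=[{"match": {"title": "crash on start"}}],
)
```

When no full-text term is present, the emitted `bool` has no `must` key at all, so nothing in the query asks for a relevance score.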
The AST-to-bool-query mapping
The rewrite of a boolean-DSL-compatible AST into an Elasticsearch bool-query document is essentially one-to-one on operators:
| AST node | Elasticsearch bool clause |
|---|---|
| `AND(a, b)` | `{bool: {must: [<a>, <b>]}}` (or `filter` if non-scoring) |
| `OR(a, b)` | `{bool: {should: [<a>, <b>], minimum_should_match: 1}}` |
| `NOT(a)` | `{bool: {must_not: [<a>]}}` |
| leaf filter term `K:V` | backend-specific `term`/`terms`/`prefix` clause |
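The mapping can be sketched as a small recursive function. This is an illustration under stated assumptions, not GitHub's implementation: the node classes (`Term`, `And`, `Or`, `Not`), the `compile_node` name, and the choice of a plain `term` clause for every leaf are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Union


@dataclass(frozen=True)
class Term:
    field: str
    value: str


@dataclass(frozen=True)
class And:
    children: tuple[Node, ...]


@dataclass(frozen=True)
class Or:
    children: tuple[Node, ...]


@dataclass(frozen=True)
class Not:
    child: Node


Node = Union[Term, And, Or, Not]


def compile_node(node: Node, scoring: bool = False) -> dict[str, Any]:
    """Emit one bool-query fragment per AST node, recursing into children."""
    if isinstance(node, Term):
        # Assumption: every leaf is an exact-match term clause.
        return {"term": {node.field: node.value}}
    if isinstance(node, And):
        key = "must" if scoring else "filter"
        return {"bool": {key: [compile_node(c, scoring) for c in node.children]}}
    if isinstance(node, Or):
        return {
            "bool": {
                "should": [compile_node(c, scoring) for c in node.children],
                "minimum_should_match": 1,
            }
        }
    if isinstance(node, Not):
        return {"bool": {"must_not": [compile_node(node.child, scoring)]}}
    raise TypeError(f"unknown AST node: {node!r}")


# state:open AND (author:A OR author:B) AND NOT label:wontfix
ast = And((
    Term("state", "open"),
    Or((Term("author", "A"), Term("author", "B"))),
    Not(Term("label", "wontfix")),
))
compiled = compile_node(ast)
```

Because each AST node becomes exactly one fragment, the emitted document has the same tree shape as the AST, which is what makes the traversal so direct.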
One intra-leaf optimisation that real compilers make: a same-field
OR-of-values (e.g. `author:A OR author:B OR author:C`) collapses
to a single `terms` clause of the form
`{"terms": {"author": ["A", "B", "C"]}}` instead of three separate
`should` clauses. GitHub's rewrite ships this optimisation. (Source:
sources/2025-05-13-github-github-issues-search-now-supports-nested-queries-and-boolean)
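A hedged sketch of the collapse, assuming the OR's operands have already been reduced to `(field, value)` pairs. The helper name `compile_or_of_terms` is hypothetical, and the source does not describe how GitHub's compiler detects the same-field case.

```python
from typing import Any


def compile_or_of_terms(pairs: list[tuple[str, str]]) -> dict[str, Any]:
    """Compile an OR over leaf terms, collapsing same-field ORs to `terms`."""
    if not pairs:
        raise ValueError("an OR needs at least one operand")

    fields = {field for field, _ in pairs}
    if len(fields) == 1:
        field = pairs[0][0]
        # Drop duplicate values while keeping first-seen order.
        values = list(dict.fromkeys(value for _, value in pairs))
        if len(values) == 1:
            return {"term": {field: values[0]}}
        return {"terms": {field: values}}

    # Mixed fields: fall back to the general OR emission.
    return {
        "bool": {
            "should": [{"term": {f: v}} for f, v in pairs],
            "minimum_should_match": 1,
        }
    }


collapsed = compile_or_of_terms([("author", "A"), ("author", "B"), ("author", "C")])
mixed = compile_or_of_terms([("author", "A"), ("label", "bug")])
```

The collapsed form carries one clause instead of three, which also helps keep machine-generated queries under backend clause limits.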
Why the boolean DSL is the "right" codomain
- Composability: any boolean expression, however deeply nested, has a `bool` query emission of the same depth.
- Scoring-aware: the `must` vs `filter` separation maps cleanly to "does this term influence ranking?", a question every search product has to answer.
- Backend-neutral enough: the shape is idiomatic in Elasticsearch, OpenSearch, and other derivatives; porting an AST-emitter across them is mostly leaf-clause adjustments.
Caveats and limits
- Implicit scoring contribution: when all clauses are `must`, the Elasticsearch scorer combines sub-scores in ways that can surprise product owners. Force `filter` on equality predicates unless scoring is explicitly desired.
- Clause and depth limits: Elasticsearch caps the number of clauses in a query through a configurable setting (`indices.query.bool.max_clause_count`, default 1,024 in releases before 8.0; later releases derive the limit from available resources), and recent versions cap nesting depth separately (`indices.query.bool.max_nested_depth`). Product-level caps (GitHub Issues' 5 levels) usually intersect these first, but for machine-generated queries the backend limits are real.
- `should` with `minimum_should_match` requires care: a `bool` clause that has only `should` children defaults to OR-any (minimum 1); a `bool` clause that mixes `must` and `should` makes `should` contribute to scoring but not matching. AST compilers should emit `minimum_should_match: 1` explicitly whenever the AST says OR.
- The DSL is not SQL. Correlated subqueries, joins across indices, and aggregations as filters do not map cleanly through `bool`. Once a query language wants those, a different codomain (or a layer above Elasticsearch) is needed.
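For machine-generated queries, a compiler can check its own output before sending it. The sketch below measures `bool` nesting depth and leaf-clause count on an emitted document. The helper names and default limits are assumptions that mirror the numbers quoted above; the counting is only an approximation, since Elasticsearch applies its own accounting, and the sketch assumes every `must`/`should`/`must_not`/`filter` value is a list.

```python
from typing import Any

BOOL_KEYS = ("must", "should", "must_not", "filter")


def bool_stats(clause: dict[str, Any]) -> tuple[int, int]:
    """Return (bool nesting depth, number of leaf clauses) for a fragment."""
    body = clause.get("bool")
    if body is None:
        return 0, 1  # any non-bool clause counts as one leaf

    depth, leaves = 0, 0
    for key in BOOL_KEYS:
        for child in body.get(key, []):
            child_depth, child_leaves = bool_stats(child)
            depth = max(depth, child_depth)
            leaves += child_leaves
    return depth + 1, leaves


def check_limits(clause: dict[str, Any], max_depth: int = 5, max_leaves: int = 1024) -> None:
    """Raise ValueError if the fragment exceeds the assumed caps."""
    depth, leaves = bool_stats(clause)
    if depth > max_depth:
        raise ValueError(f"bool nesting depth {depth} exceeds {max_depth}")
    if leaves > max_leaves:
        raise ValueError(f"{leaves} leaf clauses exceed {max_leaves}")


fragment = {
    "bool": {
        "filter": [
            {"term": {"state": "open"}},
            {
                "bool": {
                    "should": [{"term": {"author": "A"}}, {"term": {"author": "B"}}],
                    "minimum_should_match": 1,
                }
            },
        ]
    }
}
stats = bool_stats(fragment)
check_limits(fragment)  # passes: well under both assumed caps
```

Rejecting an oversized query in the compiler gives the caller a clearer error than a backend rejection would.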
Seen in
- sources/2025-05-13-github-github-issues-search-now-supports-nested-queries-and-boolean
— GitHub's Issues-search rewrite compiles its AST into
Elasticsearch bool-query documents, with
`AND` → `must`, `OR` → `should`, `NOT` → `must_not`, and same-field OR-of-values compacting to `terms`. Worked example in the source page.
Related
- systems/elasticsearch — the canonical engine exposing this DSL.
- systems/amazon-opensearch-service — AWS's fork; same shape.
- concepts/abstract-syntax-tree — the upstream IR.
- patterns/ast-based-query-generation — the end-to-end compiler shape.
- concepts/query-shape — flat vs nested vs recursive query profiles at the backend.