GitHub Issues (search)¶
GitHub Issues is one of the oldest and most-used surfaces on GitHub; its search subsystem processes ~2,000 QPS (≈160 M queries/day). Issues URLs are routinely bookmarked and linked inside docs/PRs — so rewrites of the search path are constrained first by backward compatibility, not by feature ambition.
This page is a stub for the 2025 search rewrite specifically (sources/2025-05-13-github-github-issues-search-now-supports-nested-queries-and-boolean); Issues as a product (assignees, labels, projects integration, UI) is out of scope.
Search pipeline: Parse → Query → Normalize¶
A single Issues search query flows through three stages:
- Parse — user input string → intermediate structure.
- Query — intermediate structure → Elasticsearch query document → execute against Elasticsearch.
- Normalize — Elasticsearch JSON → Ruby objects, filter out records that were deleted from the source DB after indexing.
The 2025 rewrite replaced stages 1 + 2 (Parse and Query); stage 3 (Normalize) was deliberately untouched because it is about result-shaping rather than query semantics.
2025 rewrite: nested boolean queries¶
Module swap¶
The existing IssuesQuery module was replaced by a new
ConditionalIssuesQuery module that handles both legacy flat
queries and the new nested syntax (is:issue state:open
author:rileybroughten (type:Bug OR type:Epic)).
Parse: flat list → PEG AST¶
Flat-list parsing was replaced with a PEG grammar (via parslet) emitting an Abstract Syntax Tree. The grammar accepts both legacy filter-lists and nested boolean queries as different entry rules, so bookmarked URLs keep working by construction, not by a retry-both-parsers fallback.
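To make the parse stage concrete, here is a minimal sketch of a parser that turns both flat filter lists and nested boolean queries into one AST. It is not the production grammar and it does not use parslet: it is a hand-written recursive-descent stand-in built on Ruby's stdlib `StringScanner`. The class name `TinyQueryParser`, the node shapes, the operator precedence, and the treatment of whitespace as an implicit AND are all assumptions made for illustration.

```ruby
require "strscan"

# Hypothetical AST node shapes (not GitHub's):
#   [:filter, key, value], [:and, nodes], [:or, nodes], [:not, node]
class TinyQueryParser
  def initialize(input)
    @s = StringScanner.new(input)
  end

  def parse
    node = or_expr
    skip_ws
    raise ArgumentError, "unexpected input at #{@s.pos}" unless @s.eos?
    node
  end

  private

  def skip_ws
    @s.skip(/\s+/)
  end

  # or_expr := and_expr ("OR" and_expr)*
  def or_expr
    nodes = [and_expr]
    nodes << and_expr while keyword("OR")
    nodes.size == 1 ? nodes.first : [:or, nodes]
  end

  # and_expr := unary (("AND")? unary)*
  # Assumption: whitespace between terms is an implicit AND.
  def and_expr
    nodes = [unary]
    nodes << unary while keyword("AND") || starts_term?
    nodes.size == 1 ? nodes.first : [:and, nodes]
  end

  # unary := "NOT" unary | primary
  def unary
    return [:not, unary] if keyword("NOT")
    primary
  end

  # primary := "(" or_expr ")" | key ":" value
  def primary
    skip_ws
    if @s.skip(/\(/)
      node = or_expr
      skip_ws
      @s.skip(/\)/) or raise ArgumentError, "missing ) at #{@s.pos}"
      node
    elsif @s.scan(/([\w\-]+):([^\s()]+)/)
      [:filter, @s[1], @s[2]]
    else
      raise ArgumentError, "expected a filter or ( at #{@s.pos}"
    end
  end

  # Consumes the keyword if it is next in the input.
  def keyword(word)
    skip_ws
    !!@s.skip(/#{word}\b/)
  end

  # True when another term follows (not end of input, ")" or "OR").
  def starts_term?
    skip_ws
    !@s.eos? && !@s.match?(/\)|OR\b/)
  end
end
```

With this sketch, `TinyQueryParser.new("is:issue label:bug").parse` yields a single `:and` node over two filters, and the nested example query from the Module swap section yields an `:and` node whose last child is an `:or` node. One grammar covering both shapes is what lets legacy URLs parse without a second parser.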
Query: linear map → recursive AST traversal¶
The old builder invoked one filter-class per filter-term and
concatenated the resulting Elasticsearch snippets. The new
builder recursively traverses the AST, mapping operators
onto the clauses of Elasticsearch's bool query
(AND → must, OR → should, NOT → must_not) and reusing
the old filter-class snippets at the leaves. Same-field
OR-of-values subtrees get compacted to a single terms clause.
See patterns/ast-based-query-generation for the generalised shape.
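The recursive traversal can be sketched as follows. This is an illustration under assumptions, not the real builder: the node shapes (`[:filter, key, value]`, `[:and, nodes]`, `[:or, nodes]`, `[:not, node]`) and the helper names are hypothetical, and `leaf_clause` is a stand-in for the existing per-filter classes. The output uses the clause names of Elasticsearch's bool query, where negation is expressed with `must_not`.

```ruby
# Stand-in for the legacy per-filter classes (hypothetical).
def leaf_clause(key, value)
  { term: { key => value } }
end

# Recursively maps an AST node to an Elasticsearch query fragment.
def to_es(node)
  case node[0]
  when :filter then leaf_clause(node[1], node[2])
  when :and    then { bool: { must: node[1].map { |n| to_es(n) } } }
  when :not    then { bool: { must_not: [to_es(node[1])] } }
  when :or     then or_clause(node[1])
  else raise ArgumentError, "unknown node #{node[0].inspect}"
  end
end

# OR over plain filters on one field compacts to a single terms clause;
# anything else becomes a bool/should.
def or_clause(children)
  fields = children.map { |n| n[0] == :filter ? n[1] : nil }
  if fields.none?(&:nil?) && fields.uniq.size == 1
    { terms: { fields.first => children.map { |n| n[2] } } }
  else
    { bool: { should: children.map { |n| to_es(n) }, minimum_should_match: 1 } }
  end
end
```

For example, `to_es([:or, [[:filter, "type", "Bug"], [:filter, "type", "Epic"]]])` returns `{ terms: { "type" => ["Bug", "Epic"] } }`, while an OR across different fields falls back to a `should` list.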
Normalize: unchanged¶
JSON → Ruby + post-filter for since-deleted records, same as before.
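A minimal sketch of what this stage does, assuming the standard Elasticsearch response shape (`hits.hits[]` with `_id` and `_source`). The `IssueRecord` struct, the `normalize` function, and the `live_ids` set (standing in for a lookup against the source database) are hypothetical names, not the real implementation.

```ruby
require "json"
require "set"

# Hypothetical result object; the real Ruby objects are richer.
IssueRecord = Struct.new(:id, :title)

# es_json:  raw Elasticsearch response body (String)
# live_ids: ids that still exist in the source DB (assumed lookup result)
def normalize(es_json, live_ids)
  hits = JSON.parse(es_json).dig("hits", "hits") || []
  hits
    .map { |h| IssueRecord.new(h["_id"].to_i, h.dig("_source", "title")) }
    .select { |issue| live_ids.include?(issue.id) } # drop since-deleted records
end
```

The post-filter exists because the index can lag the database: a record deleted after indexing may still come back as a hit.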
Validation and rollout¶
Three layered validation techniques:
- Test-suite re-run under both flag states. The existing unit + integration tests for IssuesQuery were run against ConditionalIssuesQuery; the search-endpoint API contract tests (GraphQL + REST) were run with the feature flag both enabled and disabled to verify contract invariance.
- Dark-shipping. For 1% of production Issues searches, the query ran against both systems in a background job (the user saw only the old result). Differences in the number of results (counted only when the two runs happened ≤1 s apart) were logged, triaged, and used to fix bugs before GA. See patterns/dark-ship-for-behavior-parity.
- Performance comparison via systems/scientist — for 1% of searches, ran equivalent queries against both systems to compare latency and error profiles before promoting the new path.
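The dark-ship comparison described in the list above can be sketched roughly like this. Everything here is a hypothetical illustration: the function names, the modulo-based 1% sampling, and the injected `old_search` / `new_search` callables (which return result counts) are assumptions, and the sketch does not use the scientist library.

```ruby
MAX_SKEW_SECONDS = 1.0
MONOTONIC_CLOCK = -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) }

# Assumed sampling rule: pick roughly `percent`% of requests by id.
def sampled?(request_id, percent: 1)
  request_id % 100 < percent
end

# Runs both paths and returns a diff record, or nil when there is
# nothing to log. Runs started more than 1 s apart are ignored, since
# the index may have changed between them.
def dark_ship_diff(query, old_search:, new_search:, clock: MONOTONIC_CLOCK)
  started_old = clock.call
  old_count = old_search.call(query)
  started_new = clock.call
  new_count = new_search.call(query)

  return nil if (started_new - started_old).abs > MAX_SKEW_SECONDS
  return nil if old_count == new_count

  { query: query, old: old_count, new: new_count }
end
```

The time-window check matters because Issues data changes constantly: two correct implementations can disagree on counts if enough time passes between the runs, so only near-simultaneous runs produce a meaningful diff.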
Blast-radius bounding¶
The new search integrated first with (a) the GraphQL API and (b) the per-repo Issues tab in the UI, then extended to (c) the Issues dashboard (2025-04-02) and (d) the REST API (2025-03-06). This is a surface-first rollout: blast radius is bounded by the set of code paths exposed, composed with the per-surface traffic-% rollout.
The rewrite was also validated internally first (dogfooding by GitHub employees), then with trusted partners, before general availability.
Product ceiling¶
Five nesting levels — derived from customer interviews as
the utility-vs-usability sweet spot. Not a technical cap
(parslet + Elasticsearch could handle more). Combined with
highlighted AND/OR keywords and the existing filter-term
auto-complete UI, the depth cap bounds the cognitive load of
writing queries.
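A depth cap like this is cheap to enforce. The sketch below assumes that "nesting level" means parenthesis depth in the raw query string, which the source does not spell out; the function names are hypothetical, and parentheses inside quoted values are not handled.

```ruby
MAX_NESTING_LEVELS = 5

# Returns the deepest parenthesis nesting reached in the query string.
def max_paren_depth(query)
  depth = 0
  deepest = 0
  query.each_char do |c|
    if c == "("
      depth += 1
      deepest = depth if depth > deepest
    elsif c == ")"
      depth -= 1
    end
  end
  deepest
end

def within_nesting_cap?(query, cap: MAX_NESTING_LEVELS)
  max_paren_depth(query) <= cap
end
```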
Historical note¶
- ~2015: first community request for OR-across-all-fields (isaacs/github#660).
- 2021-08-02: comma-separated values on label: shipped as implicit-OR stopgap.
- 2025-03-06 → 2025-04-09: new boolean syntax rolled out across API + UI + dashboard + REST API over ~5 weeks.
- 2025-05-13: launch post published.
Seen in¶
- sources/2025-05-13-github-github-issues-search-now-supports-nested-queries-and-boolean — canonical instance. Parse / Query / Normalize pipeline, AST + PEG-grammar rewrite, dark-ship + scientist validation, surface-first rollout, 5-level nesting cap, ~2 kQPS load profile.
Related¶
- systems/github — parent product.
- systems/elasticsearch — backing search engine.
- systems/parslet — PEG parser library used by the 2025 rewrite.
- systems/scientist — critical-path comparison library used for perf validation.
- concepts/abstract-syntax-tree — IR produced by the parse stage.
- concepts/peg-grammar — grammar class used.
- patterns/ast-based-query-generation — the structural rewrite shape.
- patterns/dark-ship-for-behavior-parity — live-traffic behaviour-diff harness used during rollout.