GitHub Issues (search)

GitHub Issues is one of the oldest and most-used surfaces on GitHub; its search subsystem processes ~2,000 QPS (≈160 M queries/day). Issues URLs are routinely bookmarked and linked inside docs/PRs — so rewrites of the search path are constrained first by backward compatibility, not by feature ambition.

This page is a stub for the 2025 search rewrite specifically (sources/2025-05-13-github-github-issues-search-now-supports-nested-queries-and-boolean); Issues as a product (assignees, labels, projects integration, UI) is out of scope.

Search pipeline: Parse → Query → Normalize

A single Issues search query flows through three stages:

  1. Parse — user input string → intermediate structure.
  2. Query — intermediate structure → Elasticsearch query document → execute against Elasticsearch.
  3. Normalize — Elasticsearch JSON → Ruby objects, filter out records that were deleted from the source DB after indexing.

The 2025 rewrite replaced stages 1 + 2 (Parse and Query); stage 3 (Normalize) was deliberately untouched because it is about result-shaping rather than query semantics.

2025 rewrite: nested boolean queries

Module swap

The existing IssuesQuery module was replaced by a new ConditionalIssuesQuery module that handles both legacy flat queries and the new nested syntax (is:issue state:open author:rileybroughten (type:Bug OR type:Epic)).

Parse: flat list → PEG AST

Flat-list parsing was replaced with a PEG grammar (via parslet) emitting an Abstract Syntax Tree. The grammar accepts both legacy filter-lists and nested boolean queries as different entry rules, so bookmarked URLs keep working by construction, not by a retry-both-parsers fallback.
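A minimal sketch of the parse stage, written in Python rather than the Ruby/parslet used in production. It is a hand-written recursive-descent parser, not a generated PEG, and the tuple-based node shapes and the `parse` function are assumptions made for illustration. The point it shows is that one grammar can accept a legacy flat filter list (implicit AND) and nested boolean groups alike. Quoted values containing spaces are not handled here.

```python
import re

# Tokens: a single parenthesis, or any run without whitespace or parentheses.
TOKEN = re.compile(r"[()]|[^\s()]+")

def parse(query):
    """Parse a search string into a nested tuple AST (hypothetical node shapes)."""
    tokens = TOKEN.findall(query)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of query")
        pos += 1
        return tokens[pos - 1]

    def or_expr():
        nodes = [and_expr()]
        while peek() == "OR":
            take()
            nodes.append(and_expr())
        return nodes[0] if len(nodes) == 1 else ("or", nodes)

    def and_expr():
        # Adjacent terms are ANDed, so a legacy flat list parses unchanged.
        nodes = [unary()]
        while peek() not in (None, ")", "OR"):
            if peek() == "AND":
                take()
            nodes.append(unary())
        return nodes[0] if len(nodes) == 1 else ("and", nodes)

    def unary():
        tok = take()
        if tok == "NOT":
            return ("not", unary())
        if tok == "(":
            node = or_expr()
            if peek() != ")":
                raise ValueError("missing closing parenthesis")
            take()
            return node
        field, sep, value = tok.partition(":")
        if not sep or not field or not value:
            raise ValueError(f"expected field:value, got {tok!r}")
        return ("term", field, value)

    ast = or_expr()
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r}")
    return ast
```

With this sketch, `parse("is:issue state:open")` and `parse("is:issue state:open (type:Bug OR type:Epic)")` both go through the same entry point; the first simply yields an `and` node with no nested groups.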

Query: linear map → recursive AST traversal

The old builder invoked one filter-class per filter-term and concatenated the resulting Elasticsearch snippets. The new builder recursively traverses the AST, mapping operators onto the clauses of Elasticsearch's bool query (AND → must, OR → should, NOT → must_not) and reusing the old filter-class snippets at the leaves. Same-field OR-of-values subtrees get compacted to a single terms clause.

See patterns/ast-based-query-generation for the generalised shape.
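The recursive traversal can be sketched as follows. This is an illustration in Python under stated assumptions, not the production Ruby builder: the tuple AST shape, `to_es`, and `term_clause` (a stand-in for the reused per-filter classes) are hypothetical. Negation is emitted as `must_not`, which is the bool-query clause Elasticsearch provides for it, and an OR whose children are all terms on the same field is compacted into one `terms` clause.

```python
def term_clause(field, value):
    # Hypothetical stand-in for the existing per-filter classes used at the leaves.
    return {"term": {field: value}}

def to_es(node):
    """Recursively map a tuple AST onto an Elasticsearch query document."""
    kind = node[0]
    if kind == "term":
        _, field, value = node
        return term_clause(field, value)
    if kind == "not":
        return {"bool": {"must_not": [to_es(node[1])]}}
    children = node[1]
    if kind == "and":
        return {"bool": {"must": [to_es(child) for child in children]}}
    if kind == "or":
        all_terms = all(child[0] == "term" for child in children)
        fields = {child[1] for child in children if child[0] == "term"}
        if all_terms and len(fields) == 1:
            # Same-field OR-of-values: one terms clause instead of N should clauses.
            return {"terms": {fields.pop(): [child[2] for child in children]}}
        return {"bool": {"should": [to_es(child) for child in children],
                         "minimum_should_match": 1}}
    raise ValueError(f"unknown node kind {kind!r}")
```

For example, `("or", [("term", "type", "Bug"), ("term", "type", "Epic")])` becomes `{"terms": {"type": ["Bug", "Epic"]}}`, while an OR across different fields stays a `should` list.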

Normalize: unchanged

JSON → Ruby + post-filter for since-deleted records, same as before.
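As a rough illustration of the post-filter (in Python, whereas the real stage produces Ruby objects), the sketch below assumes a hypothetical `load_records` callable that looks up ids in the source database and returns only the rows that still exist. Hits whose rows were deleted after indexing drop out, and the Elasticsearch ranking order is preserved.

```python
def normalize(es_response, load_records):
    """Turn an Elasticsearch response into records, dropping since-deleted rows.

    load_records is a hypothetical lookup: list of ids -> {id: record}
    containing only the ids still present in the source database.
    """
    ids = [hit["_id"] for hit in es_response["hits"]["hits"]]
    found = load_records(ids)
    # Keep the search ranking order; skip ids the database no longer has.
    return [found[i] for i in ids if i in found]
```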

Validation and rollout

Three layered validation techniques:

  1. Test-suite re-run under both flag states. The existing unit + integration tests for IssuesQuery were run against ConditionalIssuesQuery; the search-endpoint API contract tests (GraphQL + REST) were run with the feature flag enabled AND disabled to verify contract invariance.
  2. Dark-shipping. For 1% of production Issues searches, the query ran against both systems in a background job (the user saw only the old result). Differences in result counts, compared only when the two runs completed within 1 s of each other, were logged, triaged, and used to fix bugs before GA. See patterns/dark-ship-for-behavior-parity.
  3. Performance comparison via systems/scientist. For 1% of searches, equivalent queries ran against both systems so that latency and error profiles could be compared before the new path was promoted.
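The dark-ship comparison in step 2 can be sketched roughly as below. This is a simplified Python illustration: all names (`dark_ship_compare`, `old_search`, `new_search`, `log`) are hypothetical, the 1% sampling and the background job are omitted, and both searches run inline. It shows the two properties the text describes: the caller only ever receives the old result, and result counts are compared only when the two runs finished within 1 s of each other, since a larger gap could reflect index changes rather than a bug.

```python
import time

def dark_ship_compare(query, old_search, new_search, log,
                      max_gap_s=1.0, clock=time.monotonic):
    """Run both search paths, log count mismatches, always return the old result."""
    old_results = old_search(query)
    old_done = clock()
    new_results = new_search(query)
    new_done = clock()

    # Only compare runs that finished close together in time.
    if abs(new_done - old_done) <= max_gap_s:
        if len(old_results) != len(new_results):
            log({"query": query,
                 "old_count": len(old_results),
                 "new_count": len(new_results)})
    return old_results
```

Injecting `clock` keeps the time-gap rule testable without sleeping.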

Blast-radius bounding

The new search integrated first with (a) the GraphQL API and (b) the per-repo Issues tab in the UI, then extended to (c) the Issues dashboard (2025-04-02) and (d) the REST API (2025-03-06). This is a surface-first rollout: blast radius is bounded by the set of code paths that can reach the new system, composed with the per-surface traffic-percentage rollout.
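One way to picture how the two bounds compose is a gate that checks the surface first and the traffic percentage second. The sketch below is an assumption-laden Python illustration, not GitHub's feature-flag system: the surface names, `use_new_search`, and the CRC32 bucketing scheme are all hypothetical.

```python
import zlib

# Hypothetical surface identifiers for the first rollout step.
ENABLED_SURFACES = frozenset({"graphql", "repo_issues_tab"})

def use_new_search(surface, actor_id, percent, enabled_surfaces=ENABLED_SURFACES):
    """Return True if this request should take the new search path."""
    # Bound 1: surfaces that are not enabled never reach the new code path.
    if surface not in enabled_surfaces:
        return False
    # Bound 2: within an enabled surface, admit a stable percentage of actors.
    bucket = zlib.crc32(f"{surface}:{actor_id}".encode()) % 100
    return bucket < percent
```

Hashing on the actor keeps each actor's experience stable across requests, and widening the rollout means either adding a surface to the set or raising `percent` for it.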

Also validated internally first (dogfood on GitHub employees), then with trusted partners, before general availability.

Product ceiling

Five nesting levels — derived from customer interviews as the utility-vs-usability sweet spot. Not a technical cap (parslet + Elasticsearch could handle more). Combined with highlighted AND/OR keywords and the existing filter-term auto-complete UI, the depth cap bounds the cognitive load of writing queries.
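A depth cap like this can be enforced before parsing with a simple parenthesis counter. The Python sketch below is illustrative only: how GitHub counts levels is not stated in the source, so it assumes a level is one layer of parentheses, and the names `nesting_depth` and `within_depth_limit` are hypothetical. Parentheses inside quoted values are not special-cased.

```python
MAX_DEPTH = 5  # product ceiling from the text; counting convention is assumed

def nesting_depth(query):
    """Return the deepest parenthesis nesting in a query string."""
    depth = deepest = 0
    for ch in query:
        if ch == "(":
            depth += 1
            deepest = max(deepest, depth)
        elif ch == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced closing parenthesis")
    if depth != 0:
        raise ValueError("unbalanced opening parenthesis")
    return deepest

def within_depth_limit(query, limit=MAX_DEPTH):
    return nesting_depth(query) <= limit
```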

Historical note

  • ~2015: first community request for OR-across-all-fields (isaacs/github#660).
  • 2021-08-02: comma-separated values on label: shipped as implicit-OR stopgap.
  • 2025-03-06 → 2025-04-09: new boolean syntax rolled out across the GraphQL API, the per-repo Issues tab, the Issues dashboard, and the REST API over ~5 weeks.
  • 2025-05-13: launch post published.

Seen in
