CONCEPT Cited by 1 source
SQL efficiency ratio¶
Definition¶
SQL efficiency = db.SQL.tup_fetched / db.SQL.tup_returned
— the ratio of rows consumed by clients to rows returned
from storage, measured by the Postgres stats collector. It
is the P5 signal in Zalando's
concepts/golden-signals-rds|12 golden signals for RDS.
"SQL efficiency shows the percentage of rows fetched by the client vs rows returned from the storage. The metric does not necessarily show any performance issue with databases but high ratio of returned vs fetched rows should trigger the question about optimization of SQL queries, schema or indexes. For example, If you do
select count(*) from million_row_table, one million rows will be returned, but only one row will be fetched."
The two counters¶
Postgres maintains two cumulative counters in
pg_stat_database / pg_stat_user_tables:
tup_returned— rows scanned by the storage layer to answer queries. Includes rows filtered out byWHEREpredicates, rows aggregated away, rows that get discarded during joins.tup_fetched— rows actually returned to the client. What the application sees.
Healthy queries read approximately what they return. A primary-key lookup reads 1 row and returns 1 row — ratio 1.0. A well-indexed range query reads 100 rows via index and returns 100 — ratio 1.0.
When the ratio is low¶
tup_fetched / tup_returned near zero means the storage
layer scanned orders of magnitude more rows than the client
saw. The causes are schema / query-shape problems the
execution engine cannot fix on its own:
- Missing index.
WHERE status = 'pending'on a table with no index onstatus→ full scan, then filter. Storage returns N rows, client gets the matching subset. - Aggregation without index.
SELECT COUNT(*) FROM t WHERE x > 100without a suitable index → full scan to count; one row back to client. - Non-sargable predicates.
WHERE UPPER(name) = 'X'prevents index use, forcing a scan. See concepts/wildcard-prefix-non-sargable. - Poor join ordering / bloated intermediate results. The planner reads millions of rows from one side of a join and discards most of them.
- Over-fetching by ORM.
SELECT * FROM large_tablein application code when only a few columns are needed — the column version of the same problem.
Why the ratio is a cause-metric, not a symptom¶
The storage layer is doing its job efficiently — a full scan at indexed speeds is fast. The ratio is not about "the database is slow"; it's about "the application is asking the database to do far more work than necessary." P5 shows query-design malpractice, not database performance.
This makes the ratio particularly useful as a diagnostic pointer after other signals (C1 CPU, D1/D2 IOPS, D3 latency) indicate load. If CPU and IOPS are high and SQL efficiency is low, the remediation is query/schema/index work — not bigger hardware.
The remediation targets¶
A low SQL efficiency ratio points to three kinds of work:
- Indexes — add one where a predicate is filtering a scan. See concepts/mysql-access-type and concepts/index-selectivity for the taxonomy of access paths.
- Schema redesign — materialise an aggregate, add a summary table, denormalise a common query path. See concepts/reference-table-materialization.
- Query rewriting — restructure a non-sargable
predicate, use
EXISTSinstead ofIN (SELECT ...), push filter predicates below joins.
Seen in¶
- sources/2024-02-19-zalando-twelve-golden-signals —
canonicalises the ratio as signal P5 of the
12-golden-signals methodology; gives the
SELECT COUNT(*)example as the archetypal low-ratio workload.