CONCEPT Cited by 1 source
Psychoacoustic compression
Definition
Psychoacoustic compression exploits properties of the human auditory system, such as masking and perceptual thresholds, together with the statistical structure of the human voice, to discard information the listener cannot perceive. It is the primary mechanism by which audio codecs achieve 25–30× compression ratios without a proportional loss in perceived quality (Source: sources/2024-06-13-meta-mlow-metas-low-bitrate-audio-codec).
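To make the 25–30× figure concrete, the arithmetic can be sketched as below. The input format (uncompressed 16-bit mono PCM at 48 kHz) and the coded bitrates of 25 and 30 kbps are assumptions chosen for illustration; this note does not state them.

```python
# Illustrative arithmetic only. The PCM format and the coded bitrates
# below are assumptions, not figures stated in this note.
SAMPLE_RATE_HZ = 48_000
BITS_PER_SAMPLE = 16
CHANNELS = 1


def raw_bitrate_kbps(sample_rate_hz: int, bits_per_sample: int, channels: int) -> float:
    """Bitrate of uncompressed PCM audio in kilobits per second."""
    return sample_rate_hz * bits_per_sample * channels / 1000


def compression_ratio(raw_kbps: float, coded_kbps: float) -> float:
    """How many times smaller the coded stream is than the raw stream."""
    return raw_kbps / coded_kbps


raw = raw_bitrate_kbps(SAMPLE_RATE_HZ, BITS_PER_SAMPLE, CHANNELS)
for coded in (25, 30):
    ratio = compression_ratio(raw, coded)
    print(f"{raw:.0f} kbps -> {coded} kbps: {ratio:.1f}x")
```

Under these assumptions the raw stream is 768 kbps, so coding it at 25–30 kbps corresponds to a ratio of roughly 26–31×.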
Mechanism
From Meta's MLow post:
"Good codecs can strike a balance among the trio of quality, bitrate, and complexity by exploiting deep knowledge about the nature of the audio signal as well as by using psychoacoustics."
The post does not unpack which psychoacoustic phenomena MLow exploits. In the general codec literature these include:
- Frequency masking: loud tones hide quieter tones nearby in frequency.
- Temporal masking: loud sounds hide quieter sounds that occur immediately before or after them.
- Voice-band prioritisation: the ear is most sensitive between roughly 0.5 and 4 kHz, which is why NarrowBand mode is "almost but not quite" sufficient for voice and fails at naturalness.
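As a rough illustration of frequency masking, the sketch below drops any tone that falls under a simplified masking threshold cast by another tone. It is a toy model, not MLow's method or any real codec's psychoacoustic model: the spreading slopes and the masking offset are hypothetical constants chosen for readability, and only the Bark-scale conversion is a standard approximation (Zwicker and Terhardt).

```python
import math

# Illustrative constants (assumptions, not taken from any specific codec).
LOWER_SLOPE_DB_PER_BARK = 27.0  # threshold falls steeply below the masker
UPPER_SLOPE_DB_PER_BARK = 10.0  # and more gently above it
MASKING_OFFSET_DB = 10.0        # threshold peak sits this far under the masker


def hz_to_bark(f_hz: float) -> float:
    """Zwicker-Terhardt approximation of the Bark critical-band scale."""
    return 13.0 * math.atan(0.00076 * f_hz) + 3.5 * math.atan((f_hz / 7500.0) ** 2)


def masking_threshold_db(masker_hz: float, masker_db: float, probe_hz: float) -> float:
    """Level below which a probe tone is treated as hidden by the masker."""
    dz = hz_to_bark(probe_hz) - hz_to_bark(masker_hz)
    slope = UPPER_SLOPE_DB_PER_BARK if dz >= 0 else LOWER_SLOPE_DB_PER_BARK
    return masker_db - MASKING_OFFSET_DB - slope * abs(dz)


def audible_components(components):
    """Keep the (frequency_hz, level_db) pairs not masked by any other pair."""
    kept = []
    for i, (f_hz, level_db) in enumerate(components):
        masked = any(
            level_db < masking_threshold_db(m_hz, m_db, f_hz)
            for j, (m_hz, m_db) in enumerate(components)
            if j != i
        )
        if not masked:
            kept.append((f_hz, level_db))
    return kept


# A loud 1 kHz tone, a quiet neighbour at 1.1 kHz, and a quiet distant tone.
tones = [(1000.0, 80.0), (1100.0, 40.0), (4000.0, 40.0)]
print(audible_components(tones))
```

With these constants the quiet 1.1 kHz tone is discarded because it sits well under the threshold cast by the 1 kHz tone, while the equally quiet 4 kHz tone survives because it is many critical bands away. A real encoder would typically use such a threshold to decide how coarsely to quantise each band rather than to drop components outright.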
Why it matters for codec design
Psychoacoustic insight is the non-obvious lever that moves a codec along the quality / bitrate / complexity triangle: better psychoacoustic modelling lets you hold quality constant while cutting bitrate, or cut complexity by discarding imperceptible information earlier in the pipeline.
Seen in
- sources/2024-06-13-meta-mlow-metas-low-bitrate-audio-codec — cited as one of the two levers ("deep knowledge of the signal" + "psychoacoustics") that good codecs pull.