The Physics of Sound
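Sound is a mechanical wave: vibrations that propagate through a medium such as air. Its key properties are frequency (perceived as pitch, measured in Hz) and amplitude (perceived as loudness, usually expressed in dB). Human hearing spans roughly 20 Hz to 20 kHz. The fundamental frequency of the voice is much lower (about 80–300 Hz), while the 300–3,400 Hz band used in traditional telephony is enough to keep speech intelligible.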
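
To make frequency and amplitude concrete, here is a minimal sketch that synthesizes a pure tone and measures both properties. The helper names (`sine_wave`, `rms`, `level_difference_db`, `estimate_frequency`) are illustrative assumptions, not part of any library, and zero-crossing counting is only a rough frequency estimate that works for clean, single-tone signals.

```python
import math

def sine_wave(freq_hz, amplitude, duration_s, sample_rate=16_000):
    """Return samples of a pure tone: amplitude * sin(2*pi*f*t)."""
    n_samples = int(round(duration_s * sample_rate))
    return [amplitude * math.sin(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(n_samples)]

def rms(samples):
    """Root-mean-square level of a signal."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def level_difference_db(samples_a, samples_b):
    """Level of a relative to b in decibels: 20 * log10(rms_a / rms_b)."""
    return 20 * math.log10(rms(samples_a) / rms(samples_b))

def estimate_frequency(samples, sample_rate=16_000):
    """Rough frequency estimate: upward zero crossings per second."""
    crossings = sum(1 for prev, cur in zip(samples, samples[1:])
                    if prev < 0 <= cur)
    return crossings * sample_rate / len(samples)

quiet = sine_wave(200, 0.1, 1.0)  # 200 Hz tone, small amplitude
loud = sine_wave(200, 1.0, 1.0)   # same pitch, 10x the amplitude

print(level_difference_db(loud, quiet))  # about 20 dB louder
print(estimate_frequency(loud))          # close to 200 Hz
```

Because the decibel scale is logarithmic, a tenfold increase in amplitude corresponds to a 20 dB increase in level, while the pitch (frequency) is unchanged.
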
| Property | Unit | Typical Value for Speech | Perception |
|---|---|---|---|
| Frequency | Hertz (Hz) | 80–300 Hz (fundamental) | Pitch (high vs. low) |
| Amplitude | Decibels (dB SPL) | 40–80 dB for normal speech | Loudness |
| Duration | Milliseconds (ms) | 50–300 ms per phoneme | Rhythm, prosody |
| Sample Rate | Hz (samples/sec) | 16,000 Hz (common ASR standard) | Audio quality (captures frequencies up to 8 kHz) |
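
The numbers in the table can be related to each other with a little arithmetic. The sketch below assumes the 16,000 Hz sample rate from the table; the function names are hypothetical helpers introduced only for illustration.

```python
SAMPLE_RATE_HZ = 16_000  # ASR sample rate from the table above

def nyquist_hz(sample_rate_hz):
    """Highest frequency representable without aliasing: half the sample rate."""
    return sample_rate_hz / 2

def samples_in(duration_ms, sample_rate_hz=SAMPLE_RATE_HZ):
    """Number of samples covering a duration given in milliseconds."""
    return int(round(duration_ms * sample_rate_hz / 1000))

def samples_per_period(freq_hz, sample_rate_hz=SAMPLE_RATE_HZ):
    """Number of samples in one cycle of a tone at freq_hz."""
    return sample_rate_hz / freq_hz

print(nyquist_hz(SAMPLE_RATE_HZ))       # 8000.0
print(samples_in(50), samples_in(300))  # 800 4800
print(samples_per_period(80))           # 200.0
```

At 16 kHz, a single phoneme lasting 50–300 ms therefore spans 800 to 4,800 samples, and one cycle of an 80 Hz fundamental spans 200 samples.
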
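NOTE: Phonemes: English has about 44 phonemes, the smallest units of sound that distinguish meaning (e.g., /p/ vs. /b/ in "pat" vs. "bat"). Traditional ASR pipelines map acoustic features to phoneme sequences and then to words; many end-to-end models instead predict characters or subword units directly.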
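
As a toy illustration of the phoneme-to-word step, the sketch below looks up phoneme sequences in a tiny hand-written lexicon. The lexicon, the ARPAbet-style symbols, and the function names are all assumptions made for this example; a real system would use a full pronunciation dictionary and a decoder that scores many candidate sequences.

```python
# Hypothetical toy lexicon: phoneme tuples (ARPAbet-style symbols) -> words.
LEXICON = {
    ("P", "AE", "T"): "pat",
    ("B", "AE", "T"): "bat",
    ("K", "AE", "T"): "cat",
}

def phonemes_to_word(phonemes):
    """Look up a phoneme sequence; return '<unk>' if it is not in the lexicon."""
    return LEXICON.get(tuple(phonemes), "<unk>")

def is_minimal_pair(a, b):
    """True if two equal-length phoneme sequences differ in exactly one phoneme."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1

print(phonemes_to_word(["P", "AE", "T"]))                  # pat
print(phonemes_to_word(["B", "AE", "T"]))                  # bat
print(is_minimal_pair(("P", "AE", "T"), ("B", "AE", "T")))  # True
```

Changing a single phoneme (/p/ to /b/) changes the word, which is what makes the two sounds distinct phonemes in English.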