In sampling and quantisation we saw that sampling a signal at a fixed rate puts an upper limit on the frequencies that can be represented. This limit is called the Nyquist frequency, and it is equal to half the sampling rate. Before sampling a signal, we must remove all energy above the Nyquist frequency. Here we will see what happens if we forget to do that: we get aliasing, in which energy above the Nyquist frequency is folded back to lower frequencies, producing artefacts in the resulting digital signal.
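As a rough illustration of where aliased energy ends up, here is a small sketch for the simplest case: a single pure tone sampled with no filtering beforehand. The function name `alias_freq` is my own invention for this example, and it assumes integer frequencies in Hz.

```shell
# Hypothetical helper (not part of any toolkit): report the frequency at which
# a pure tone appears after sampling with no anti-aliasing filter.
#   $1 = tone frequency in Hz, $2 = sample rate in Hz (both integers)
alias_freq() {
    # Sampling cannot distinguish frequencies that differ by a multiple of the sample rate
    r=$(( $1 % $2 ))
    # Anything above the Nyquist frequency (half the sample rate) folds back down
    if [ "$r" -gt $(( $2 / 2 )) ]; then
        r=$(( $2 - r ))
    fi
    echo "$r"
}

alias_freq 3000 8000   # prints 3000: below the 4000 Hz Nyquist frequency, so unchanged
alias_freq 6000 8000   # prints 2000: above Nyquist, so it folds back
```

So when a 16 kHz recording is reduced to 8 kHz without filtering, whatever was at 6 kHz in the original is heard at 2 kHz, mixed in with the energy that was really there.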
Try it for yourself – here are the materials to download (I recommend downloading and playing these in an audio application; web browsers do not always handle wav files correctly):
- Original waveform at 16 kHz sample rate: kdt_001
- Downsampled to 8 kHz
  - correctly: kdt_001_correct8000
  - incorrectly: kdt_001_aliased8000
- Downsampled to 4 kHz
  - correctly: kdt_001_correct4000
  - incorrectly: kdt_001_aliased4000
I performed the downsampling as follows. The incorrect method simply takes every 2nd or 4th sample from the file, with no low-pass filtering first (that is what the awk command does to an ASCII version of the waveform, with one sample per line):
    for N in 4000 8000
    do
      # decimation ratio: 4 for 4 kHz, 2 for 8 kHz
      R=$((16000 / N))
      echo Ratio is $R

      # incorrect downsampling, with no low pass filter
      x2x +s +a kdt_001.wav \
        | awk '!(NR%'${R}')' \
        | x2x +a +s \
        | ch_wave -f ${N} -F ${N} -itype raw -otype riff -o kdt_001_aliased${N}.wav

      # now correct downsampling, which includes low-pass filtering
      ch_wave -f 16000 -F ${N} -otype riff kdt_001.wav -o kdt_001_correct${N}.wav
    done
x2x is part of SPTK (the Speech Signal Processing Toolkit) and ch_wave is part of the Edinburgh Speech Tools.
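If you do not have those tools installed, the effect can still be seen on a toy signal using only awk. This is a sketch under some loud assumptions: the helper names `tone`, `decimate` and `smooth2` are hypothetical, the "signal" is just a column of numbers, and the two-sample average is only a crude stand-in for a proper low-pass filter. It is not what ch_wave does internally.

```shell
# Toy signal: alternates +1, -1, which is the highest frequency the
# original sample rate can represent (one sample per line).
tone() { awk -v n="$1" 'BEGIN { for (i = 0; i < n; i++) print (i % 2 ? -1 : 1) }'; }

# Keep every R-th line: the same awk filter as in the incorrect method above.
decimate() { awk -v r="$1" '!(NR % r)'; }

# Crude low-pass filter (an assumption for illustration only):
# replace each sample with the average of itself and the previous one.
smooth2() { awk 'NR > 1 { avg = (prev + $1) / 2; print avg } { prev = $1 }'; }

# Incorrect: no filtering. Every kept sample is -1, so the rapid
# oscillation has aliased to a constant offset.
tone 8 | decimate 2

# Better: filter first. The oscillation is removed before decimation,
# so every kept sample is 0 and nothing aliases.
tone 8 | smooth2 | decimate 2
```

The first pipeline turns a tone at the top of the original frequency range into a signal at 0 Hz that was never in the recording. The second pipeline loses the tone altogether, which is the right outcome, because the lower sample rate cannot represent it.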