A key step in parameterising speech is to move from the time domain to a domain in which distances between feature vectors are more meaningful, so that we can perform pattern matching.
There is no video here. Revise the following concepts from earlier in the course:
- The spectrum, obtained by the Fourier transform, is one possibility for a feature vector
- A better option would be filterbank features, a bit like those the cochlea produces
- Yet another option could be the filter coefficients of a source-filter model
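
As a reminder of what the first two options look like in practice, here is a minimal sketch using NumPy. It is only an illustration under stated assumptions: the function names, the 25 ms frame, the 16 kHz sample rate, the Hamming window and the choice of 20 filters are all made up for this example, and the triangular filters are spaced linearly in frequency for simplicity, whereas a real front end would normally space them on an auditory scale such as Mel. The source-filter option is not shown.

```python
import numpy as np


def magnitude_spectrum(frame):
    """Magnitude spectrum of a single frame, via the FFT.

    A Hamming window is applied first (an assumption of this sketch).
    """
    windowed = frame * np.hamming(len(frame))
    return np.abs(np.fft.rfft(windowed))


def triangular_filterbank(n_filters, n_bins):
    """Matrix of overlapping triangular filters, shape (n_filters, n_bins).

    Filter centres are spaced linearly across the FFT bins, for
    illustration only; an auditory (e.g. Mel) spacing is more usual.
    """
    edges = np.linspace(0, n_bins - 1, n_filters + 2)
    bins = np.arange(n_bins)
    bank = np.zeros((n_filters, n_bins))
    for i in range(n_filters):
        left, centre, right = edges[i], edges[i + 1], edges[i + 2]
        rising = (bins - left) / (centre - left)
        falling = (right - bins) / (right - centre)
        # Each filter is a triangle: zero outside [left, right], one at centre.
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank


def filterbank_features(frame, n_filters=20):
    """Log energy in each filter: a much shorter vector than the spectrum."""
    power = magnitude_spectrum(frame) ** 2
    bank = triangular_filterbank(n_filters, len(power))
    # The small constant avoids taking the log of zero.
    return np.log(bank @ power + 1e-10)


# A synthetic 25 ms frame: a 1 kHz sine wave sampled at 16 kHz (400 samples).
sample_rate = 16000
t = np.arange(400) / sample_rate
frame = np.sin(2 * np.pi * 1000 * t)

spectrum = magnitude_spectrum(frame)    # 201 values, one per FFT bin
features = filterbank_features(frame)   # 20 values, one per filter
```

Here `spectrum` has one value per FFT bin, while `features` summarises the same frame with one value per filter. Either could serve as the feature vector for that frame.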