Module status: see individual classes below
Because this module covers very recent developments that change every year, there are no videos for it. We’ll cover everything in class.
- 2026-03-10 Neural speech processing (vocoders; audio codecs; representation learning)
  - Status: ready
  - We need to revisit representations of both text and speech; the key advance will be to find a discrete representation of speech.
- 2026-03-17 Large Speech Language Models
  - Status: ready
  - A discrete representation of speech will enable us to use models that can only generate discrete representations: language models.
- 2026-03-24 Beyond Text-to-Speech (cloning, conversion, anonymisation, …)
  - Status: not ready
  - Yes, there is more to life than TTS! We don’t have to limit ourselves to textual input!