
Module status: see individual classes below

Because we are now covering very recent developments, which change every year, there are no videos for this module. We’ll cover everything in class.

  • 2026-03-10 Neural speech processing (vocoders; audio codecs; representation learning)
    • Status: ready
    • we need to revisit representations of both text and speech; the key advance will be to find a discrete representation of speech
  • 2026-03-17 Large Speech Language Models
    • Status: ready
    • a discrete representation of speech will enable us to use language models, which can only generate sequences of discrete symbols
  • 2026-03-24 Beyond Text-to-Speech (cloning, conversion, anonymisation, …)
    • Status: not ready
    • yes, there is more to life than TTS! We don’t have to limit ourselves to textual input!