Status: see individual classes below
Slides
- 2026-03-10 Neural speech processing (vocoders; audio codecs; representation learning)
- Download the slides for the class on 2026-03-10 (post-class version)
- 2026-03-17 Large Speech Language Models
- Download the slides for the class on 2026-03-17 (pre-class version)
- 2026-03-24 Beyond Text-to-Speech (cloning, conversion, anonymisation, …)
- Status: slides not ready
Demo pages
- Example audio codec: SoundStream
- Example Large Speech Language Models:
- Shannon Text Generator
- This is just a character N-gram model trained on a small amount of text, not an LLM!
- Try training it on natural language from different domains, or even on Python code.
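To make the "character N-gram" idea concrete, here is a minimal sketch in Python. It is not the Shannon Text Generator's actual code: the function names `train_char_ngram` and `generate` are hypothetical, and it assumes the simplest possible setup (raw counts, no smoothing, sampling the next character in proportion to how often it followed the previous N-1 characters in the training text).

```python
import random
from collections import Counter, defaultdict


def train_char_ngram(text, n=3):
    """Count which character follows each (n-1)-character context in `text`."""
    if n < 2:
        raise ValueError("n must be at least 2")
    order = n - 1
    counts = defaultdict(Counter)
    for i in range(len(text) - order):
        context = text[i:i + order]
        counts[context][text[i + order]] += 1
    return counts


def generate(counts, seed, length=80, rng=None):
    """Extend `seed` by up to `length` characters sampled from the counts."""
    rng = rng or random.Random()
    order = len(next(iter(counts)))  # context length used during training
    out = seed
    for _ in range(length):
        followers = counts.get(out[-order:])
        if not followers:
            break  # context never seen in training: stop early
        chars, weights = zip(*followers.items())
        out += rng.choices(chars, weights=weights)[0]
    return out


if __name__ == "__main__":
    corpus = "the cat sat on the mat and the rat sat on the cat"
    model = train_char_ngram(corpus, n=3)
    print(generate(model, seed="th", length=40, rng=random.Random(0)))
```

Swapping `corpus` for text from another domain (or for Python source code) changes the statistics the model picks up, which is the experiment suggested above. Larger values of `n` tend to copy the training text more closely, while smaller values give noisier output.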
- Example speech editing model: VoiceCraft
- Example Voice Conversion models: