Speech Synthesis (up to 2015-16)

Archived course

The current version of this course has a better layout and much better video content.

Introduction
An introduction to what this course covers, how it is taught, a brief history lesson, and a survey of current issues in speech synthesis.
- Introduction to the course
  Course outline. A taster of what is to come, by listening to a variety of TTS systems.
- History
  A brief history of text-to-speech synthesis, to provide some context for the state-of-the-art systems that this course will cover.
- Key challenges
  Taylor identifies the key challenges in text-to-speech. Of these, the generation of natural human-sounding speech is going to be the…
- Understanding the problem
  If we believe Taylor when he says we generally only need shallow processing of the text, then we can state…
- Looking ahead
  A very quick look at some interesting applications of TTS, to motivate the techniques that we will cover later in…
Unit selection
How waveform generation is achieved through selection and concatenation of waveform segments, the data required to do this, and the limitations of this approach.
- The method
  It seems simple: choose a suitable sequence of pre-recorded speech segments, and play them back in the right order. But…
- The database
  The quality of a unit selection system depends very much on the speech database, both the quality of the recorded…
Evaluation
How do we evaluate a speech synthesiser? Almost always, we will need to play samples of synthetic speech to listeners and obtain some response from them.
- Introduction
  It's probably obvious that we need to evaluate any speech synthesiser, but let's pause and ask why that is.
- Why evaluate?
  What are we trying to get out of our evaluation? Do we need to know how to improve the system,…
- What to evaluate?
  Depending on our goals, we may need to evaluate the whole end-to-end TTS system, or just some of its components.
- Which aspects?
  It's important to be very specific about which aspects of the system we are evaluating: do we want to measure…
- How to evaluate
  In general, we are going to need some listeners, but what exactly shall we have them do?
- Test design
  Careful design will make sure listeners do the task we want them to, and that there are no unwanted effects.
- Materials
  The choice of appropriate text materials needs to be guided by what we are trying to measure, and what kind…
Signal processing for speech synthesis
Before moving on to parametric speech synthesis, we need to learn more about signal processing. In particular, how can we represent speech as a set of parameters that are suitable for statistical modelling?
- F0 estimation
  A key parameter in any parametric representation of speech is the fundamental frequency, F0. Estimating it from speech is not…
- Vocoding
  In order to model speech, we need a parametric representation of it. This might be done using a source filter…
Statistical parametric speech synthesis
That's quite a mouthful, but we need to use a general term because this topic includes both Hidden Markov Models and Neural Networks for waveform generation.
- HMM-based synthesis
  Hidden Markov Models are generative models, although their most common application is classification (Automatic Speech Recognition). But, of course we…
- DNN-based synthesis
  In HMM-based speech synthesis, the hard work is done by a regression tree. Trees are rather naive models, so why…
Hybrid speech synthesis
There are various ways to combine the strengths of machine learning (to deal with data sparsity) and waveform concatenation (for highly natural-sounding speech), and these so-called hybrid methods can be very effective.
- Overview
  A first look at how we can combine generation from a statistical model with concatenation of waveforms.
