Taylor – Text-to-speech synthesis

Definitive, authoritative and comprehensive. The best available book on text-to-speech synthesis.

Paul Taylor “Text-to-speech synthesis”, 2009, Cambridge University Press, Cambridge, ISBN 0521899273

Taylor - Chapter 3 - The text-to-speech problem
Discusses the differences between spoken and written forms of language, and describes the structure of a typical TTS system.
Taylor - Chapter 4 - Text Processing
Complementary to Jurafsky & Martin, Section 8.1.
Taylor - Chapter 5 - Text decoding
Complementary to Jurafsky & Martin, Section 8.1.
Taylor - Chapter 6 - Prosody prediction from text
Predicting phrasing, prominence, intonation and tune, from text input.
Taylor - Chapter 8 - Pronunciation
Including how the lexicon is stored, letter-to-sound, and compressing the lexicon.
Taylor - Chapter 10 - Signals and filters
Focus on the concepts and diagrams, and don't worry about understanding the maths too much.
- Taylor - Section 10.1 - Analogue signals
  It's easier to start by understanding physical signals - which are analogue - before we then approximate them digitally.
- Taylor - Section 10.2 - Digital signals
  Going digital involves approximations in the way an original analogue signal is represented.
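As a rough illustration of the approximations mentioned for Section 10.2, the sketch below samples a sine wave at discrete times and then rounds each sample to a fixed number of bits. It is not taken from the book: the function name `sample_and_quantise`, the 100 Hz tone, the 8 kHz sampling rate and the bit depths are all assumptions chosen for illustration.

```python
import math

def sample_and_quantise(freq_hz, sample_rate_hz, duration_s, n_bits):
    """Sample a unit-amplitude sine wave, then round each sample to n_bits.

    Hypothetical helper for illustration only.
    """
    scale = 2 ** (n_bits - 1) - 1          # largest positive integer code
    n_samples = int(round(sample_rate_hz * duration_s))
    exact, quantised = [], []
    for n in range(n_samples):
        t = n / sample_rate_hz             # sampling: time becomes discrete
        x = math.sin(2 * math.pi * freq_hz * t)
        q = round(x * scale) / scale       # quantisation: amplitude becomes discrete
        exact.append(x)
        quantised.append(q)
    return exact, quantised

# Same signal at a coarse (4-bit) and a fine (16-bit) amplitude resolution.
exact, coarse = sample_and_quantise(100.0, 8000.0, 0.01, n_bits=4)
_, fine = sample_and_quantise(100.0, 8000.0, 0.01, n_bits=16)

max_err_coarse = max(abs(a - b) for a, b in zip(exact, coarse))
max_err_fine = max(abs(a - b) for a, b in zip(exact, fine))
print(f"max error, 4 bits:  {max_err_coarse:.6f}")
print(f"max error, 16 bits: {max_err_fine:.6f}")
```

The rounding error is bounded by half a quantisation step, so using more bits shrinks it; the sampling rate separately limits which frequencies can be represented at all.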
Taylor - Chapter 12 - Analysis of speech signals
Includes spectral envelope extraction (cepstrum or LPC), source representation (the residual), pitch tracking and pitch marking.
- Taylor - Section 12.3 - The cepstrum
  By using the logarithm to convert a multiplication into a sum, the cepstrum separates the source and filter components of…
- Taylor - Section 12.4 - Linear-Prediction Analysis
  An overview of the background and maths behind linear-prediction methods for modelling the vocal tract as a filter.
- Taylor - Section 12.7 - Pitch and epoch detection
  Only an outline of the main approaches, with little technical detail. Useful as a summary of why these tasks are…
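The idea noted for Section 12.3, that a logarithm turns a multiplication into a sum, can be checked numerically with a toy example. This is a minimal sketch and not the book's own presentation: the signals `source` and `filt` are invented, the convolution is circular so that the spectra multiply exactly, and the helper names are hypothetical. Only the standard library is used, so the DFT is the slow direct form.

```python
import cmath
import math

def dft(x):
    """Direct (slow) discrete Fourier transform of a real sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def real_cepstrum(x):
    """Inverse DFT of the log magnitude spectrum of x."""
    n = len(x)
    log_mag = [math.log(abs(v)) for v in dft(x)]
    return [sum(log_mag[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]

def circular_convolve(a, b):
    """Circular convolution, so that DFT(a * b) = DFT(a) . DFT(b) exactly."""
    n = len(a)
    return [sum(a[m] * b[(t - m) % n] for m in range(n)) for t in range(n)]

N = 16
# Toy "source": two pulses, chosen so that its spectrum has no zeros.
source = [0.0] * N
source[0], source[8] = 1.0, 0.5
# Toy "filter": a decaying exponential impulse response.
filt = [0.6 ** t for t in range(N)]
# Toy "speech": the source passed through the filter.
speech = circular_convolve(source, filt)

c_source = real_cepstrum(source)
c_filter = real_cepstrum(filt)
c_speech = real_cepstrum(speech)

# The spectra multiply, the log spectra add, and so the cepstra add.
max_diff = max(abs(c_speech[t] - (c_source[t] + c_filter[t])) for t in range(N))
print(f"largest difference between c_speech and c_source + c_filter: {max_diff:.2e}")
```

Because the two contributions are added rather than multiplied in the cepstral domain, they can in principle be pulled apart by keeping some coefficients and discarding others, which is what makes the representation useful for spectral envelope extraction.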
Taylor - Chapter 15 - Hidden-Markov-model synthesis
Written from the traditional "starting from automatic speech recognition" viewpoint, so you will need to make the connections for yourself to the more general concept of text-to-speech as a regression problem.
Taylor - Chapter 16 - Unit-selection synthesis
A substantial chapter covering target cost, join cost and search.
Taylor - Chapter 17 - Further issues
Databases, evaluation, audio-visual synthesis, expressive speech.
- Taylor - Section 17.1 - Databases
  Including the important issue of labelling the data.
- Taylor - Section 17.2 - Evaluation
  Testing of the system by the developers, as well as via listening tests.
