speech synthesis module 05

King: Measuring a decade of progress in Text-to-Speech

A distillation of the key findings of the first 10 years of the Blizzard Challenge.

Clark et al: Statistical analysis of the Blizzard Challenge 2007 listening test results

Explains the types of statistical tests that are employed in the Blizzard Challenge. These are deliberately quite conservative. For example, MOS data is correctly treated as ordinal. Also includes a section on Multi-Dimensional Scaling (MDS), a type of analysis that is less widely used than the others.
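
To illustrate what treating MOS data as ordinal can mean in practice, here is a minimal sketch in Python. It summarises ratings with the median rather than the mean, and compares two systems with an exact two-sided sign test on paired ratings. The ratings, the variable names and the function `sign_test_p` are all hypothetical, and the sign test is used only because it is simple to write with the standard library; it is not claimed to be the test used in the paper.

```python
from math import comb
from statistics import median_low


def sign_test_p(a, b):
    """Exact two-sided sign test on paired ordinal ratings (illustrative only)."""
    wins_a = sum(x > y for x, y in zip(a, b))
    wins_b = sum(x < y for x, y in zip(a, b))
    n = wins_a + wins_b  # tied pairs carry no direction, so they are dropped
    if n == 0:
        return 1.0
    k = min(wins_a, wins_b)
    # P(X <= k) under Binomial(n, 0.5), doubled for a two-sided test
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical 1-5 ratings from the same eight listeners for two systems
system_a = [4, 4, 5, 3, 4, 5, 4, 4]
system_b = [3, 3, 4, 3, 2, 4, 3, 3]

# median_low always returns a value that is actually on the rating scale
print(median_low(system_a), median_low(system_b))
print(sign_test_p(system_a, system_b))
```

Only the direction of each paired difference is used, never its size, so the sketch makes no assumption that the gap between ratings of 2 and 3 equals the gap between 4 and 5.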

Norrenbrock et al: Quality prediction of synthesised speech…

Although standard speech quality measures such as PESQ do not work well for synthetic speech, specially constructed methods do work to some extent.

Mayo et al: Multidimensional scaling of listener responses to synthetic speech

Multi-dimensional scaling is a way to uncover the different perceptual dimensions that listeners use when rating synthetic speech.

Benoît et al: The SUS test

A method for evaluating the intelligibility of synthetic speech using Semantically Unpredictable Sentences (SUS), which avoids the ceiling effect.

Bennett: Large Scale Evaluation of Corpus-based Synthesisers

An analysis of the first Blizzard Challenge, which is an evaluation of speech synthesisers using a common database.

Taylor – Section 17.2 – Evaluation

Covers testing of the system by its developers, as well as evaluation via listening tests.

