Explains the types of statistical test employed in the Blizzard Challenge. These are deliberately quite conservative. For example, MOS data is correctly treated as ordinal rather than interval data. Also includes a section on Multi-Dimensional Scaling (MDS), a technique that is not as widely used as the other types of analysis.
Robert A. J. Clark, Monika Podsiadło, Mark Fraser, Catherine Mayo, and Simon King. “Statistical analysis of the Blizzard Challenge 2007 listening test results.” In Proc. Blizzard 2007 (in Proc. Sixth ISCA Workshop on Speech Synthesis), Bonn, Germany, August 2007.