Watts et al. Where do the improvements come from in sequence-to-sequence neural TTS?

A systematic investigation of the benefits of moving from frame-by-frame models to sequence-to-sequence models.

Oliver Watts, Gustav Eje Henter, Jason Fong, Cassia Valentini-Botinhao. “Where do the improvements come from in sequence-to-sequence neural TTS?” In Proc. 10th ISCA Speech Synthesis Workshop, 2019, pp. 217-222. DOI:10.21437/SSW.2019-39