Reading

Zen et al: Statistical parametric speech synthesis using deep neural networks

The first paper that re-introduced the use of (Deep) Neural Networks in speech synthesis.

Wu et al: Merlin: An Open Source Neural Network Speech Synthesis System

Merlin is a toolkit for building Deep Neural Network models for statistical parametric speech synthesis. It is a typical frame-by-frame approach, pre-dating sequence-to-sequence models.

Wu et al: Deep neural networks employing Multi-Task Learning…

Some straightforward but effective techniques for improving the performance of speech synthesis with simple feedforward networks.

Watts et al: From HMMs to DNNs: where do the improvements come from?

Measures the relative contributions of the key differences between HMM-based and DNN-based systems: the regression model, state-level vs. frame-level predictions, and separate vs. combined stream predictions.

Nielsen: Neural Networks and Deep Learning

A great introduction. Relatively light on maths, and with some interactive explanations.

Ling et al: Deep Learning for Acoustic Modeling in Parametric Speech Generation

A key review article.

Gurney: An introduction to neural networks

Somewhat old, but may help to clarify the basic concepts if you find Nielsen’s “Neural Networks and Deep Learning” too difficult to start with.
