You can explore issues surrounding database design, recording, and labelling for yourself in the Build your own neural speech synthesiser exercise. You might even design your own text selection algorithm and record the script it generates, then compare the resulting synthesis to that obtained with the ARCTIC script.
Looking forward to the later part of the course, here are some examples of more recent databases that are used to train neural models:
- LJ Speech – a single-speaker corpus of about 24 hours, created from public-domain audiobook recordings
- Emilia – a pipeline that web-scrapes in-the-wild speech at scale (over 100k hours) and automatically curates it
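To make the idea of a curated corpus concrete: LJ Speech is distributed as a folder of WAV files plus a `metadata.csv` whose rows use `|` as the delimiter, with three fields per row (clip ID, raw transcription, normalised transcription). Below is a minimal sketch of parsing that format; the sample row is invented for illustration and is not a real line from the corpus.

```python
import csv
import io

# LJ Speech's metadata.csv is pipe-delimited with three fields per row:
# clip ID, raw transcription, normalised transcription (numbers expanded, etc.).
# This sample row is illustrative only, not copied from the corpus.
sample = "LJ001-0001|Chapter 1 begins here.|Chapter one begins here."

reader = csv.reader(io.StringIO(sample), delimiter="|", quoting=csv.QUOTE_NONE)
for clip_id, raw_text, norm_text in reader:
    # The audio for each row lives at wavs/<clip_id>.wav in the corpus.
    print(clip_id, "->", norm_text)
```

In practice you would open the real `metadata.csv` instead of the in-memory string, and pair each `clip_id` with its waveform when building training examples.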
Optionally, if you’d like to explore these issues further, here are some starting points:

