Advantages of Spontaneous Speech Database

This topic has 1 reply, 2 voices, and was last updated 10 years, 1 month ago by Simon King.

- January 22, 2016 at 16:50 #2206
  Chiara
  Student
  What are the pros of building a database from spontaneous speech? I can think of two, and yet neither seems advantageous enough to make spontaneous speech preferable to studio-recorded read speech:
  – Data are easier to collect, so the database can be quite large. (But it also contains a lot of disfluencies, co-articulations, noise, etc., which decrease the overall quality of the database.)
  – “Interesting” variations in prosody. (But without a solid way to model and label this prosodic variation, will it not just be a lot of extra, and even confusing, information?)
  Are there any other significant advantages to recording spontaneous speech over studio-recorded speech?
- January 24, 2016 at 17:27 #2323
  Simon King
  Professor
  Building synthetic voices from spontaneous speech is an area of active research.
  
  Although we might be able to gather a lot of spontaneous speech, one barrier is that we then have to manually transcribe it. The second barrier is that it is hard to align the phonetic sequence with the speech; this is for many of the same reasons that Automatic Speech Recognition of such speech is hard (you list some of them: disfluencies, co-articulations, deletions, …).
  
  The hypothesised advantage of using spontaneous speech, over read text, is that the voice would sound more natural.
  
  You put your finger on the core theoretical problem though: without a good model of the variation in spontaneous speech (including a predictive model of that variation given only text input), it is indeed just unwanted noise in the database.