decision tree in ASF

This topic has 1 reply, 2 voices, and was last updated 7 years, 11 months ago by Simon.

Author

Posts
- April 6, 2017 at 23:01 #7055
  Xiao Z
  Student
  Hi Simon,
  
  I know we can use distances in cepstral space as an approximation of the “perceptual space”. But could we instead use data from perceptual listening tests for a specific language?
  
  I ask because I think speakers of different languages are sensitive to different things. For example, in one Chinese dialect, speakers cannot tell /n/ from /l/. In that case we might group /n/ and /l/ together, even though the two may be far apart in cepstral space. Taylor mentions in his book that we don’t define an abstract perceptual space, but if we had enough data, could we do that?
- April 7, 2017 at 10:32 #7058
  Simon
  Professor
  Your proposal is to use perceptual data (i.e., from listening tests with human subjects) to define a target cost function. It’s a good idea, and has been tried, but it’s difficult to get enough perceptual data to automatically learn such a function.
  
  In the following paper, we describe a simple target cost function (in the form of a classifier) that is learned from perceptual data. It worked, but did not beat Festival’s standard IFF (Independent Feature Formulation) target cost function. Note that our novel target cost function still uses only linguistic features as input; it does not use the acoustic properties of the candidates.
  
  Volker Strom and Simon King. A classifier-based target cost for unit selection speech synthesis trained on perceptual data. In Proc. Interspeech, Makuhari, Japan, 2010.
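
To make the idea of a target cost "in the form of a classifier" concrete, here is a minimal sketch in Python. It is not the method from the paper above: the feature names, the hand-set weights, the listener judgements, and the choice of logistic regression are all assumptions invented for illustration. It contrasts a hand-weighted sum of linguistic feature mismatches (in the spirit of an IFF target cost) with a cost learned from listener labels, both using only linguistic features as input.

```python
import math

# Hypothetical linguistic features; the names are illustrative assumptions.
FEATURES = ["stress", "phrase_final", "left_context", "right_context"]


def mismatch_vector(target, candidate):
    """Return 1.0 for each linguistic feature where candidate differs from target."""
    return [0.0 if target[f] == candidate[f] else 1.0 for f in FEATURES]


def iff_style_cost(target, candidate, weights):
    """Hand-weighted sum of feature mismatches (the weights are made up)."""
    return sum(w * m for w, m in zip(weights, mismatch_vector(target, candidate)))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_classifier(examples, epochs=2000, lr=0.5):
    """Fit logistic regression by batch gradient descent.

    examples: list of (mismatch_vector, label), where label 1 means
    listeners judged the unit to sound bad in that context.
    """
    w = [0.0] * len(FEATURES)
    b = 0.0
    n = len(examples)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for x, y in examples:
            err = sigmoid(b + sum(wi * xi for wi, xi in zip(w, x))) - y
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def learned_target_cost(target, candidate, model):
    """Target cost = predicted probability that listeners would reject the unit."""
    w, b = model
    x = mismatch_vector(target, candidate)
    return sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))


# Invented toy judgements, NOT real listening-test data. In this toy set,
# listeners notice stress and phrase-final mismatches but not context ones.
TOY_JUDGEMENTS = [
    ([0, 0, 0, 0], 0),
    ([1, 0, 0, 0], 1),
    ([0, 1, 0, 0], 1),
    ([0, 0, 1, 0], 0),
    ([0, 0, 0, 1], 0),
    ([1, 1, 0, 0], 1),
    ([0, 0, 1, 1], 0),
    ([1, 0, 0, 1], 1),
]

model = train_classifier(TOY_JUDGEMENTS)

target = {"stress": True, "phrase_final": False,
          "left_context": "n", "right_context": "a"}
wrong_stress = dict(target, stress=False)
wrong_context = dict(target, right_context="i")

# The hand-weighted cost penalises both mismatches; the learned cost
# penalises only the mismatch that the toy listeners could hear.
hand_weights = [1.0, 1.0, 0.5, 0.5]
print(iff_style_cost(target, wrong_stress, hand_weights),
      iff_style_cost(target, wrong_context, hand_weights))
print(learned_target_cost(target, wrong_stress, model),
      learned_target_cost(target, wrong_context, model))
```

The same mechanism would cover the /n/ versus /l/ case from the question: if listeners of a given language never reject a substitution, a cost learned from their judgements assigns it little penalty, regardless of the cepstral distance. The practical obstacle is the one already mentioned: collecting enough judgements to train on.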