Hunt and Black

This topic has 3 replies, 4 voices, and was last updated 4 years, 4 months ago by Korin Richmond.

- January 30, 2017 at 21:00 #6636
  Dimitra L
  Student
  Is the system that Hunt and Black describe in the paper actually an implementation of IFF?
- January 31, 2017 at 19:44 #6638
  Simon
  Professor
  Hunt & Black use a combination of IFF and ASF target costs – see Section 2.2 of the paper.
- January 28, 2021 at 14:47 #13803
  Yuchen F
  Student
  Hi, I am still confused about this question.
  
  1. For the ASF part, I can’t find any evidence except the feature vector. But that is vague, since the features could be either linguistic or acoustic.
  
  2. In the IFF part, are all the features they describe linguistic features? I am wondering whether point of articulation is an acoustic feature. Could you give more specific definitions of acoustic features and linguistic features?
  
  3. When using the combination of IFF and ASF, should we assign weights to both acoustic features and linguistic features?
  
  Thanks!
- January 29, 2021 at 17:32 #13808
  Korin Richmond
  Professor
  1. For the ASF features, pitch, power and duration are mentioned (“…each target phoneme has a target pitch, power and duration”).
  
  2. Yes, place of articulation is indeed a linguistic (i.e. articulatory phonetic) feature: it’s a property of a *phone*, which is a linguistic *concept* rather than a physically measurable signal. In contrast, the f0, power and duration used as ASF features are things you can directly observe and measure in the acoustic signal (or derive from it).
  
  3. Yes, when combining different subcosts (e.g. ASF and/or IFF ones) we would typically want to weight them, so we can balance their influence in the overall cost.
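
To make the weighting in point 3 concrete, here is a minimal Python sketch of a target cost computed as a weighted sum of sub-costs, mixing ASF-style acoustic differences with IFF-style linguistic mismatches. The field names, the choice of sub-cost functions and the weight values are all assumptions made for illustration; they are not taken from the paper or from any particular system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Spec:
    """One target or candidate unit (field names are illustrative)."""
    f0: float         # Hz, acoustic (ASF)
    power: float      # dB, acoustic (ASF)
    duration: float   # ms, acoustic (ASF)
    stressed: bool    # linguistic label (IFF)
    next_phone: str   # linguistic context (IFF)


def subcosts(target: Spec, cand: Spec) -> dict:
    """Return one non-negative sub-cost per feature."""
    return {
        # ASF: absolute difference between continuous acoustic values
        "f0": abs(target.f0 - cand.f0),
        "power": abs(target.power - cand.power),
        "duration": abs(target.duration - cand.duration),
        # IFF: 0 if the symbolic labels match, 1 otherwise
        "stressed": float(target.stressed != cand.stressed),
        "next_phone": float(target.next_phone != cand.next_phone),
    }


def target_cost(target: Spec, cand: Spec, weights: dict) -> float:
    """Weighted sum of the sub-costs."""
    costs = subcosts(target, cand)
    return sum(weights[name] * costs[name] for name in costs)


# Hypothetical weights: kept small for Hz/dB/ms-scale differences so that
# they do not swamp the 0/1 linguistic mismatches.
WEIGHTS = {
    "f0": 0.01,
    "power": 0.05,
    "duration": 0.005,
    "stressed": 1.0,
    "next_phone": 0.5,
}

target = Spec(f0=120.0, power=60.0, duration=80.0, stressed=True, next_phone="t")
cand = Spec(f0=110.0, power=62.0, duration=100.0, stressed=False, next_phone="t")
cost = target_cost(target, cand, WEIGHTS)
```

With these made-up numbers, each acoustic sub-cost contributes about 0.1 and the stress mismatch contributes 1.0, so the stress mismatch dominates the total. Changing the weights shifts that balance, which is why they are usually tuned rather than fixed by hand.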