Deltas
- This topic has 3 replies, 2 voices, and was last updated 8 years, 5 months ago by Simon.
March 8, 2016 at 20:54 #2714
Is the delta of F0 the rate of change within a frame or across frames? (I thought it’s a way to compensate for the decorrelation between feature vectors, so it should be across frames. Right?)
-
March 8, 2016 at 21:00 #2715
Deltas (of any parameter, including F0) are always computed using more than one frame. There is no way to compute them from a single frame, because there is only a single value (of F0, say) to work from.
Minimally, we need the current frame and one adjacent frame (previous or next) to compute the delta – in this case, it would simply be the difference between the two frames (the value in one frame minus the value in the other frame). It is actually more common to compute the deltas across several frames, centred on the current frame.
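To make the two options concrete, here is a minimal sketch in plain Python. The function names, the choice of 0.0 for the first frame's simple delta, and the repetition of edge frames in the windowed version are assumptions made for illustration. The windowed version uses the regression-style formula found in toolkits such as HTK, but treat it as a sketch rather than any particular toolkit's implementation.

```python
def simple_delta(static):
    """Delta as the value in the current frame minus the value in the previous frame."""
    # The first frame has no predecessor, so its delta is set to 0.0 (an assumption).
    return [0.0] + [static[t] - static[t - 1] for t in range(1, len(static))]


def regression_delta(static, width=2):
    """Delta computed across several frames, centred on the current frame."""
    n = len(static)
    denom = 2 * sum(k * k for k in range(1, width + 1))
    deltas = []
    for t in range(n):
        num = 0.0
        for k in range(1, width + 1):
            nxt = static[min(t + k, n - 1)]  # repeat the last frame at the right edge
            prv = static[max(t - k, 0)]      # repeat the first frame at the left edge
            num += k * (nxt - prv)
        deltas.append(num / denom)
    return deltas


# Hypothetical F0 values (Hz), one per frame, rising by 2 Hz per frame.
f0 = [100.0, 102.0, 104.0, 106.0, 108.0]
print(simple_delta(f0))
print(regression_delta(f0))
```

Either way, every delta value depends on at least two frames of the static parameter, which is the point made above.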
Adding deltas is a way to compensate for the frame-wise independence assumption that is made by the HMM. In synthesis, however, we also need them as constraints on trajectory generation.
-
March 8, 2016 at 21:44 #2716
Now I see!
So the reason the deltas are produced from a Gaussian distribution is that we take into account all the frames produced by the state and produce the mean delta. Right? (I assume that we will also consider examples from all the clustered states under this leaf node as well. Is that correct?)
Thank you!
-
March 9, 2016 at 12:45 #2717
Not quite right, no.
Let’s separate out the three stages:
1. preparing the data
deltas are computed from the so-called ‘static’ parameters, as explained above (e.g., simple difference between consecutive frames) – this is a simple deterministic process
2. training the model
the ‘static’ and delta parameters are now components of the same observation vector of the HMMs, which is modelled with multivariate Gaussians; the fact that one part of the observation vector contains the deltas of another part is not taken into consideration (*)
3. generation
MLPG (maximum likelihood parameter generation) finds the most likely trajectory, given the statics and deltas – think of the deltas as constraints on how fast the trajectory moves from the static of one state to the static of the next state
(*) there are more advanced training algorithms that respect the relationship between statics and deltas – we don’t really need to know about that here
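To illustrate stage 3, here is a much-simplified sketch of the idea behind MLPG, in plain Python. It assumes a single one-dimensional parameter, a state sequence and durations that have already been chosen (so there is one mean and variance per frame), diagonal variances, and deltas defined as the simple difference between consecutive frames. The function names are hypothetical. Under those assumptions, the most likely static trajectory c solves the weighted least-squares system (WᵀPW)c = WᵀPμ, where W maps statics to [statics; deltas], P holds the precisions (1/variance) and μ holds the model means.

```python
def solve(A, b):
    """Gaussian elimination; no pivoting is needed here because A is symmetric positive definite."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def mlpg_1d(static_mean, static_var, delta_mean, delta_var):
    """Most likely static trajectory given per-frame static and delta Gaussians."""
    T = len(static_mean)
    rows = []  # each entry: (row of W, target mean, precision)
    for t in range(T):  # static rows: c[t] should be near static_mean[t]
        row = [0.0] * T
        row[t] = 1.0
        rows.append((row, static_mean[t], 1.0 / static_var[t]))
    for t in range(1, T):  # delta rows: c[t] - c[t-1] should be near delta_mean[t]
        row = [0.0] * T
        row[t], row[t - 1] = 1.0, -1.0
        rows.append((row, delta_mean[t], 1.0 / delta_var[t]))
    # Accumulate the normal equations A c = b, with A = W^T P W and b = W^T P mu.
    A = [[0.0] * T for _ in range(T)]
    b = [0.0] * T
    for row, mean, prec in rows:
        for i in range(T):
            if row[i] == 0.0:
                continue
            b[i] += row[i] * prec * mean
            for j in range(T):
                A[i][j] += row[i] * prec * row[j]
    return solve(A, b)


# Two hypothetical states of three frames each: static means 100 then 120, delta means 0.
static_mean = [100.0] * 3 + [120.0] * 3
trajectory = mlpg_1d(static_mean, [1.0] * 6, [0.0] * 6, [1.0] * 6)
print([round(c, 2) for c in trajectory])
```

Without the delta rows, the solution would just be the static means, with an abrupt jump at the state boundary. With them, the trajectory moves gradually from one state's static mean towards the next, and a smaller delta variance makes that constraint stronger.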