Decision-tree target cost function
We can train a decision tree on the unit database, which partitions the database according to the features used in the linguistic specification.
When it comes to selecting a unit for a given linguistic specification, we traverse the tree according to the features in the linguistic specification.
The cluster we arrive at can either be treated as a set of candidate units that all minimise the target cost equally, or we can select the unit that deviates least from the cluster mean. We could even fit a Gaussian, or a series of Gaussians in an HMM, to the cluster and select the unit with the highest probability under that model.
Is my understanding here correct?
Yes – spot on.
The decision tree is in fact performing a regression or classification task. Given the linguistic features, it is predicting which units in the database would be suitable to use for synthesising the current target position.
If we think of the tree as providing one or more candidate units at each leaf, it is performing classification.
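To make the classification view concrete, here is a minimal sketch in Python. The tree is built by hand rather than trained, and the feature names (`stressed`, `next_phone_class`) and unit ids are made up purely for illustration; a real system would learn the questions from the unit database.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    # Internal nodes ask a yes/no question about one linguistic feature.
    feature: Optional[str] = None
    value: Optional[str] = None
    yes: Optional["Node"] = None
    no: Optional["Node"] = None
    # Leaves hold the ids of the database units clustered there.
    units: list = field(default_factory=list)


def find_candidates(node, spec):
    """Walk from the root to a leaf, answering each question from the spec."""
    while node.feature is not None:
        node = node.yes if spec.get(node.feature) == node.value else node.no
    return node.units


# Hypothetical hand-built tree for one phone type.
tree = Node(
    feature="stressed", value="yes",
    yes=Node(
        feature="next_phone_class", value="nasal",
        yes=Node(units=["u07", "u12"]),
        no=Node(units=["u01", "u03", "u09"]),
    ),
    no=Node(units=["u02", "u05"]),
)

# Linguistic specification of the current target position.
target = {"stressed": "yes", "next_phone_class": "vowel"}
candidates = find_candidates(tree, target)
print(candidates)
```

Every unit at the leaf we reach is treated as equally good with respect to the target cost, so the choice between them is left to whatever comes next in the search (typically the join cost).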
We can also think of it as a regression tree that is predicting an acoustic specification, represented either as a set of exemplar units or (as you say) a probability density. The latter is how HMM-based speech synthesis works.
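The regression view can be sketched in the same spirit. The code below assumes each unit at a leaf has a small acoustic feature vector (here an invented F0 value in Hz and a duration in seconds), summarises the leaf with a single diagonal-covariance Gaussian, and then picks a unit either by distance to the mean or by likelihood. The single Gaussian is only a stand-in for the state distributions of an HMM, and all names and numbers are illustrative assumptions.

```python
import math


def mean_and_variance(vectors, floor=1e-6):
    """Per-dimension mean and variance (diagonal Gaussian) of a leaf's units."""
    n = len(vectors)
    dims = len(vectors[0])
    mean = [sum(v[d] for v in vectors) / n for d in range(dims)]
    # Floor the variance so the log-likelihood stays finite.
    var = [
        max(sum((v[d] - mean[d]) ** 2 for v in vectors) / n, floor)
        for d in range(dims)
    ]
    return mean, var


def log_likelihood(x, mean, var):
    """Log density of x under a Gaussian with diagonal covariance."""
    return sum(
        -0.5 * (math.log(2 * math.pi * s) + (xi - m) ** 2 / s)
        for xi, m, s in zip(x, mean, var)
    )


# Hypothetical acoustic vectors [F0 in Hz, duration in s] for one leaf.
leaf = {"u01": [120.0, 0.08], "u03": [135.0, 0.10], "u09": [180.0, 0.09]}

mean, var = mean_and_variance(list(leaf.values()))

# Option 1: the unit closest to the cluster mean (plain Euclidean distance).
nearest = min(leaf, key=lambda u: math.dist(leaf[u], mean))

# Option 2: the unit with the highest probability under the leaf's Gaussian.
most_likely = max(leaf, key=lambda u: log_likelihood(leaf[u], mean, var))

print(nearest, most_likely)
```

With a single Gaussian, the two criteria differ only in that the likelihood scales each dimension by its variance, so they can disagree when the features have very different ranges. On this toy data they happen to pick the same unit.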