The cost functions: when in the pipeline?
- This topic has 3 replies, 2 voices, and was last updated 8 years, 4 months ago by Simon.
-
March 26, 2016 at 17:28 #2874
I have been reviewing your lectures, the forum, Taylor, and Hunt and Black’s original paper, and I am confused about one thing. I think we can say that in Festival, ‘Observation Pruning’ is the same idea as what Taylor calls ‘pre-selection’, whereby we reduce the number of candidates for each unit BEFORE we run a Viterbi search. OK, that’s reasonable. BUT, as Taylor says, ‘The target function will return a score or cost for all the units of the base type, and thereby create a ranked list’. This means that the ‘target function’ is being run over all the candidates, calculating their target costs, BEFORE the Viterbi search. I was under the impression, from your lectures and from Hunt and Black’s algorithm, that all the target and join costs were calculated DURING the search itself. But in order to do Observation Pruning (pre-selection), the target cost must be computed BEFORE the Viterbi search.
I’m trying to understand the whole process as a pipeline of sequential sub-processes. Can you help me understand this sequence of events better? Is it different for target and join costs? What exactly is calculated BEFORE the Viterbi search, and what is calculated DURING it? And what do we call the ‘before’ part?
-
March 26, 2016 at 19:22 #2875
We can compute the target cost of all candidates before the search commences. As you say, this would be necessary if we want to do some pruning based only on the target cost.
We could also pre-compute all the join costs, after the candidate lists are pruned but before search commences. That might be wasteful: if we use pruning during the search, some joins may never be considered. So computing join costs during the search is more sensible.
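To illustrate why on-demand join costs save work, here is a minimal sketch of a beam-pruned Viterbi search that only calls the join cost function for paths that survive pruning. It is not Festival's implementation: the function name, the `beam_width` parameter, the string unit identifiers and the dictionary of pre-computed target costs are all assumptions made for this example.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def viterbi_lazy_joins(
    candidates: Sequence[Sequence[str]],
    target_cost: Dict[Tuple[int, str], float],
    join_cost: Callable[[str, str], float],
    beam_width: int,
) -> Tuple[List[str], float, int]:
    """Return (best path, its total cost, number of join costs evaluated).

    candidates[i] is the (already pruned) candidate list for target position i.
    target_cost[(i, u)] was computed before the search; join costs are
    computed only when the search actually asks for them.
    """
    joins_evaluated = 0
    # best[u] = (cost of the cheapest partial path ending in u, that path)
    best = {u: (target_cost[(0, u)], [u]) for u in candidates[0]}
    for i in range(1, len(candidates)):
        # Beam pruning: only the cheapest partial paths are extended,
        # so joins from the discarded paths are never computed.
        survivors = sorted(best.items(), key=lambda kv: kv[1][0])[:beam_width]
        new_best = {}
        for u in candidates[i]:
            options = []
            for prev, (cost, path) in survivors:
                joins_evaluated += 1
                options.append((cost + join_cost(prev, u), path))
            cost, path = min(options, key=lambda o: o[0])
            new_best[u] = (cost + target_cost[(i, u)], path + [u])
        best = new_best
    final_cost, final_path = min(best.values(), key=lambda v: v[0])
    return final_path, final_cost, joins_evaluated
```

With three candidates at each of three positions, a beam wide enough to keep everything evaluates 18 joins, whereas a beam of width 1 evaluates only 6. Pre-computing all join costs would always pay for the full 18.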
My names for the processes that happen before search commences:
pre-selection: the process for retrieving an initial list of candidates (per target position) from the inventory; in Festival, this means retrieving all units that match in diphone type
pruning: reducing the number of candidates (per target position), perhaps on the basis of their target costs
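The two steps named above, with target cost computation in between, could be sketched as follows. This is only an illustration under stated assumptions: units are plain dictionaries with a hypothetical `diphone` key, `target_cost` is whatever function the caller supplies, and pruning keeps a fixed number of best candidates (`max_candidates`), which is one possible pruning rule and not necessarily the one Festival uses.

```python
from typing import Callable, Dict, List, Tuple

Unit = Dict[str, object]  # hypothetical unit record, e.g. {"id": 7, "diphone": "a-b"}


def pre_select(inventory: List[Unit], diphone: str) -> List[Unit]:
    """Pre-selection: retrieve every unit whose diphone type matches the target."""
    return [u for u in inventory if u["diphone"] == diphone]


def prune(scored: List[Tuple[float, Unit]], max_candidates: int) -> List[Tuple[float, Unit]]:
    """Pruning: keep only the candidates with the lowest target cost."""
    return sorted(scored, key=lambda pair: pair[0])[:max_candidates]


def build_candidate_lists(
    targets: List[Unit],
    inventory: List[Unit],
    target_cost: Callable[[Unit, Unit], float],
    max_candidates: int,
) -> List[List[Tuple[float, Unit]]]:
    """Run pre-selection, target cost computation and pruning for each target position."""
    lists = []
    for target in targets:
        candidates = pre_select(inventory, str(target["diphone"]))
        # Target costs are attached to every candidate before any search happens.
        scored = [(target_cost(target, u), u) for u in candidates]
        lists.append(prune(scored, max_candidates))
    return lists
```

The output is one short, cost-ranked candidate list per target position, which is what the search then operates on.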
-
March 26, 2016 at 20:42 #2876
Thanks. I’m still missing one piece of the puzzle. Is there not then a ‘process’ that we might call ‘running the target cost function’ over the initial list of candidates (per target position) that returns target costs and attaches them to each unit in the list? Or do you not think of that as a distinct process?
Lastly, can you say what pre-search pruning method Festival/Multisyn is using with its ‘ob_pruning’ function? Is it based on target cost?
-
March 27, 2016 at 11:58 #2877
Yes, we can say that one step in the synthesis process is to compute the target cost for every candidate at every target position.
After that, Festival performs “observation pruning”, which is pruning of the candidate lists based on their target costs.
It is desirable to make the candidate lists shorter before the search commences, because this dramatically reduces the number of join costs that need to be computed. That number is proportional to the square of the average number of candidates per target position, so halving the average number of candidates cuts the number of join costs to be computed by 75%.
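As a quick check of that arithmetic, the sketch below counts the joins between adjacent target positions when every possible join is evaluated. The function name and the example figures (10 target positions, 100 versus 50 candidates each) are made up for illustration.

```python
from typing import Sequence


def join_cost_count(candidates_per_position: Sequence[int]) -> int:
    """Number of join costs if every pair of adjacent candidates is evaluated."""
    return sum(
        a * b
        for a, b in zip(candidates_per_position, candidates_per_position[1:])
    )


full = join_cost_count([100] * 10)   # 9 joins between positions, 100 * 100 each: 90000
halved = join_cost_count([50] * 10)  # 9 * 50 * 50: 22500
ratio = halved / full                # 0.25, i.e. a 75% reduction
```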
[Aside: the reason that the term ‘observation’ is used is as follows: if we conceive of the search as equivalent to an HMM, then the target cost takes the place of the observation probability, and the join cost takes the place of the transition probability.]
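To make the analogy in the aside concrete, the two quantities being optimised can be written side by side. The notation here is an assumption for illustration: $T$ is the target cost, $J$ the join cost, $b$ and $a$ the HMM observation and transition probabilities, and any weights on the two costs are omitted.

```latex
% Unit selection: total cost of a candidate sequence u_1..u_n for targets t_1..t_n
C(t_{1:n}, u_{1:n}) = \sum_{i=1}^{n} T(t_i, u_i) + \sum_{i=2}^{n} J(u_{i-1}, u_i)

% HMM: negative log probability of a state sequence q_1..q_n and observations o_1..o_n
-\log P(o_{1:n}, q_{1:n}) = -\log \pi_{q_1}
    + \sum_{i=1}^{n} \bigl(-\log b_{q_i}(o_i)\bigr)
    + \sum_{i=2}^{n} \bigl(-\log a_{q_{i-1} q_i}\bigr)
```

Term by term, $T(t_i, u_i)$ plays the role of $-\log b_{q_i}(o_i)$ and $J(u_{i-1}, u_i)$ plays the role of $-\log a_{q_{i-1} q_i}$; the initial-state term has no counterpart if all starting candidates are treated as equally likely. Minimising the first expression and maximising the HMM path probability are then both Viterbi searches.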