Viterbi Optimisation
- This topic has 1 reply, 2 voices, and was last updated 2 years ago by Korin Richmond.
April 9, 2022 at 18:43 #15872
I decided to try a kind of parameter tuning on the Viterbi search as part of my paper. I created a bunch of scripts from a template, each of which sets the candidate beam width and the search beam width to particular values and then synthesizes about 400 utterances. I ran these scripts in a loop and recorded how long each one took to run (using the bash built-in time). I also saved the ‘Unit’ relations for analysis.
My results, however, are completely different from what I expected, and I’m not sure what to make of them. The join costs do behave as I would expect (a narrow beam throws away candidates that ‘get good’ later on, so costs are high), but the reported time seems to have little relation to the beam widths, even though I had assumed that a smaller beam means a faster search.
For example:
Candidate beam 0.2 + Search beam 0.1: user time 9.920
Candidate beam 0.2 + Search beam 0.5: user time 9.343
I do not know why this is the case or how to explain it.
-
April 12, 2022 at 10:20 #15883
Candidate beam width dictates how many candidate units will be considered for each target unit (candidate units with a target cost outside the beam width will be dropped), while the other beam width dictates how many Viterbi paths are kept alive at each point (paths with a total score outwith the beam width of the best one are dropped).
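To make the two pruning steps concrete, here is a minimal sketch in Python. It is only an illustration: the function names are hypothetical, and it assumes that each beam width is an additive margin on top of the best (lowest) cost, which may not be exactly how the synthesizer applies its beam settings.

```python
def prune_candidates(target_costs, beam):
    """Return indices of candidates whose target cost is within `beam` of the best.

    ASSUMPTION: the beam is an additive margin above the lowest target cost.
    """
    best = min(target_costs)
    return [i for i, cost in enumerate(target_costs) if cost <= best + beam]


def prune_paths(path_scores, beam):
    """Keep only the paths whose total score is within `beam` of the best path.

    `path_scores` maps the unit that ends a path to that path's total score.
    ASSUMPTION: same additive-margin rule as for the candidates.
    """
    best = min(path_scores.values())
    return {unit: score for unit, score in path_scores.items() if score <= best + beam}


# Hypothetical costs, chosen only to show the effect of the two beams.
print(prune_candidates([1.0, 1.1, 1.5, 3.0], beam=0.2))       # [0, 1]
print(prune_paths({"a": 2.0, "b": 2.05, "c": 4.0}, beam=0.1))  # {'a': 2.0, 'b': 2.05}
```

The candidate beam acts before any join costs are computed, whereas the path beam can only discard paths that end in candidates which survived the first step.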
One explanation for the smaller than expected time difference between the two pruning conditions you give could be that aggressive candidate pruning means there are fewer opportunities for path pruning? To properly understand what’s happening here, we’d need to know how many paths are considered at each point under the two conditions.
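One way to get a feel for this is to count the work done at each point rather than only timing the whole run. The following is a toy sketch, not the real search: the costs are random numbers, the function `toy_search` and its parameters are made up for illustration, and both beams are assumed to be additive margins above the best score.

```python
import random


def toy_search(n_targets, n_candidates, candidate_beam, search_beam, seed=0):
    """Run a toy beam-pruned Viterbi search and return per-target counts.

    Returns a list of (candidates kept, join costs computed, paths kept).
    ASSUMPTION: all costs are random values in [0, 1); none come from a real voice.
    """
    rng = random.Random(seed)
    alive = None  # candidate index -> best path score, for the previous target
    stats = []
    for _ in range(n_targets):
        # Draw every cost up front so the random stream is identical for all beam settings.
        target_costs = [rng.random() for _ in range(n_candidates)]
        join_costs = [[rng.random() for _ in range(n_candidates)]
                      for _ in range(n_candidates)]

        # Candidate pruning: drop units whose target cost is outside the beam.
        best_tc = min(target_costs)
        kept = [i for i, c in enumerate(target_costs) if c <= best_tc + candidate_beam]

        joins = 0
        scores = {}
        for i in kept:
            if alive is None:  # first target: there is nothing to join to
                scores[i] = target_costs[i]
            else:
                joins += len(alive)  # one join cost per surviving predecessor
                scores[i] = target_costs[i] + min(
                    s + join_costs[p][i] for p, s in alive.items())

        # Path pruning: drop paths whose score is outside the beam of the best one.
        best = min(scores.values())
        alive = {i: s for i, s in scores.items() if s <= best + search_beam}
        stats.append((len(kept), joins, len(alive)))
    return stats


for search_beam in (0.1, 0.5):
    stats = toy_search(n_targets=50, n_candidates=30,
                       candidate_beam=0.2, search_beam=search_beam)
    total_joins = sum(j for _, j, _ in stats)
    mean_alive = sum(a for _, _, a in stats) / len(stats)
    print(f"search beam {search_beam}: {total_joins} join costs computed, "
          f"{mean_alive:.1f} paths alive on average")
```

In this sketch the number of paths that can exist at a target is capped by the number of candidates that survive candidate pruning, so the search beam can only change the amount of work within that cap. Logging the same three counts from the real search for the two conditions would show whether that is what is happening.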