Forums › Speech Synthesis › F0 estimation and epoch detection › Accounting for low F0 bias in autocorrelation
Given that cross-correlation is less computationally efficient because more data must be retained in memory, is there some way of normalising the autocorrelation to account for the low F0 bias while keeping its computational simplicity, for example by dividing by the number of samples used (W-T)?
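A minimal sketch of the normalisation the question proposes (the function names and the toy window are illustrative, not from any particular toolkit):

```python
import numpy as np

def acf_raw(x, max_lag):
    # Short-time autocorrelation over a window of W samples:
    #   r(T) = sum_{t=0}^{W-T-1} x[t] * x[t+T]
    # The sum has W - T terms, so r(T) shrinks as the lag T grows,
    # which biases peak-picking towards smaller lags.
    W = len(x)
    return np.array([np.dot(x[:W - T], x[T:]) for T in range(max_lag)])

def acf_normalised(x, max_lag):
    # Divide each lag by its number of terms (W - T), as suggested
    # above, so that longer lags are not penalised merely for
    # summing fewer terms. The extra cost is one division per lag.
    W = len(x)
    return acf_raw(x, max_lag) / (W - np.arange(max_lag))
```

With this normalisation the peak for a periodic signal sits at the true period rather than being pulled towards shorter lags, at essentially no extra computational cost.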
Most algorithms use cross-correlation (also known as modified autocorrelation), even though it needs a little more computation. In speech synthesis, F0 estimation is typically a one-off process that happens during voice building, so we don’t care too much about a little extra computational cost if it gives a better result.
I think when you say “low F0 bias” you mean a bias towards picking peaks at smaller lags. That would be a bias towards picking higher values of F0. For example, we might accidentally pick a peak that corresponds to F1 in some cases.
The YIN pitch tracker (there is also an open access version of the paper) applies a normalisation (look for “Cumulative mean normalized difference function” in the paper) to its difference function, to avoid picking F1.
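For concreteness, here is a sketch of that cumulative mean normalized difference function. This covers only the normalisation step of YIN; the full algorithm also applies an absolute threshold and parabolic interpolation, which are omitted here, and the function names are mine:

```python
import numpy as np

def difference_function(x, max_lag):
    # YIN's difference function:
    #   d(tau) = sum_t (x[t] - x[t + tau])^2
    d = np.zeros(max_lag)
    for tau in range(1, max_lag):
        diff = x[:-tau] - x[tau:]
        d[tau] = np.dot(diff, diff)
    return d

def cmndf(d):
    # Cumulative mean normalized difference function:
    #   d'(0) = 1
    #   d'(tau) = d(tau) / ( (1/tau) * sum_{j=1}^{tau} d(j) )
    # Dividing by the running mean keeps small-lag values near 1,
    # so the dip at the true period stands out instead of a
    # spurious minimum at very small lags.
    out = np.ones_like(d)
    running_sum = np.cumsum(d[1:])
    taus = np.arange(1, len(d))
    out[1:] = d[1:] * taus / np.maximum(running_sum, 1e-12)
    return out
```

The minimum of the normalized function then falls at the lag corresponding to the true F0, without the bias towards small lags that the raw difference function (or raw autocorrelation) exhibits.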