Dudley's pitch detector
- This topic has 1 reply, 2 voices, and was last updated 9 years ago by
Simon.
February 20, 2016 at 15:29 #2618
There is an early pitch detecting algorithm mentioned in the Gold, Morgan and Ellis book which I do not understand. In chapter 31.3 they say: “A simple way to extract the fundamental period is shown in Fig. 31.3. First, the positive peaks of the signal are found; this is followed by the detection algorithm shown in the figure.”
Trying to interpret the illustration, I concluded that detection can only occur during the exponentially decaying part of the threshold, but I do not know what determines the blanking time. It is also not clear to me why we can be certain that the first peak to cross the threshold will correspond to the F0 component.
February 21, 2016 at 12:20 #2628
This is a really ‘old school’ type of signal processing, from the days when the implementation would be in analogue hardware and would have to be causal (i.e., cannot look ahead at the rest of the signal) and real-time, by definition.
You are correct that detection can only happen in the exponentially decaying part. The blanking time is there to prevent any detections in the short period of time after the previous detection. It sets a lower bound on the fundamental period that can be detected (i.e., it determines the maximum F0 that can be detected, which is the reciprocal of the blanking time). The blanking time is a parameter of the method and has to be set by the designer.
We cannot be certain that the first peak to cross the threshold will correspond to the F0 component. So, we do not expect this method to be very robust. I’m sure we could carefully tune the blanking time and the slope of the exponential decay to make it work in some cases, but it would probably be hard to find values for those two parameters that work for a wide variety of voices.
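To make the mechanism concrete, here is a minimal Python sketch of this kind of detector. It is only my reading of the description above, not the exact procedure in Fig. 31.3 of the book: the function names, the default parameter values, the rule that the very first peak is always accepted, and the choice to start the decay from the amplitude of the last detected peak are all assumptions made for illustration.

```python
import math


def find_positive_peaks(x):
    """Return indices of local maxima that have positive amplitude."""
    return [n for n in range(1, len(x) - 1)
            if x[n] > 0 and x[n - 1] < x[n] >= x[n + 1]]


def detect_pitch_marks(x, fs, blanking_s=0.002, decay_s=0.005):
    """Peak picking with a blanking time and a decaying threshold.

    x          : sequence of samples
    fs         : sampling rate in Hz
    blanking_s : blanking time in seconds (hypothetical default)
    decay_s    : time constant of the exponential decay in seconds
                 (hypothetical default)
    """
    blank = int(round(blanking_s * fs))   # blanking time in samples
    tau = decay_s * fs                    # decay time constant in samples
    marks = []
    last_n, last_amp = None, 0.0

    # Peaks are visited in time order and only past detections are used.
    for n in find_positive_peaks(x):
        if last_n is None:
            accept = True                 # assumption: first peak starts things off
        else:
            elapsed = n - last_n
            if elapsed < blank:
                accept = False            # inside the blanking time: never detect
            else:
                # Threshold decays from the amplitude of the last detection.
                threshold = last_amp * math.exp(-(elapsed - blank) / tau)
                accept = x[n] > threshold
        if accept:
            marks.append(n)
            last_n, last_amp = n, x[n]
    return marks
```

With a toy signal at fs = 8000 Hz that has a large peak every 80 samples and a smaller peak in between, the default settings pick only the large peaks, so the spacing between marks gives a period of 80 samples (100 Hz). Shortening `decay_s` to 0.001 lets the threshold fall below the smaller peak before it arrives, so the smaller peak is detected as well. That is the sensitivity to the two parameters described above.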