Usage of TD-PSOLA

This topic has 1 reply, 2 voices, and was last updated 8 years, 4 months ago by Simon King.

- October 16, 2017 at 14:18 #7932
  Kurt A
  Student
  I was wondering how commonly TD-PSOLA is used in speech processing and whether there are better alternatives to it. I ask because TD-PSOLA seems too crude to have any practical use. Wouldn’t it make more sense, for instance, to estimate the spectral envelope of a speech signal and feed it a different F0 to produce a new spectrum?
- October 17, 2017 at 09:05 #7933
  Simon King
  Professor
  Yes, TD-PSOLA is still used for speech modification – for relatively small changes in duration or F0, it gives very high quality (if implemented carefully).
  
  You’re right that TD-PSOLA looks “crude”. I’d prefer to say that it is deceptively simple and really quite elegant, once you understand in depth what it is doing. It actually performs an implicit source-filter separation, but it can only modify the source (i.e., duration and F0); it cannot modify the filter.
  
  Do you understand where the filter is in TD-PSOLA? Why is it not possible to modify it?
  
  Your alternative suggestion is spot on: to obtain the spectral envelope and to “feed it” (we usually say “excite it”) with a new source signal of the desired F0. This is precisely what an explicit source-filter model can do. Linear prediction is a common choice for the filter.
  
  You suggest feeding the filter with a “different F0”. Given that the source-filter model operates in the time domain, what exactly would a “different F0” mean? Can you draw a diagram?
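
The explicit source-filter idea discussed in the reply above can be sketched in a few lines. This is a minimal illustration, not code from the thread: the function names (`impulse_train`, `resonator`, `autocorr`, `levinson`, `all_pole_filter`), the 16 kHz sampling rate, and the single 500 Hz resonance standing in for a vocal tract are all assumptions chosen for the example. It treats the voiced source in the time domain as an impulse train, so a "different F0" simply means a different spacing between impulses, and it uses the autocorrelation method of linear prediction (only approximate for a periodic input) to estimate the filter.

```python
import math

def impulse_train(f0_hz, duration_s, fs):
    """Voiced source: one unit impulse every pitch period (fs / f0 samples)."""
    n = int(duration_s * fs)
    period = fs / f0_hz
    x = [0.0] * n
    t = 0.0
    while int(round(t)) < n:
        x[int(round(t))] = 1.0
        t += period
    return x

def resonator(freq_hz, bandwidth_hz, fs):
    """Coefficients [1, a1, a2] of a two-pole resonance (a toy 'vocal tract')."""
    r = math.exp(-math.pi * bandwidth_hz / fs)
    theta = 2.0 * math.pi * freq_hz / fs
    return [1.0, -2.0 * r * math.cos(theta), r * r]

def autocorr(x, max_lag):
    """Autocorrelation values r[0..max_lag] of the sequence x."""
    return [sum(x[i] * x[i + k] for i in range(len(x) - k))
            for k in range(max_lag + 1)]

def levinson(r, order):
    """Levinson-Durbin recursion: returns [1, a1, ..., ap] for A(z)."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)
    return a

def all_pole_filter(a, x):
    """Synthesis filter 1/A(z): y[n] = x[n] - sum_k a[k] * y[n - k]."""
    y = [0.0] * len(x)
    for n in range(len(x)):
        acc = x[n]
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * y[n - k]
        y[n] = acc
    return y

fs = 16000                                   # assumed sampling rate
true_a = resonator(500.0, 100.0, fs)         # arbitrary illustrative resonance

# "Original speech": the toy filter excited at F0 = 100 Hz.
original = all_pole_filter(true_a, impulse_train(100.0, 0.05, fs))

# Analysis: estimate the filter from the signal with linear prediction.
est_a = levinson(autocorr(original, 2), 2)

# Resynthesis: keep the estimated filter, change only the source (F0 = 200 Hz).
modified = all_pole_filter(est_a, impulse_train(200.0, 0.05, fs))
```

Because the filter is now an explicit set of coefficients (`est_a`), it could also be altered before resynthesis, which is the step TD-PSOLA cannot perform. A real system would work frame by frame on natural speech, use a higher prediction order, and use a less idealised excitation than a pure impulse train.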