Forum Replies Created
-
AuthorPosts
-
At what point do we apply a tapered window in TD-PSOLA: is it when the pitch periods are overlapped and added? Do we apply the tapered window around the pitch marks?
With regard to POS tagging there are several different methods that could be used. Would it suffice to describe one in detail and mention the others in passing?
Struggling with this question a bit
b) What problems arise when using TD-PSOLA …
Here’s what I have so far:
TD-PSOLA requires accurate estimates of epochs (marked by pitch marks) in order to modify a waveform without changing phone identity. Moreover, TD-PSOLA produces artefacts in the waveform if a large change in F0 is required (I don’t understand why this is)
With linear predictive speech synthesis, we can solve for filter coefficients and find the exact filter for our waveform. This allows us to directly manipulate the spectral envelope without making any approximations.
I am sure I am missing a lot because this is a 20 mark question, but I am really stumped
I went over the TD-PSOLA videos again but am still a bit stuck on this question in the paper:
The TD-PSOLA algorithm can be used to modify the duration and fundamental frequency of a speech waveform. Explain how TD-PSOLA can be used to:
(i) increase the duration of a speech waveform by a factor of 2, without changing the fundamental frequency;
(ii) decrease the fundamental frequency without changing the duration;
Is it simply because of the mechanisms of the algorithm? i.e. (i) because we have made a join pitch-synchronously and overlapped (there is more room to fit in pitch periods), we can add more pitch periods without changing the F0
e.g. if we had a waveform of two frames and wanted to make it the duration of 4 – we extract the pitch periods from it (giving us the response to one impulse) and then overlap these (copy synthesis) so that we can add in 2 more pitch periods and increase duration. Frequency doesn’t change because you still have the same number of cycles per second.
(ii) we can decrease fundamental frequency without changing the duration for essentially the same reason – there is an overlap so we can move pitch periods further apart without changing duration
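To check my own reasoning on (i) and (ii), here is a minimal sketch, not the real TD-PSOLA implementation. It assumes a synthetic signal with a constant pitch period, pitch marks that are already known, and a Hann window as the tapered window; the function names (`overlap_add`, `estimate_period`) and all the numbers are my own choices for illustration.

```python
import math

def hann(n):
    # Symmetric Hann (tapered) window of length n.
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def overlap_add(signal, marks, period, new_marks, source):
    """Copy the two-period windowed frame at marks[source[i]] to new_marks[i]."""
    window = hann(2 * period)
    out = [0.0] * (new_marks[-1] + period)
    for pos, src in zip(new_marks, source):
        m = marks[src]
        for i in range(2 * period):
            out[pos - period + i] += signal[m - period + i] * window[i]
    return out

def estimate_period(x, min_lag, max_lag):
    # Lag of the autocorrelation peak: a crude pitch-period estimate.
    def ac(lag):
        return sum(x[i] * x[i + lag] for i in range(len(x) - lag))
    return max(range(min_lag, max_lag), key=ac)

# Toy voiced sound: one decaying 500 Hz resonance per pitch period.
fs, period = 8000, 80                       # F0 = 8000 / 80 = 100 Hz
pulse = [math.exp(-n / 15) * math.sin(2 * math.pi * 500 * n / fs)
         for n in range(period)]
signal = pulse * 20
marks = [period * k for k in range(1, 20)]  # assumed known pitch marks

# (i) Double the duration: same mark spacing, every frame used twice.
marks_i = [period * k for k in range(1, 2 * len(marks) + 1)]
longer = overlap_add(signal, marks, period, marks_i,
                     [i // 2 for i in range(len(marks_i))])

# (ii) Lower F0 to 80 Hz: wider mark spacing, nearest original frame,
# roughly the same total time span.
new_period = 100
marks_ii = list(range(period, marks[-1] + 1, new_period))
nearest = [min(range(len(marks)), key=lambda k: abs(marks[k] - p))
           for p in marks_ii]
lower = overlap_add(signal, marks, period, marks_ii, nearest)

print(len(signal), len(longer), estimate_period(longer, 40, 200))
print(len(signal), len(lower), estimate_period(lower, 40, 200))
```

In this sketch the tapered window is applied when each frame is extracted around its pitch mark, before the overlap-add. Duration changes by repeating frames, and F0 changes only through the spacing of the new pitch marks.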
Am I missing anything here?
Assessment 3
online exam – exact format to be announced during the course
worth 50% of the final grade
date: in the December exam diet (10th to 21st December)
From the assessment info on Learn
So the filter in TD-PSOLA is the impulse response of a diphone?
The commands are working now – I can set the utterance but when I type the command (utt.play myutt) I get the following error:
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
festival> (utt.play myutt)
-=-=-=-=-=- EST Error -=-=-=-=-=-
{FND} Feature Wave not defined
October 10, 2020 at 11:45 in reply to: Clarify the Difference between the Filter and Output Response #12312
So the magnitude spectrum (a.k.a. spectral envelope) is about an overall pattern, ratios and scales. In contrast, the frequency response is about the components (different frequencies) involved. Might not work as an analogy, but I was thinking it’s akin to two buildings. They both might have the same tower block structure, but one may be made of concrete, the other made of brick. To bring it back to the impulse train: ALL impulse trains have the same structure, periodic with frequencies at every harmonic. These frequencies can be different (multiples of 100 Hz vs 200 Hz), but the overall periodic pattern is the same.
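A small sketch of the impulse train point, under my own assumptions: an 8 kHz sampling rate, a 400-sample signal, and a naive DFT written out by hand (the helper names are made up for illustration). It lists the frequencies at which each impulse train has energy.

```python
import cmath

def magnitude_spectrum(x):
    # Naive DFT magnitudes for bins 0 .. N/2 (fine for short signals).
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2 + 1)]

def impulse_train(f0, fs, n):
    # One unit impulse per pitch period, zeros elsewhere.
    period = fs // f0
    return [1.0 if t % period == 0 else 0.0 for t in range(n)]

def harmonic_frequencies(f0, fs=8000, n=400):
    # Frequencies (Hz) of the DFT bins that carry energy.
    mags = magnitude_spectrum(impulse_train(f0, fs, n))
    return [k * fs // n for k, m in enumerate(mags) if m > 1e-6]

print(harmonic_frequencies(100)[:5])
print(harmonic_frequencies(200)[:5])
```

Both trains show the same pattern of equally spaced components starting at 0 Hz; only the spacing differs, matching the F0 of each train.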
I got a new error
[atlab@localhost ~]$ $ sudo rsync -avul --progress --files-from=:/Volumes/Network/courses/sp/manifest.txt s1869308@scp1.ppls.ed.ac.uk:/ /
bash: $: command not found…
[atlab@localhost ~]$ sudo rsync -avul --progress --files-from=:/Volumes/Network/courses/sp/manifest.txt s1869308@scp1.ppls.ed.ac.uk:/ /
[sudo] password for atlab:
Sorry, try again.
[sudo] password for atlab:
ssh: Could not resolve hostname scp1.ppls.ed.ac.uk: Name or service not known
rsync: connection unexpectedly closed (0 bytes received so far) [Receiver]
rsync error: unexplained error (code 255) at io.c(226) [Receiver=3.1.2]
[atlab@localhost ~]$
This is the error I get now, although the VM appears to be connected to the VPN
Hello, I have had a look at the notebooks now and I understand that there are two main types of filter, the finite and the infinite impulse response, and that the latter is more useful for simulating the vocal tract.
What I am unclear about is where we get the coefficients from in the operation that creates the filter. How does this relate to the Fourier transform? Do we do the Fourier transform first and then filter it? In the end I am struggling with how Fourier is or isn’t related to filtering.
e.g. we want to recreate a sound [a], so we apply an IIR filter to the impulse train. What does Fourier then give us on top of this? Why not use the Fourier series, when this series tells us the amount of each frequency in the signal?
I think it’s just an example to demonstrate why the Nyquist frequency is the limit. In practical terms, you wouldn’t try to capture a frequency above the Nyquist frequency, because of aliasing, as you said. But I might be wrong too.
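To illustrate the aliasing point, here is a small sketch. It assumes an 8 kHz sampling rate (so the Nyquist frequency is 4 kHz); the 7 kHz and 1 kHz test tones are arbitrary choices of mine.

```python
import math

def sample_cosine(freq_hz, fs_hz, n_samples):
    # Sample a unit-amplitude cosine at the given sampling rate.
    return [math.cos(2 * math.pi * freq_hz * t / fs_hz)
            for t in range(n_samples)]

fs = 8000                             # assumed rate; Nyquist is fs / 2 = 4000 Hz
above = sample_cosine(7000, fs, 32)   # above the Nyquist frequency
alias = sample_cosine(1000, fs, 32)   # fs - 7000 = 1000 Hz

# Largest sample-by-sample difference between the two sampled tones.
max_difference = max(abs(a - b) for a, b in zip(above, alias))
print(max_difference)
```

The difference is at the level of floating-point rounding: once sampled, the 7 kHz tone cannot be told apart from a 1 kHz tone.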
So it is kind of like how two different sounds with different phases can show up the same on a magnitude spectrum – the sounds are still different structurally, but on a surface level we hear them the same because our hearing cannot make this distinction
Might be a bad analogy, but could you compare it to a recipe? As in, if we don’t get the information about the time each ingredient was added (phase), we won’t reconstruct the original recipe (some waves might cancel each other out if we don’t phase shift)? Also, am I correct in thinking that a phase shift is equivalent to a time shift, because if the phase angle is shifted backwards, we will reach the end of that cycle at a later point?
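A quick sketch to test both ideas, under my own assumptions (8 kHz sampling rate, 80 samples, tones at 200 Hz and 500 Hz, a hand-written DFT; none of this is from the course notebooks). It compares two signals that differ only in the phase of one component, and checks that a phase shift of a quarter cycle at 500 Hz is the same as a delay of 4 samples.

```python
import cmath
import math

fs, n = 8000, 80   # assumed values; DFT bins are 100 Hz apart

def tone(freq_hz, phase):
    return [math.cos(2 * math.pi * freq_hz * t / fs + phase) for t in range(n)]

def magnitudes(x):
    # Naive DFT magnitudes for every bin.
    return [abs(sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t in range(n)))
            for k in range(n)]

# Same two components, but the 500 Hz one has a different phase in b.
a = [p + q for p, q in zip(tone(200, 0.0), tone(500, 0.0))]
b = [p + q for p, q in zip(tone(200, 0.0), tone(500, math.pi / 2))]
mag_a, mag_b = magnitudes(a), magnitudes(b)

# A phase of -pi/2 at 500 Hz is a quarter of a 16-sample cycle: 4 samples late.
original = tone(500, 0.0)
delayed = tone(500, -math.pi / 2)

print(max(abs(p - q) for p, q in zip(mag_a, mag_b)))   # magnitude spectra
print(max(abs(p - q) for p, q in zip(a, b)))           # waveforms
```

The magnitude spectra agree while the waveforms clearly differ, so the phase is what you would need to rebuild the exact waveform from its components.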
Thanks
Having read through the rest of the tutorials, I still don’t understand why the phase shift is a necessary component of the Fourier transform. I know it’s a pretty basic component of it but can’t seem to wrap my head around it. Do we phase shift to make sure that the phasor moves around the unit circle at the right frequency?