Forum Replies Created

Viewing 15 posts - 1 through 15 (of 17 total)

February 20, 2025 at 12:19 in reply to: More space needed #18273
Isobel W
Student
I tried the fixes suggested for this error but I’m encountering another error as shown in the screenshot. I changed permissions and I also closed everything except the terminal but the same thing still comes up.

December 14, 2020 at 13:28 in reply to: Part II Question 1 #13654
Isobel W
Student
At what point do we apply a tapered window with TD-PSOLA? Is it when the pitch periods are overlapped and added? Do we apply a tapered window around the pitch marks?
December 14, 2020 at 06:31 in reply to: Part II Question 2 #13638
Isobel W
Student
With regard to POS tagging there are several different methods that could be used. Would it suffice to describe one in detail and mention the others in passing?
December 12, 2020 at 18:04 in reply to: Part II Question 1 #13577
Isobel W
Student
Struggling with this question a bit

b) What problems arise when using TD-PSOLA …

Here’s what I have so far:

TD-PSOLA requires accurate estimates of epochs (marked by pitch marks) in order to modify a waveform without changing phone identity. Moreover, TD-PSOLA produces artefacts in the waveform if a large change in F0 is required (I don’t understand why this is).

With linear predictive speech synthesis, we can solve for filter coefficients and find the exact filter for our waveform. This allows us to directly manipulate the spectral envelope without making any approximations.

I am sure I am missing a lot because this is a 20 mark question, but I am really stumped
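On "solving for filter coefficients": a minimal sketch of one common way to do this, the autocorrelation method with the Levinson-Durbin recursion. The toy signal, the filter order and the function names are assumptions made for illustration, not taken from the course materials. On real speech the result is an estimate for each analysis frame, so this toy case (where the answer is recovered almost exactly) is the best-case scenario.

```python
def autocorrelation(x, max_lag):
    """r[k] = sum over n of x[n] * x[n + k], for k = 0..max_lag."""
    return [sum(x[n] * x[n + k] for n in range(len(x) - k))
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the normal equations for the predictor coefficients.

    Convention (an assumption of this sketch): the all-pole filter is
    1 / (1 + a[1] z^-1 + ... + a[order] z^-order), with a[0] = 1.
    """
    a = [1.0] + [0.0] * order
    error = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for this model order.
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / error
        # Update the coefficients from the previous-order solution.
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        error *= 1.0 - k * k
    return a


# Toy "speech": the impulse response of a known, stable 2-pole filter.
true_a = [1.0, -1.2, 0.81]
signal = []
for n in range(400):
    y = 1.0 if n == 0 else 0.0
    if n >= 1:
        y -= true_a[1] * signal[n - 1]
    if n >= 2:
        y -= true_a[2] * signal[n - 2]
    signal.append(y)

estimated_a = levinson_durbin(autocorrelation(signal, 2), 2)
print(estimated_a)  # close to true_a
```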
December 12, 2020 at 18:02 in reply to: Part II Question 1 #13574
Isobel W
Student
I went over the TD-PSOLA videos again but am still a bit stuck on this question in the paper:
The TD-PSOLA algorithm can be used to modify the duration and fundamental frequency of a speech waveform. Explain how TD-PSOLA can be used to:
(i) increase the duration of a speech waveform by a factor of 2, without changing the fundamental frequency;
(ii) decrease the fundamental frequency without changing the duration;

Is it simply because of the mechanisms of the algorithm? i.e. (i) because we have made a join pitch synchronously and overlapped (there is more room to fit in pitch periods), we can add more pitch periods in without changing the F0
e.g. if we had a waveform of two frames and wanted to make it the duration of 4 – we extract the pitch periods from it (giving us the response to one impulse) and then overlap these (copy synthesis) so that we can add in 2 more pitch periods and increase duration. Frequency doesn’t change because you still have the same number of cycles per second.
(ii) we can decrease fundamental frequency without changing the duration for essentially the same reason – there is an overlap so we can move pitch periods further apart without changing duration
Am I missing anything here?
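The two cases in the question can be sketched in code. This is a toy version under strong assumptions: the pitch period is constant, pitch marks are given, and each pitch-synchronous unit is two periods long and Hann-windowed. The function name `td_psola` and its parameters are made up for this sketch.

```python
import math


def td_psola(signal, pitch_marks, duration_factor=1.0, f0_factor=1.0):
    """Toy TD-PSOLA for a signal with a constant pitch period."""
    period = pitch_marks[1] - pitch_marks[0]
    # Hann window spanning two pitch periods, centred on a pitch mark.
    window = [0.5 - 0.5 * math.cos(math.pi * i / period)
              for i in range(2 * period + 1)]

    # Analysis: cut one windowed unit around every pitch mark.
    units = []
    for mark in pitch_marks:
        unit = []
        for i in range(-period, period + 1):
            idx = mark + i
            sample = signal[idx] if 0 <= idx < len(signal) else 0.0
            unit.append(sample * window[i + period])
        units.append(unit)

    # Synthesis: place units at new pitch marks and overlap-add them.
    out_len = int(round(len(signal) * duration_factor))
    out_period = int(round(period / f0_factor))
    output = [0.0] * out_len
    out_marks = []
    out_mark = pitch_marks[0]
    while out_mark < out_len:
        # Reuse the unit whose original position is nearest in time.
        source_time = out_mark / duration_factor
        nearest = min(range(len(pitch_marks)),
                      key=lambda k: abs(pitch_marks[k] - source_time))
        for i, value in enumerate(units[nearest]):
            idx = out_mark - period + i
            if 0 <= idx < out_len:
                output[idx] += value
        out_marks.append(out_mark)
        out_mark += out_period
    return output, out_marks


period = 80  # samples, e.g. 100 Hz at an 8 kHz sampling rate
signal = [math.sin(2 * math.pi * n / period)
          + 0.5 * math.sin(4 * math.pi * n / period) for n in range(800)]
pitch_marks = list(range(0, 800, period))

# (i) Double the duration: same mark spacing, each unit used twice.
longer, longer_marks = td_psola(signal, pitch_marks, duration_factor=2.0)
# (ii) Halve F0: same duration, marks twice as far apart, units dropped.
lower, lower_marks = td_psola(signal, pitch_marks, f0_factor=0.5)
# With both factors at 1 this is plain copy synthesis.
same, same_marks = td_psola(signal, pitch_marks)
```

In case (i) the spacing between output pitch marks is unchanged, so F0 stays the same and the extra length comes from repeating units. In case (ii) the output has the same length but fewer, more widely spaced units.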
October 28, 2020 at 11:58 in reply to: Final exam #12767
Isobel W
Student
Assessment 3
online exam – exact format to be announced during the course
worth 50% of the final grade
date: in the December exam diet (10th to 21st December)
From the assessment info on Learn
October 22, 2020 at 14:43 in reply to: Filter Confusion #12659
Isobel W
Student
So the filter in TD-PSOLA is the impulse response of a diphone?
October 15, 2020 at 08:33 in reply to: Festival Commands not Working #12482
Isobel W
Student
The commands are working now – I can set the utterance but when I type the command (utt.play myutt) I get the following error:
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
festival> (utt.play myutt)
-=-=-=-=-=- EST Error -=-=-=-=-=-
{FND} Feature Wave not defined
October 10, 2020 at 11:45 in reply to: Clarify the Difference between the Filter and Output Response #12312
Isobel W
Student
So the magnitude spectrum (a.k.a. spectral envelope) is about an overall pattern, ratios and scales. In contrast, the frequency response is about the components (different frequencies) involved. It might not work as an analogy, but I was thinking it’s akin to two buildings. They both might have the same tower block structure, but one may be made of concrete and the other of brick. To bring it back to the impulse train: all impulse trains have the same structure (periodic, with energy at every harmonic), but these frequencies can be different (multiples of 100 Hz vs 200 Hz), while the overall periodic pattern is the same.
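The point about impulse trains can be checked numerically. A minimal sketch, assuming a 1600 Hz sampling rate and a 160-sample analysis window (both chosen only so that the DFT bins fall exactly on multiples of 10 Hz); the function names are made up for this example.

```python
import cmath


def dft_magnitudes(x):
    """Magnitude of each DFT bin from 0 Hz up to the Nyquist frequency."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2 + 1)]


def harmonic_frequencies(f0, sample_rate=1600, num_samples=160):
    """Frequencies (Hz) at which an impulse train with this F0 has energy."""
    period = sample_rate // f0
    train = [1.0 if t % period == 0 else 0.0 for t in range(num_samples)]
    magnitudes = dft_magnitudes(train)
    return [k * sample_rate // num_samples
            for k, m in enumerate(magnitudes) if m > 1e-6]


harmonics_100 = harmonic_frequencies(100)
harmonics_200 = harmonic_frequencies(200)
print(harmonics_100)  # multiples of 100 Hz (0 Hz is the DC component)
print(harmonics_200)  # multiples of 200 Hz
```

Both spectra have the same shape, equal-height lines at every multiple of F0, and only the spacing between the lines differs.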
October 8, 2020 at 08:50 in reply to: Connecting to SSL VPN through the VM #12286
Isobel W
Student
I got a new error
[atlab@localhost ~]$ $ sudo rsync -avul --progress --files-from=:/Volumes/Network/courses/sp/manifest.txt s1869308@scp1.ppls.ed.ac.uk:/ /
bash: $: command not found…
[atlab@localhost ~]$ sudo rsync -avul --progress --files-from=:/Volumes/Network/courses/sp/manifest.txt s1869308@scp1.ppls.ed.ac.uk:/ /
[sudo] password for atlab:
Sorry, try again.
[sudo] password for atlab:
ssh: Could not resolve hostname scp1.ppls.ed.ac.uk: Name or service not known
rsync: connection unexpectedly closed (0 bytes received so far) [Receiver]
rsync error: unexplained error (code 255) at io.c(226) [Receiver=3.1.2]
[atlab@localhost ~]$
October 8, 2020 at 07:34 in reply to: Connecting to SSL VPN through the VM #12282
Isobel W
Student
This is the error I get now, although the VM appears to be connected to the VPN.

October 2, 2020 at 18:57 in reply to: Source Filter Model Query #12172
Isobel W
Student
Hello, I have had a look at the notebooks now and I understand that there are two main types of filter, the finite and the infinite impulse response, and that the latter is more useful for simulating the vocal tract.
What I am unclear about is where we get the coefficients from in the operation that creates the filter. How does this relate to the Fourier transform? Do we do the Fourier transform first and then filter it? In the end I am struggling with how Fourier is or isn’t related to filtering.
E.g. we want to recreate the sound [a], so we apply an IIR filter to the impulse train. What does Fourier then give us on top of this? Why not use the Fourier series, when this series tells us the amount of each frequency in the signal?
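One way to see how the two ideas relate is to filter an impulse train in the time domain and then use the DFT only to analyse the result. This is a minimal sketch under assumptions: the 2-pole coefficients `a` are arbitrary illustrative values (not a model of [a]), the sampling rate is 1600 Hz, and all names are made up for this example.

```python
import cmath


def iir_filter(x, a):
    """All-pole filter: y[n] = x[n] - a[1]*y[n-1] - a[2]*y[n-2] - ..."""
    y = []
    for n in range(len(x)):
        acc = x[n]
        for j in range(1, len(a)):
            if n - j >= 0:
                acc -= a[j] * y[n - j]
        y.append(acc)
    return y


def dft_bin(x, k):
    """A single DFT coefficient of x at bin k."""
    n = len(x)
    return sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n)
               for t in range(n))


def frequency_response(a, omega):
    """Response of the all-pole filter at omega radians per sample."""
    return 1.0 / sum(a[j] * cmath.exp(-1j * omega * j)
                     for j in range(len(a)))


a = [1.0, -1.2, 0.81]   # illustrative coefficients, poles at radius 0.9
period = 16             # 100 Hz source at a 1600 Hz sampling rate
source = [1.0 if n % period == 0 else 0.0 for n in range(1600)]
output = iir_filter(source, a)

# Analyse the last 160 samples, after the start-up transient has died away.
source_segment = source[-160:]
output_segment = output[-160:]
harmonic_bins = [10, 20, 30]  # 100, 200 and 300 Hz

measured = [abs(dft_bin(output_segment, k)) for k in harmonic_bins]
predicted = [abs(dft_bin(source_segment, k))
             * abs(frequency_response(a, 2 * cmath.pi * k / 160))
             for k in harmonic_bins]
print(measured)
print(predicted)
```

The filtering itself never uses the Fourier transform. The DFT is what shows that each source harmonic has been scaled by the filter's frequency response at that frequency.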
October 1, 2020 at 12:36 in reply to: Sampling and Nyquist Frequency #12160
Isobel W
Student
I think it’s just an example to demonstrate why the Nyquist frequency is the limit. In practical terms, you wouldn’t choose a sampling frequency above the Nyquist frequency because of aliasing, as you said. But I might be wrong too.
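A small numerical illustration of aliasing, assuming an 8 kHz sampling rate (so a Nyquist frequency of 4 kHz) and two cosines chosen for this example: once sampled, a 5 kHz component is indistinguishable from a 3 kHz one.

```python
import math


def sample_cosine(frequency_hz, sample_rate_hz, num_samples):
    """Sample cos(2*pi*f*t) at times t = n / sample_rate."""
    return [math.cos(2 * math.pi * frequency_hz * n / sample_rate_hz)
            for n in range(num_samples)]


sample_rate = 8000  # Hz, so the Nyquist frequency is 4000 Hz
above_nyquist = sample_cosine(5000, sample_rate, 32)
alias = sample_cosine(3000, sample_rate, 32)

# The two sets of samples agree to within floating-point rounding.
max_difference = max(abs(p - q) for p, q in zip(above_nyquist, alias))
print(max_difference)
```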
September 29, 2020 at 11:01 in reply to: Phase Shift Query #12057
Isobel W
Student
So it is kind of like how two different sounds with different phases can show up the same on a magnitude spectrum: the sounds are still different structurally, but on a surface level we hear them the same because our hearing cannot make this distinction.
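The magnitude spectrum part of this can be checked with a short sketch. The two-component test signal and the function names are assumptions made for illustration: changing the phase of one component changes the waveform but leaves the magnitude spectrum untouched.

```python
import cmath
import math


def dft_magnitudes(x):
    """Magnitude of every DFT bin of x."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t in range(n)))
            for k in range(n)]


def two_tone(phase_of_second, n=64):
    """Two sinusoids (4 and 8 cycles per window); only the second is shifted."""
    return [math.sin(2 * math.pi * 4 * t / n)
            + 0.5 * math.sin(2 * math.pi * 8 * t / n + phase_of_second)
            for t in range(n)]


wave_a = two_tone(0.0)
wave_b = two_tone(math.pi / 2)

waveform_difference = max(abs(p - q) for p, q in zip(wave_a, wave_b))
magnitude_difference = max(
    abs(p - q)
    for p, q in zip(dft_magnitudes(wave_a), dft_magnitudes(wave_b)))
print(waveform_difference)   # clearly non-zero: the waveforms differ
print(magnitude_difference)  # effectively zero: the magnitudes match
```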
September 29, 2020 at 08:52 in reply to: Phase Shift Query #12054
Isobel W
Student
Might be a bad analogy, but could you compare it to a recipe? As in, if we don’t get the information about the time each ingredient was added (phase), we won’t reconstruct the original recipe (some waves might cancel each other out if we don’t phase shift)? Also, am I correct in thinking that a phase shift is equivalent to a time shift, because if the phase angle is shifted backwards, we will reach the end of that cycle at a later point?
Thanks
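For a single sinusoid, the phase shift and time shift idea can be tested directly. A minimal sketch, assuming an 8 kHz sampling rate and a 100 Hz sine (values picked so the delay is a whole number of samples); `phase_to_delay_samples` is a name made up for this example.

```python
import math


def phase_to_delay_samples(phase_lag, frequency_hz, sample_rate_hz):
    """Delay, in samples, equivalent to a phase lag (radians) at one frequency."""
    return phase_lag / (2 * math.pi) * sample_rate_hz / frequency_hz


sample_rate = 8000
frequency = 100
phase_lag = math.pi / 2  # a quarter of a cycle

delay = int(round(phase_to_delay_samples(phase_lag, frequency, sample_rate)))

original = [math.sin(2 * math.pi * frequency * n / sample_rate)
            for n in range(400)]
phase_shifted = [math.sin(2 * math.pi * frequency * n / sample_rate - phase_lag)
                 for n in range(400)]

# Compare the phase-shifted sine with the original delayed by `delay` samples.
max_difference = max(abs(phase_shifted[n] - original[n - delay])
                     for n in range(delay, 400))
print(delay, max_difference)

# The same phase lag at 200 Hz corresponds to half the time delay.
delay_at_200 = phase_to_delay_samples(phase_lag, 200, sample_rate)
```

The equivalence holds per frequency: the same phase angle means a different time delay for each component, which is why a signal with several components is not simply delayed when all its phases change by the same amount.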