Forum Replies Created
-
AuthorPosts
-
Hi Xueyan,
Many apologies for this. It appears the sound wasn’t recorded in the first half. I’m not sure why. I’m fairly certain I turned the microphone on but may have done something while setting up.
If you have any questions about the lecture, feel free to ask here or in office hours!
Apologies again,
Catherine
Hi Ha Anh,
You should be able to see the answers now!
cheers,
Catherine
Hi Yujia,
Grapheme-to-phoneme is more specific, in that “sounds” aren’t always the same as phonemes. But in general you can consider them the same thing: in practice both refer to methods for mapping from written text to pronunciations. You’ll see grapheme-to-phoneme (or g2p) much more often in the recent literature.
cheers,
Catherine
Hi Claire,
1. {exam number}_{lab report word count} should be the name of the file you submit. You should use another title for the title of the actual report. For assignment 1, something like “Speech Processing Assignment 1” is fine. Though, do note for assignment 2, we’ll be expecting a more informative title.
2. No, you don’t need a table of contents for assignment 1. But do use the section headings noted in the instructions.
cheers,
Catherine
Hi Iakovi,
At this point it looks like the PPLS AT lab won’t be accessible on the weekends. We’re looking into whether we can get access for you guys but can’t guarantee anything, sorry!
best wishes,
Catherine
Sorry! I think I still had the settings wrong! Hopefully it should work now?
cheers,
Catherine
Hi Ha Anh,
[just copying this from another thread since I think it’s the same problem!]
I think the scp (i.e., file transfer) server is down. I’ll put in a ticket about this, but in the meantime you can rsync from the remote desktop machines instead (they all get you to the same file system!).
So, you’d just need to replace scp1.ppls.ed.ac.uk with something like ppls-atl-1066.ppls.ed.ac.uk in your rsync call.
The list of remote desktop machines is here (you’ll need UoE VPN on to access):
https://resource.ppls.ed.ac.uk/whoson/atlab.php
cheers,
Catherine
Hi Ha Anh,
Yes, sorry I thought you could already see the results but forgot to select the option! You should be able to see your results per question now.
cheers,
Catherine
Hi Yujia,
Great question! In general, sound waves don’t have to be symmetric like sinusoids (e.g. sine waves). But what Fourier analysis tells us is that we can approximate periodic waves that don’t look at all like sinusoids by adding together scaled and shifted sinusoids of different frequencies.
So, yes, if we want to make a spiky and non-symmetric waveform like an impulse train, we actually need to use sinusoids of every frequency we have available to us (scaled and shifted to get rid of the negative amplitude bits and also all the curvy bits!).
An important thing to note here is that the impulse train of zeros and ones is really an idealised model of the source. Though we can use it to synthesize speech sounds, the actual human vocal source is much more complicated. Actual glottal pulses don’t really look like a single spike, and they do produce negative pressure, but they still have a positive bias as they are not symmetric.
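To make the "add up sinusoids of every frequency" idea concrete, here is a minimal sketch in plain Python. It assumes one period of N=16 samples (an illustrative value) and uses cosines only, so all the "shifting" is done by the constant term; the function name is made up for this example.

```python
import math

N = 16  # samples in one period (illustrative value; assumed even)

def impulse_from_cosines(n, N=N):
    """Value at sample n of an equally weighted sum of cosines
    at every available analysis frequency, plus a constant term."""
    total = 1.0  # k = 0: the constant (bias) term that lifts the waveform
    for k in range(1, N // 2):
        # k = 1 .. N/2 - 1: each frequency counted twice (it has a mirror image)
        total += 2.0 * math.cos(2 * math.pi * k * n / N)
    total += math.cos(math.pi * n)  # k = N/2: the highest available frequency
    return total / N

# One period of the result: 1 at n = 0 and (up to rounding) 0 everywhere else.
one_period = [impulse_from_cosines(n) for n in range(N)]
print([round(v, 6) for v in one_period])
```

If you leave out some of the higher frequencies in the loop, the result gets wider and wobblier, which is why the ideal spike needs all of them.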
cheers,
Catherine
Hi Ha Anh,
Yes, there’ll be practice quizzes for the other online tests. I’m aiming to have the one for test 2 out tomorrow! The one for the ASR part of the course should follow not too long after that.
cheers,
Catherine
Hi Alex,
As far as I can tell, you can still submit after the deadline as long as you have started before. But to be honest, when it comes to things on Learn it’s better to be safe than sorry! I would aim to submit before 12 noon, if possible.
cheers,
Catherine
October 17, 2023 at 12:11 in reply to: Multiple choice question on quiz only allows one to be selected #16964
Also, I’ve redacted the quiz questions posted above as this is a timed test. Please don’t post test questions on the forum!
October 17, 2023 at 12:04 in reply to: Multiple choice question on quiz only allows one to be selected #16963
Hi George (and everyone),
Sorry about the confusion here. It’s unexpected behaviour from Learn. However, you can take that behaviour as an (unplanned) hint from Learn. Only selecting one option for those questions won’t hurt your grade!
cheers,
Catherine
Hi Xueyan,
You’re right, the lowest analysis frequency is DFT[1] (i.e., k=1), but the actual DFT outputs start at index k=0. We skipped over this a bit in class, but essentially DFT[0] represents the vertical “bias” of the waveform. This tells you whether the waveform in the input window is symmetric around 0 (in amplitude) or whether it’s shifted up (or down) on average.
This bias term doesn’t actually tell us what frequencies are present in the input, so we usually don’t think too much about it. But you can see a difference in DFT[0] if you compare a sine wave that matches one of the analysis frequencies as input (which has amplitude ranging from -1 to +1) and an input window containing one impulse and the rest zeros (so has amplitude either 0 or +1). In this case, DFT[0] for the sine wave will be zero, while DFT[0] for the impulse will be 1.
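If you want to check this yourself, here is a small sketch in plain Python. It assumes the plain, unnormalised DFT sum (no 1/N scaling) and a window of N=16 samples; the helper name `dft` is just for this example.

```python
import cmath
import math

def dft(x):
    """Unnormalised DFT: X[k] = sum over n of x[n] * exp(-j*2*pi*k*n/N)."""
    N = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
        for k in range(N)
    ]

N = 16
# A sine wave with exactly one cycle in the window (matches analysis frequency k=1).
sine = [math.sin(2 * math.pi * n / N) for n in range(N)]
# One impulse followed by zeros.
impulse = [1.0] + [0.0] * (N - 1)

dc_sine = dft(sine)[0]        # bias of the sine window
dc_impulse = dft(impulse)[0]  # bias of the impulse window

print(abs(dc_sine))     # zero, up to floating-point rounding
print(abs(dc_impulse))  # 1.0
```

With this convention DFT[0] is simply the sum of the samples in the window: the positive and negative halves of the sine cancel, while the impulse window sums to 1.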
In terms of the mirroring, DFT[0] would mirror DFT[16] when the number of samples in the input window is N=16. But the DFT formulation doesn’t include this index (k=0,…,N-1). And, like DFT[0], DFT[N] wouldn’t tell us anything about the frequency components in the input.
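Here is a rough way to see the mirroring numerically, again in plain Python. The input samples are arbitrary made-up values, and `dft_bin` is a hypothetical helper that evaluates the (unnormalised) DFT sum at any integer k, including k=N, which the standard formulation leaves out.

```python
import cmath

def dft_bin(x, k):
    """Evaluate the DFT sum at integer k (allowed to fall outside 0..N-1)."""
    N = len(x)
    return sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))

# Arbitrary real-valued window of N = 16 samples (illustrative values only).
x = [0.5, 1.0, -0.3, 0.8, 0.0, -1.0, 0.2, 0.4,
     0.9, -0.7, 0.1, 0.6, -0.2, 0.3, -0.5, 0.7]
N = len(x)

# For real input, the magnitude at k mirrors the magnitude at N - k.
mirror_ok = all(
    abs(abs(dft_bin(x, k)) - abs(dft_bin(x, N - k))) < 1e-9
    for k in range(1, N)
)

# Evaluating the sum at k = N just gives DFT[0] again.
wraps_to_dc = abs(dft_bin(x, N) - dft_bin(x, 0)) < 1e-9

print(mirror_ok, wraps_to_dc)
```

So the mirror partner of DFT[0] is just a copy of the same bias value, which is why nothing is lost by stopping at k=N-1.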
cheers,
Catherine
Hi Ha Anh,
This spectrogram was actually created by Rebekka, but from the general spectral characteristics, I’d say:
(1) is a fricative,
(4) is a vowel-approximant-vowel sequence,
(10) is a vowel with glottalisation at the end,
(11) is probably an affricate (i.e. a plosive + fricative).
But I’ll have to check with Rebekka what they actually are!
We didn’t do much spectrogram reading in this course, so we’d only ask you to recognise broad categories of speech sounds (with more easily recognisable acoustic features!).
cheers,
Catherine