Forum Replies Created
Festival alone cannot automatically label files. All it can do is process text through its front end to obtain a linguistic specification, which includes the sequence of phones.
The alignment is generally done using HMMs, often with the HTK toolkit.
For a language not supported by Festival, you need a TTS front end for that language, or some other way to convert text into a string of phones (e.g., dictionary lookup). After that, the alignment step is the same as for English.
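To make the first step concrete, here is a minimal sketch in Python (not Festival's actual API) of converting text to a phone string by dictionary lookup; the toy dictionary is invented purely for illustration:

```python
# A minimal sketch: text-to-phones by dictionary lookup, as you might do
# for a language that Festival does not support. The dictionary below is
# a made-up toy example, not a real pronunciation lexicon.

TOY_DICTIONARY = {
    "the": ["dh", "ax"],
    "cat": ["k", "ae", "t"],
    "sat": ["s", "ae", "t"],
}

def text_to_phones(text):
    """Look up each word and concatenate the phone sequences."""
    phones = []
    for word in text.lower().split():
        if word not in TOY_DICTIONARY:
            raise KeyError(f"no pronunciation for '{word}'")
        phones.extend(TOY_DICTIONARY[word])
    return phones

print(text_to_phones("the cat sat"))
# ['dh', 'ax', 'k', 'ae', 't', 's', 'ae', 't']
```

That phone sequence is what you would then hand to the HMM-based aligner (e.g., one built with HTK) to time-align against the waveform.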
Results of the poll: of students who expressed a preference, 73% prefer the current room with its arrangement around group tables.
You could imagine doing speech recognition by measuring the cosine similarity between feature vectors. But this is not the usual way.
We typically use a generative model (the Gaussian, or Normal, probability density function) of feature vectors, within a generative model of sequences (a Hidden Markov Model).
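To make the contrast concrete, here is a minimal sketch (with made-up numbers) of both ideas: cosine similarity between a frame's feature vector and a stored template, versus the log-likelihood of that vector under a diagonal-covariance Gaussian of the kind used inside an HMM state:

```python
# A minimal sketch contrasting template matching with a generative model.
# The feature vectors and Gaussian parameters are invented for illustration.
import numpy as np

x = np.array([1.0, 2.0, 0.5])         # feature vector from one frame
template = np.array([0.9, 2.1, 0.4])  # a stored "template" vector

# Template-matching view: cosine similarity (not the usual approach).
cosine = x @ template / (np.linalg.norm(x) * np.linalg.norm(template))

# Generative view: log-likelihood of x under a Gaussian with mean mu and
# diagonal covariance var (one such Gaussian per HMM state).
mu = np.array([1.0, 2.0, 0.5])
var = np.array([0.2, 0.3, 0.1])
log_likelihood = -0.5 * np.sum(
    np.log(2 * np.pi * var) + (x - mu) ** 2 / var
)

print(f"cosine similarity: {cosine:.3f}")
print(f"Gaussian log-likelihood: {log_likelihood:.3f}")
```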
Great question! Fourier analysis decomposes any signal into a sum of simple signals (called basis functions): sine waves, each with a frequency, magnitude and phase.
Since sine waves are periodic, Fourier analysis can surely only be applied to periodic signals, can’t it? Correct. At least, only to signals that we assume are periodic.
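If you want to see this in practice, here is a minimal sketch using NumPy's FFT: a synthetic periodic signal built from two sine waves, analysed to recover the frequency, magnitude and phase of each component (the sampling rate and frequencies are chosen purely for illustration):

```python
# A minimal sketch: Fourier analysis of a synthetic periodic signal,
# recovering frequency, magnitude and phase of its sine-wave components.
import numpy as np

fs = 8000                     # sampling rate in Hz (an assumption)
n = 800                       # analysis length: 0.1 s
t = np.arange(n) / fs

# Sum of two sine waves (100 Hz and 300 Hz) with different magnitudes and phases.
signal = 1.0 * np.sin(2 * np.pi * 100 * t + 0.3) \
       + 0.5 * np.sin(2 * np.pi * 300 * t + 1.2)

spectrum = np.fft.rfft(signal)
freqs = np.fft.rfftfreq(n, d=1 / fs)

# Report the two strongest components: frequency, magnitude, phase.
for k in np.argsort(np.abs(spectrum))[-2:]:
    print(f"{freqs[k]:6.1f} Hz  magnitude {np.abs(spectrum[k]):6.1f}  "
          f"phase {np.angle(spectrum[k]):+.2f} rad")
```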
Short-term analysis
For a signal such as speech, where the spectral envelope changes over time, we must always use short-term analysis techniques. That means taking a frame of the signal (typically 25ms) and making some assumptions about the signal within that frame.
We will assume that the spectrum doesn’t change at all within the frame: the signal is “stationary”.
Assumption that the signal is periodic
To apply Fourier analysis, we make another assumption: the signal is periodic. In the case of short-term analysis, the Fourier analysis effectively assumes that the frame of signal is repeated over and over before and after the frame.
Even for aperiodic sounds like fricatives, we effectively turn them into signals that repeat with a period of one frame. Since the frequency resolution of the Fourier transform is limited by the duration of the frame, we don’t actually see this “assumed periodicity” in the resulting spectrum: it’s at a frequency lower than we can resolve.
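Here is a minimal sketch of that frequency-resolution point: for a 25 ms frame, the spacing between frequency bins of the DFT is 1 / 0.025 s = 40 Hz (the sampling rate, and the use of noise as a stand-in for a fricative-like frame, are just assumptions for illustration):

```python
# A minimal sketch of short-term analysis: one 25 ms frame, and the
# frequency resolution of its DFT (= 1 / frame duration).
import numpy as np

fs = 16000                      # sampling rate in Hz (an assumption)
frame_length = int(0.025 * fs)  # 25 ms -> 400 samples

# A stand-in for one frame of speech: just noise, as in a fricative.
rng = np.random.default_rng(0)
frame = rng.standard_normal(frame_length)

spectrum = np.abs(np.fft.rfft(frame))
freqs = np.fft.rfftfreq(frame_length, d=1 / fs)

# The spacing between frequency bins is the resolution: 1 / frame duration.
print(f"frequency resolution: {freqs[1]:.1f} Hz")   # 40.0 Hz
print(f"number of frequency bins: {len(freqs)}")
```

A signal that repeats once per frame would have its "fundamental" at that lowest non-zero bin, which is why the assumed periodicity does not show up as anything visible in the spectrum.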
Videos would be easier to follow if we could see your face
In previous years, students said that seeing me (using “picture-in-picture”) was distracting and they preferred just the slides plus my live annotations.
There are challenges in making videos in which you see the slides and annotations, plus my face. This is because I cannot look into a camera at the same time as looking at the screen that I am annotating.
It would only be possible to include a high-quality video of my face with custom-made videos (i.e., ones I make at home, not in a live lecture). I will experiment with this, but it won’t be done this semester.
I may poll you later regarding different formats of video with/without my face and so on.
Provide explicit links from lab exercises to the corresponding theory
For now, I am not going to do this, for the same reasons as above: I want you to develop as active learners. The act of finding those connections for yourself is part of the learning process. Giving you all the answers “on a plate” would encourage passive learning, and so would be less effective.
Of course, if you are completely lost and cannot see where to start, then I have failed. If that’s the case, you must tell me.
As above, keep giving me feedback on this: am I finding a good balance between giving you comprehensive material versus encouraging active learning?
Not enough time is spent on the answers to TopHat questions
On the other hand, some students thought we should not spend too much of the 50-minute lecture on TopHat.
I hope that releasing the questions to you for review after the lecture will mitigate the limited time we can spend on the answers within the lecture.
If 70% of students get a question correct, but 30% get it wrong, it’s hard to decide whether to spend a long time on the solution within the lecture. However, of course I want to help those 30% of students understand why they got it wrong.
Keep giving me feedback on this, and we can try variations in the lectures to see what works best overall.
Provide direct links within the calendar
The calendar is implemented as a Google calendar, pulled into the website as a feed. That restricts the entries to having only plain-text titles and descriptions. However, an advantage is that you can all subscribe to this feed.
Changing the calendar method is a relatively major change, and so I won’t be doing it in the middle of the course.
From the results of a poll, it looks like only a minority of people subscribed to the calendar feed. So, in future, I could implement the course calendar differently, which would allow me to include clickable links into the course content.
Automatic notification of changes on the website
Currently, you can receive notifications of forum posts and replies (not only to your own posts), simply by subscribing to each top-level forum that you’re interested in (e.g., http://www.speech.zone/forums/forum/readings ).
Notifications of other changes are not currently available, I’m afraid.
Provide a hub page for the course
The intention is that the left menu (when viewed on a computer, or in landscape mode on an iPad) shows you the complete course structure, with colour used to highlight where you currently are.
I may use a poll later to ask what other functionality would be helpful. For example, a “mark as completed” facility for logged-in users, or “show me what to do next”. These would probably need custom coding in WordPress, which I can do but would take time.
Some videos don’t capture everything on the screen
Yes, unfortunately some of the lectures were captured using an older university system which didn’t always correctly switch between inputs. Also, I used to teach with a dual screen setup – this was nice in a live situation, but not helpful for lecture recording.
If you use the forum to flag video clips which are hard to follow for this reason, then I will prioritise those for improvement (within this semester).
Place the readings above the videos on each page
I’ve now started interleaving the readings and videos in a suggested order, rather than always having a video first, and readings afterwards.
But … see the previous post about being an active learner: you can find your own route through the material.
Videos sometimes seem a little disjointed
Yes, that’s to be expected if you watch the videos passively; you will not get the most out of this course that way.
The course comprises videos, readings, blog posts, forums and lab exercises, plus of course the lectures. Part of your job as active learners is to integrate the information across these multiple modes of delivery.
I am hoping that asking you to find some of the connections for yourself encourages this active behaviour, and ultimately you’ll arrive at a deeper understanding of the material.
The flipped approach makes learning more effective, and hopefully more engaging and fun, but it does rely on you putting in this little extra effort.
Ongoing feedback about this aspect of the course is particularly welcome.
I agree that some sequences of videos, which are currently created from edited recorded lectures, do have gaps and discontinuities. I may poll you later in the semester about whether custom-made videos (as in some blog posts) would be better.
Reading lists for each lecture
(in addition to the readings specified per topic / per video)
Done! Find them within each course.
Video speed control
Already answered on the forums.