Getting back on track with the assignment

Information about completing the assignment under the current circumstances, including revised expectations about what you will be able to achieve.

Revised assessment arrangements

In the light of an announcement from the School about changes to assessment, here is a definition of the assessment for Speech Synthesis this year, for both levels 10 and 11:

The coursework is the single item of assessment. There will be no exam for level 10.
~~The coursework will receive a pass/fail mark but no grade.~~
2020-03-29 UPDATE: After initially announcing the change to pass/fail, the School was required by the University to revise its decision. The coursework will be graded on a simplified marking scale and feedback will be provided:
- 2020-04-02 UPDATE: Level 10 – 35/45/55/65/75/85
- 2020-04-02 UPDATE: Level 11 – 35/45/55/65/75/85
The due date has been revised to Thursday 23rd April at 12 noon.
There is no change to the word limit or other requirements for the report.

The coursework is designed to support your learning, so we will be delighted to support those of you wishing to complete the coursework as originally designed, to a high standard, with a comprehensive and high-quality report. In return, we will provide comprehensive feedback on your work. Nevertheless, students facing external challenges may wish to scale back and submit a report that is adequate for a pass mark, by meeting the minimum expectations below.

2020-03-29 UPDATE: Please refer to guidance from the University and the School regarding the “no detriment” arrangements and the following course-specific advice for students needing to scale back and submit adequate work:

2020-03-29 UPDATE: Level 10 minimum expectations:

A mark of 45 can be obtained by building one voice using your own recordings of ARCTIC A and performing a limited investigation of a small number of the design choices, with no listening test.
A mark of 55 can be obtained by doing the above, plus one further voice using either your additional recordings, or the ‘slt’ ARCTIC A+B recordings and performing more investigation of the various design choices, with a very simple listening test (possibly with only yourself as listener).
Students unable to complete either of the above should seek individual advice from the lecturer by email. You will be able to obtain a passing mark with a written report that focuses more on theory than practice.

2020-03-29 UPDATE: Level 11 minimum expectations:

A mark of 55 can be obtained by building one voice using your own recordings of ARCTIC A, plus one further voice using either your additional recordings, or the ‘slt’ ARCTIC A+B recordings, and performing an investigation of a few design choices, with a very simple listening test (possibly with only yourself as listener).
Students unable to complete the above should seek individual advice from the lecturer by email. You will be able to obtain a passing mark with a written report that focuses more on theory than practice.

Only you can decide what you wish to do, to achieve your personal goals.

What are your personal goals?

Some of you may wish to complete the course with the minimum amount of further practical work, perhaps because you have more important concerns at the moment. But others of you may wish to maximise your learning by attempting to complete the coursework as originally planned, possibly even going as far as making recordings away from Edinburgh.

Either of those positions, or anything in-between, is acceptable. There is a new forum to ask for help and clarification about what is achievable, or you can email the lecturer.

How good are your technical skills?

To do any further practical work, you need to get set up for remote working. This should be within the capabilities of most students, but if you cannot work remotely (e.g., you are on a very slow connection, have limited or no access to a suitable personal computer, do not have anywhere to work, or cannot get the software working) then you should seek individual advice from the lecturer by email.

How far have you got?

You will all be at different stages in the assignment, some ahead and some behind the milestones. I recommend the following options for completing the assignment, depending on which milestone you have reached. If you don’t fall into one of these categories, or have any doubts, please ask for help on the forum or by email to the lecturer.

Milestone C

You are behind schedule and should complete up to Milestone E, then follow the advice below, under “Milestone E”, but omit the following from the practical work:

Omit the sub-task of Milestone E “Implemented your automatic script design algorithm, or manually created your additional script”
Omit any other Milestone sub-tasks involving your own script and its recording
Use footnotes to explain where you were unable to complete the practical work

and take the following approach to the written report:

Address all sections of the structured marking scheme
In cases where you have not completed the relevant practical work, provide a discussion of the issues instead. For example, for “Critical thinking – Data” you could discuss the importance of coverage and how it is typically achieved.

Milestone D

This is the milestone just before Flexible Learning Week and the strike. You are therefore on track, but should now complete up to Milestone E, then follow the advice below, under “Milestone E”.

Milestone E

You are on track but face the big decision of whether to attempt more recordings. Choose between “Default advice” and “Advice for those determined to record their own domain-specific script”.

Default advice

Do not make more recordings. Instead, reconfigure your hypotheses and experiments to work within the constraints of the data you have available, comprising your own ARCTIC A recordings and the ARCTIC A+B recordings of speaker ‘slt’.

If the voice built from your own ARCTIC A recordings sounds reasonable and is intelligible, then use it as far as possible, including in part or all of your listening test. However, if your voice is of very poor quality (after checking that nothing went badly wrong in voice building), then use voices built from the ‘slt’ ARCTIC A+B recordings instead, especially in all parts of your listening test.

I recommend the following practical work:

Use your own ARCTIC A recordings to explore the effect of design choices that don’t relate to quantity of data (e.g., pitch tracking settings), but fall back to using the ‘slt’ recordings if your own voice is very poor.
Use the ‘slt’ ARCTIC A+B recordings to explore the effect of design choices that do relate to quantity of data (e.g., forced alignment).
Complete the remaining milestones, omitting anything involving your domain script recordings
Use voices built from either your own ARCTIC A or the ‘slt’ ARCTIC A+B recordings in the listening test. But do not attempt to compare across the two data sets: only make comparisons between voices built from the same speaker’s recordings.
Run the listening test online (e.g., implemented in Qualtrics) and be flexible about listener demographics (e.g., do not insist on only using native speakers)
Omit milestone “I – optional follow-on listening test”

and the following approach to the written report:

Use your own ARCTIC A recordings to illustrate the voice building steps (the “Understanding” section of the structured marking scheme)
Report your additional script design and use a footnote to explain that you were unable to make a recording of it (the “Critical thinking – Data” section of the structured marking scheme).
Report listening test results for voices built from either your own ARCTIC A or the ‘slt’ ARCTIC A+B recordings. You will not be able to test any hypotheses related to domain, but there are plenty of other things to explore in a formal listening test, including some of the many other design choices.
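
For the restriction that comparisons should only be made between voices built from the same speaker's recordings, the sketch below shows one possible way to enumerate the allowed pairs of voices for a listening test. It is only an illustration: the voice names and the `same_speaker_pairs` helper are hypothetical, and are not part of the course materials or of any voice building tool.

```python
from itertools import combinations

# Hypothetical inventory: (voice name, speaker whose recordings it was built from).
# Replace these with the voices you have actually built.
voices = [
    ("own_arctic_a_default", "own"),
    ("own_arctic_a_alt_pitch", "own"),
    ("slt_arctic_a", "slt"),
    ("slt_arctic_ab", "slt"),
]


def same_speaker_pairs(voices):
    """Return (speaker, voice_a, voice_b) for every pair of voices
    that were built from the same speaker's recordings."""
    by_speaker = {}
    for name, speaker in voices:
        by_speaker.setdefault(speaker, []).append(name)

    pairs = []
    for speaker, names in by_speaker.items():
        # combinations() never mixes speakers, because it only sees
        # the voices belonging to one speaker at a time.
        pairs.extend((speaker, a, b) for a, b in combinations(names, 2))
    return pairs


pairs = same_speaker_pairs(voices)
for speaker, a, b in pairs:
    print(f"{speaker}: {a} vs {b}")
```

Grouping by speaker before pairing means a cross-speaker comparison (for example, your own voice against an ‘slt’ voice) can never end up in the test by accident.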

Advice for those determined to record their own domain-specific script

This is entirely optional and only recommended for students who would be disappointed not to complete this aspect of the assignment. You should only proceed down this route if the voice built from your existing studio recordings is intelligible and sounds reasonably good.

It only makes sense to record your own domain script if you are also willing to re-record ARCTIC A, because the two sets of recordings need to be closely matched, for two reasons: 1) comparing a voice built only on ARCTIC A with a voice built only on your domain script will be unfair if the underlying recording quality is different; 2) you can only combine the two sets of recordings in a single voice if they come from the same studio.

The SpeechRecorder tool used for making recordings is free, but is only available for Apple OS X. If you are on another operating system, making the recordings and dividing them into separate files would involve manual work and therefore cannot be recommended – sorry.

You will need a microphone to record with. A surprisingly good option is the microphone built in to wired Apple iPhone earbuds (the type with a 3.5mm 4-ring jack connector), which will plug in to a Mac’s headphone socket. Another option is a USB headset like you might use for Skype. The built-in microphone on a laptop is not ideal, because it’s not close-talking. If you’re in the market for a microphone for podcasting or videoconferencing, then the Blue Snowball iCE is excellent at that price point. Whatever you use, it needs to be used close to the mouth but not directly in front, just like in the recording studio.

Use a quiet place with lots of soft furnishings. Make a number of test recordings to perfect your setup and listen back carefully on headphones to check quality and recording level. You must minimise not only background noise but also reverberation.

You can now follow the advice under “Milestone F” below.

Milestone F or later

You are doing well and will be able to complete the assignment as originally planned, with the following modifications:

Run the listening test online (e.g., implemented in Qualtrics) and be flexible about listener demographics (e.g., do not insist on only using native speakers)
Omit milestone “I – optional follow-on listening test”