Select one of the topics below and write a short critical review of the two papers for that topic, comparing and contrasting the different methods and outcomes discussed. The papers are varied: some are major works that have influenced the field, others report work-in-progress. Some are good, others not so good – you decide!
Writing tips
Don’t quote excessively from the papers – quotes should only be used when you need to show the reader the exact wording used by the author (e.g., so you can make a comment about it). It is much better to summarise the content in your own words (whilst citing the original sufficiently to make it clear who the ideas belong to). You won’t get marks for using jargon that you don’t understand. You will get marks for clear summaries and showing that you understood what you read. Relating what you read back to your experience with Festival and the material in the Speech Processing course would be a good idea.
It is not essential to discuss papers other than those provided here, but you may do so if you choose, and you may obtain a higher mark if you do this well. You will get more marks for a deep review of a few papers than for a shallow review of many papers.
If you find papers on the Web, you must ascertain their original method of publication and give that in the bibliography – do not cite by URL. You can of course also cite textbooks, lecture notes and slides where appropriate. Be very wary of citing material which has not been formally peer reviewed (e.g., Wikipedia or arXiv).
Do not cut and paste material from papers or webpages into your report – we have several ways of detecting this (even if you modify the text to disguise it!) and plagiarism will be dealt with severely.
Topic 1: Prosody and Intonation
- Ann Syrdal, Gregor Möhler, Kurt Dusterhoff, Alistair Conkie and Alan W Black, “Three Methods of Intonation Modeling”, in Proc. 3rd ESCA Workshop on Speech Synthesis, pages 305-310, Jenolan Caves, Australia, Nov 1998.
- Cameron S. Fordyce and Mari Ostendorf, “Prosody Prediction for Speech Synthesis using Transformational Rule-based Learning”, in Proc. Int. Conf. on Spoken Language Processing (ICSLP), volume 3, pages 843-846, Sydney, Australia, Nov-Dec 1998.
Topic 2: Waveform generation
- Robert E. Donovan and Ellen M. Eide, “The IBM Trainable Speech Synthesis System”, in Proc. Int. Conf. on Spoken Language Processing (ICSLP), volume 5, pages 1703–1706, Sydney, Australia, Nov-Dec 1998.
- “When a sailor in a small craft faces the might of the vast Atlantic Ocean today, he takes the same risks as generations took before him.”
- Andrew J. Hunt and Alan W. Black, “Unit Selection in a Concatenative Speech Synthesis System using a Large Speech Database”, in Proc. Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), volume 1, pages 373-376, Atlanta, Georgia, USA, May 1996. DOI: 10.1109/ICASSP.1996.541110
- speaker f2b: “About ten thousand individuals and several businesses have made contributions.”
- speaker f2b: “Legal counsel for the NAACP’s Boston Branch.”
- speaker FKN (B08): “kouiu e ga, yasuyasu to kakeru hazu wa nai.”
- speaker MHO (B04): “wazuka na shuunyuu o yarikuri shite, genkin de, saabisu o riyou shite iru.”
- speaker MHO (D07): “minna, fuku ya nekutai no iro wa, yoku oboete iru.”
- speaker MTK (E02): “naegi eno aijou wa, kaette fukamatte ikuyoudatta”
Topic 3: Evaluation
- M. Edgington, “Investigating the Limitations of Concatenative Synthesis” in Proc. Eurospeech, volume 2, pages 593-596, Rhodes, Greece, Sep 1997.
- a0743s01
- a0743s02
- a0743s03
- a0743s04
- a0743s05
- Christian Benoît, Martine Grice and Valérie Hazan, “The SUS test: a method for the assessment of text-to-speech synthesis intelligibility using Semantically Unpredictable Sentences”, Speech Communication, volume 18, pages 381-392, 1996. DOI: 10.1016/0167-6393(96)00026-X. This journal is behind a paywall, so you may need to access it through the electronic journals section of your library website.
In case you missed it: just choose ONE of the three topics for your review! Do not review all three topics!