The Centre for Speech Technology Research, The university of Edinburgh

Publications by Rob Clark

[1] Jean-Philippe Goldman, Pierre-Edouard Honnet, Rob Clark, Philip N Garner, Maria Ivanova, Alexandros Lazaridis, Hui Liang, Tiago Macedo, Beat Pfister, Manuel Sam Ribeiro, et al. The SIWIS database: a multilingual speech database with acted emphasis. In Proceedings of Interspeech, San Francisco, United States, September 2016. [ bib | .PDF | Abstract ]
[2] Manuel Sam Ribeiro, Oliver Watts, Junichi Yamagishi, and Robert A. J. Clark. Wavelet-based decomposition of f0 as a secondary task for DNN-based speech synthesis with multi-task learning. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Shanghai, China, March 2016. [ bib | .pdf | Abstract ]
[3] Adriana Stan, Yoshitaka Mamiya, Junichi Yamagishi, Peter Bell, Oliver Watts, Rob Clark, and Simon King. ALISA: An automatic lightly supervised speech segmentation and alignment tool. Computer Speech and Language, 35:116-133, 2016. [ bib | DOI | http | .pdf | Abstract ]
[4] Thomas Merritt, Robert A J Clark, Zhizheng Wu, Junichi Yamagishi, and Simon King. Deep neural network-guided unit selection synthesis. In Proc. ICASSP, 2016. [ bib | .pdf | Abstract ]
[5] Manuel Sam Ribeiro, Junichi Yamagishi, and Robert A. J. Clark. A perceptual investigation of wavelet-based decomposition of f0 for text-to-speech synthesis. In Proc. Interspeech, Dresden, Germany, September 2015. [ bib | .pdf | Abstract ]
[6] Manuel Sam Ribeiro and Robert A. J. Clark. A multi-level representation of f0 using the continuous wavelet transform and the discrete cosine transform. In IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP, Brisbane, Australia, April 2015. [ bib | .pdf | Abstract ]
[7] Wei Zhang, Robert A. J. Clark, and Yongyuan Wang. Unsupervised language filtering using the latent Dirichlet allocation. In Proc. Interspeech, pages 1268-1272, September 2014. [ bib | .pdf | Abstract ]
[8] Susana Palmaz López-Peláez and Robert A. J. Clark. Speech synthesis reactive to dynamic noise environmental conditions. In Proc. Interspeech, pages 2927-2931, September 2014. [ bib | .pdf | Abstract ]
[9] Philip N Garner, Rob Clark, Jean-Philippe Goldman, Pierre-Edouard Honnet, Maria Ivanova, Alexandros Lazaridis, Hui Liang, Beat Pfister, Manuel Sam Ribeiro, Eric Wehrli, et al. Translation and prosody in swiss languages. In Nouveaux cahiers de linguistique francaise, 31. 3rd Swiss Workshop on Prosody, Geneva, Switzerland, September 2014. [ bib | .pdf | Abstract ]
[10] David Abelman and Robert Clark. Altering speech synthesis prosody through real time natural gestural control. In Proc. Speech Prosody 2014, Dublin Ireland, 2014. [ bib | .pdf | Abstract ]
[11] Yoshitaka Mamiya, Adriana Stan, Junichi Yamagishi, Peter Bell, Oliver Watts, Robert Clark, and Simon King. Using adaptation to improve speech transcription alignment in noisy and reverberant environments. In 8th ISCA Workshop on Speech Synthesis, pages 61-66, Barcelona, Spain, August 2013. [ bib | .pdf | Abstract ]
[12] Oliver Watts, Adriana Stan, Rob Clark, Yoshitaka Mamiya, Mircea Giurgiu, Junichi Yamagishi, and Simon King. Unsupervised and lightly-supervised learning for rapid construction of TTS systems in multiple languages from 'found' data: evaluation and analysis. In 8th ISCA Workshop on Speech Synthesis, pages 121-126, Barcelona, Spain, August 2013. [ bib | .pdf | Abstract ]
[13] Adriana Stan, Oliver Watts, Yoshitaka Mamiya, Mircea Giurgiu, Rob Clark, Junichi Yamagishi, and Simon King. TUNDRA: A Multilingual Corpus of Found Data for TTS Research Created with Light Supervision. In Proc. Interspeech, Lyon, France, August 2013. [ bib | .pdf | Abstract ]
[14] Àngel Calzada Defez, Joan Claudi Socoró Carrié, and Robert Clark. Parametric model for vocal effort interpolation with harmonics plus noise models. In Proc. 8th ISCA Speech Synthesis Workshop, pages 25-30, 2013. [ bib | .pdf | Abstract ]
[15] Yoshitaka Mamiya, Junichi Yamagishi, Oliver Watts, Robert A.J. Clark, Simon King, and Adriana Stan. Lightly supervised gmm vad to use audiobook for speech synthesiser. In Proc. ICASSP, 2013. [ bib | .pdf | Abstract ]
[16] Catherine Mayo, Fiona Gibbon, and Robert A. J. Clark. Phonetically trained and untrained adults' transcription of place of articulation for intervocalic lingual stops with intermediate acoustic cues. Journal of Speech, Language and Hearing Research, 56:779-791, 2013. [ bib | DOI | Abstract ]
[17] Sebastian Andersson, Junichi Yamagishi, and Robert A.J. Clark. Synthesis and evaluation of conversational characteristics in HMM-based speech synthesis. Speech Communication, 54(2):175-188, 2012. [ bib | DOI | http | Abstract ]
[18] S. Andersson, J. Yamagishi, and R.A.J. Clark. Synthesis and evaluation of conversational characteristics in HMM-based speech synthesis. Speech Communication, 54(2):175-188, 2012. [ bib | DOI | Abstract ]
[19] Leonardo Badino, Robert A.J. Clark, and Mirjam Wester. Towards hierarchical prosodic prominence generation in TTS synthesis. In Proc. Interspeech, Portland, USA, 2012. [ bib | .pdf ]
[20] Anna C. Janska, Erich Schröger, Thomas Jacobsen, and Robert A. J. Clark. Asymmetries in the perception of synthesized speech. In Proc. Interspeech, Portland, USA, 2012. [ bib | .pdf ]
[21] A. G. Pipe, R. Vaidyanathan, C. Melhuish, P. Bremner, P. Robinson, R. A. J. Clark, A. Lenz, K. Eder, N. Hawes, Z. Ghahramani, M. Fraser, M. Mermehdi, P. Healey, and S. Skachek. Affective robotics: Human motion and behavioural inspiration for cooperation between humans and assistive robots. In Yoseph Bar-Cohen, editor, Biomimetics: Nature-Based Innovation, chapter 15. Taylor and Francis, 2011. [ bib ]
[22] C. Mayo, R. A. J. Clark, and S. King. Listeners' weighting of acoustic cues to synthetic speech naturalness: A multidimensional scaling analysis. Speech Communication, 53(3):311-326, 2011. [ bib | DOI | Abstract ]
[23] Korin Richmond, Robert Clark, and Sue Fitt. On generating Combilex pronunciations via morphological analysis. In Proc. Interspeech, pages 1974-1977, Makuhari, Japan, September 2010. [ bib | .pdf | Abstract ]
[24] Sebastian Andersson, Junichi Yamagishi, and Robert Clark. Utilising spontaneous conversational speech in HMM-based speech synthesis. In The 7th ISCA Tutorial and Research Workshop on Speech Synthesis, September 2010. [ bib | .pdf | Abstract ]
[25] Sebastian Andersson, Kallirroi Georgila, David Traum, Matthew Aylett, and Robert Clark. Prediction and realisation of conversational characteristics by utilising spontaneous speech for unit selection. In Speech Prosody 2010, May 2010. [ bib | .pdf | Abstract ]
[26] Anna C. Janska and Robert A. J. Clark. Native and non-native speaker judgements on the quality of synthesized speech. In Proc. Interspeech, pages 1121-1124, 2010. [ bib | .pdf | Abstract ]
[27] Michael White, Robert A. J. Clark, and Johanna D. Moore. Generating tailored, comparative descriptions with contextually appropriate intonation. Computational Linguistics, 36(2):159-201, 2010. [ bib | DOI | Abstract ]
[28] Anna C. Janska and Robert A. J. Clark. Further exploration of the possibilities and pitfalls of multidimensional scaling as a tool for the evaluation of the quality of synthesized speech. In The 7th ISCA Tutorial and Research Workshop on Speech Synthesis, pages 142-147, 2010. [ bib | .pdf | Abstract ]
[29] J. Sebastian Andersson, Joao P. Cabral, Leonardo Badino, Junichi Yamagishi, and Robert A.J. Clark. Glottal source and prosodic prominence modelling in HMM-based speech synthesis for the Blizzard Challenge 2009. In The Blizzard Challenge 2009, Edinburgh, U.K., September 2009. [ bib | .pdf | Abstract ]
[30] Leonardo Badino, J. Sebastian Andersson, Junichi Yamagishi, and Robert A.J. Clark. Identification of contrast and its emphatic realization in HMM-based speech synthesis. In Proc. Interspeech 2009, Brighton, U.K., September 2009. [ bib | .PDF | Abstract ]
[31] K. Richmond, R. Clark, and S. Fitt. Robust LTS rules with the Combilex speech technology lexicon. In Proc. Interspeech, pages 1295-1298, Brighton, UK, September 2009. [ bib | .pdf | Abstract ]
[32] Vasilis Karaiskos, Simon King, Robert A. J. Clark, and Catherine Mayo. The blizzard challenge 2008. In Proc. Blizzard Challenge Workshop, Brisbane, Australia, September 2008. [ bib | .pdf | Abstract ]
[33] Leonardo Badino, Robert A.J. Clark, and Volker Strom. Including pitch accent optionality in unit selection text-to-speech synthesis. In Proc. Interspeech, Brisbane, 2008. [ bib | .ps | .pdf | Abstract ]
[34] Maggie Morgan, Marilyn R. McGee-Lennon, Nick Hine, John Arnott, Chris Martin, Julia S. Clark, and Maria Wolters. Requirements gathering with diverse user groups and stakeholders. In Proc. 26th Conference on Computer-Human Interaction, Florence, 2008. [ bib ]
[35] Leonardo Badino and Robert A.J. Clark. Automatic labeling of contrastive word pairs from spontaneous spoken english. In in 2008 IEEE/ACL Workshop on Spoken Language Technology, Goa, India, 2008. [ bib | .pdf | Abstract ]
[36] Robert A. J. Clark, Monika Podsiadlo, Mark Fraser, Catherine Mayo, and Simon King. Statistical analysis of the Blizzard Challenge 2007 listening test results. In Proc. Blizzard 2007 (in Proc. Sixth ISCA Workshop on Speech Synthesis), Bonn, Germany, August 2007. [ bib | .pdf | Abstract ]
[37] Volker Strom, Ani Nenkova, Robert Clark, Yolanda Vazquez-Alvarez, Jason Brenier, Simon King, and Dan Jurafsky. Modelling prominence and emphasis improves unit-selection synthesis. In Proc. Interspeech 2007, Antwerp, Belgium, August 2007. [ bib | .pdf | Abstract ]
[38] K. Richmond, V. Strom, R. Clark, J. Yamagishi, and S. Fitt. Festival multisyn voices for the 2007 blizzard challenge. In Proc. Blizzard Challenge Workshop (in Proc. SSW6), Bonn, Germany, August 2007. [ bib | .pdf | Abstract ]
[39] Leonardo Badino and Robert A.J. Clark. Issues of optionality in pitch accent placement. In Proc. 6th ISCA Speech Synthesis Workshop, Bonn, Germany, 2007. [ bib | .pdf | Abstract ]
[40] Robert A. J. Clark, Korin Richmond, and Simon King. Multisyn: Open-domain unit selection for the Festival speech synthesis system. Speech Communication, 49(4):317-330, 2007. [ bib | DOI | .pdf | Abstract ]
[41] R. Clark, K. Richmond, V. Strom, and S. King. Multisyn voices for the Blizzard Challenge 2006. In Proc. Blizzard Challenge Workshop (Interspeech Satellite), Pittsburgh, USA, September 2006. (http://festvox.org/blizzard/blizzard2006.html). [ bib | .pdf | Abstract ]
[42] Robert A. J. Clark and Simon King. Joint prosodic and segmental unit selection speech synthesis. In Proc. Interspeech 2006, Pittsburgh, USA, September 2006. [ bib | .ps | .pdf | Abstract ]
[43] Volker Strom, Robert Clark, and Simon King. Expressive prosody for unit-selection speech synthesis. In Proc. Interspeech, Pittsburgh, 2006. [ bib | .ps | .pdf | Abstract ]
[44] Robert A.J. Clark, Korin Richmond, and Simon King. Multisyn voices from ARCTIC data for the Blizzard challenge. In Proc. Interspeech 2005, September 2005. [ bib | .pdf | Abstract ]
[45] C. Mayo, R. A. J. Clark, and S. King. Multidimensional scaling of listener responses to synthetic speech. In Proc. Interspeech 2005, Lisbon, Portugal, September 2005. [ bib | .pdf ]
[46] G. Hofer, K. Richmond, and R. Clark. Informed blending of databases for emotional speech synthesis. In Proc. Interspeech, September 2005. [ bib | .ps | .pdf | Abstract ]
[47] Dominika Oliver and Robert A. J. Clark. Modelling pitch accent types for Polish speech synthesis. In Proc. Interspeech 2005, 2005. [ bib | .pdf ]
[48] Robert A.J. Clark, Korin Richmond, and Simon King. Festival 2 - build your own general purpose unit selection speech synthesiser. In Proc. 5th ISCA workshop on speech synthesis, 2004. [ bib | .ps | .pdf | Abstract ]
[49] Rachel Baker, Robert A.J. Clark, and Michael White. Synthesising contextually appropriate intonation in limited domains. In Proc. 5th ISCA workshop on speech synthesis, Pittsburgh, USA, 2004. [ bib | .ps | .pdf ]
[50] Robert A. J. Clark. Generating Synthetic Pitch Contours Using Prosodic Structure. PhD thesis, The University of Edinburgh, 2003. [ bib | .ps.gz | .pdf ]
[51] Robert A. J. Clark. Modelling pitch accents for concept-to-speech synthesis. In Proc. XVth International Congress of Phonetic Sciences, volume 2, pages 1141-1144, 2003. [ bib | .ps | .pdf ]
[52] Robert A. J. Clark. Using prosodic structure to improve pitch range variation in text to speech synthesis. In Proc. XIVth international congress of phonetic sciences, volume 1, pages 69-72, 1999. [ bib | .ps | .pdf ]
[53] Robert. A. J. Clark and Kurt E. Dusterhoff. Objective methods for evaluating synthetic intonation. In Proc. Eurospeech 1999, volume 4, pages 1623-1626, 1999. [ bib | .ps | .pdf ]
[54] Robert A. J. Clark. Language acquisition and implication for language change: A computational model. In Proceedings of the GALA 97 Conference on Language Acquisition, pages 322-326, 1997. [ bib | .ps | .pdf ]
[55] Robert A.J. Clark. Internal and external factors affecting language change: A computational model. Master's thesis, University of Edinburgh, 1996. [ bib | .ps | .pdf ]