The Centre for Speech Technology Research, The University of Edinburgh

Publications by Korin Richmond

[1] Rasmus Dall, Sandrine Brognaux, Korin Richmond, Cassia Valentini-Botinhao, Gustav Eje Henter, Julia Hirschberg, and Junichi Yamagishi. Testing the consistency assumption: pronunciation variant forced alignment in read and spontaneous speech synthesis. In Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 5155-5159, March 2016. [ bib | .pdf | Abstract ]
[2] Qiong Hu, Junichi Yamagishi, Korin Richmond, Kartick Subramanian, and Yannis Stylianou. Initial investigation of speech synthesis based on complex-valued neural networks. In Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 5630-5634, March 2016. [ bib | .pdf | Abstract ]
[3] Korin Richmond and Simon King. Smooth talking: Articulatory join costs for unit selection. In Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 5150-5154, March 2016. [ bib | .pdf | Abstract ]
[4] Qiong Hu, Zhizheng Wu, Korin Richmond, Junichi Yamagishi, Yannis Stylianou, and Ranniery Maia. Fusion of multiple parameterisations for DNN-based sinusoidal speech synthesis with multi-task learning. In Proc. Interspeech, Dresden, Germany, September 2015. [ bib | .pdf | Abstract ]
[5] Alexander Hewer, Ingmar Steiner, Timo Bolkart, Stefanie Wuhrer, and Korin Richmond. A statistical shape space model of the palate surface trained on 3D MRI scans of the vocal tract. In The Scottish Consortium for ICPhS 2015, editor, Proceedings of the 18th International Congress of Phonetic Sciences, Glasgow, United Kingdom, August 2015. Retrieved from http://www.icphs2015.info/pdfs/Papers/ICPHS0724.pdf. [ bib | .pdf | Abstract ]
[6] Qiong Hu, Yannis Stylianou, Ranniery Maia, Korin Richmond, and Junichi Yamagishi. Methods for applying dynamic sinusoidal models to statistical parametric speech synthesis. In Proc. ICASSP, Brisbane, Australia, April 2015. [ bib | .pdf | Abstract ]
[7] Alexander Hewer, Stefanie Wuhrer, Ingmar Steiner, and Korin Richmond. Tongue mesh extraction from 3D MRI data of the human vocal tract. In Michael Breuß, Alfred M. Bruckstein, Petros Maragos, and Stefanie Wuhrer, editors, Perspectives in Shape Analysis, Mathematics and Visualization. Springer, 2015. (in press). [ bib ]
[8] Korin Richmond, Zhen-Hua Ling, and Junichi Yamagishi. The use of articulatory movement data in speech synthesis applications: An overview - application of articulatory movements using machine learning algorithms [invited review]. Acoustical Science and Technology, 36(6):467-477, 2015. [ bib | DOI ]
[9] Korin Richmond, Junichi Yamagishi, and Zhen-Hua Ling. Applications of articulatory movements based on machine learning. Journal of the Acoustical Society of Japan, 70(10):539-545, 2015. [ bib ]
[10] Qiong Hu, Yannis Stylianou, Ranniery Maia, Korin Richmond, Junichi Yamagishi, and Javier Latorre. An investigation of the application of dynamic sinusoidal models to statistical parametric speech synthesis. In Proc. Interspeech, pages 780-784, Singapore, September 2014. [ bib | .pdf | Abstract ]
[11] Qiong Hu, Yannis Stylianou, Korin Richmond, Ranniery Maia, Junichi Yamagishi, and Javier Latorre. A fixed dimension and perceptually based dynamic sinusoidal model of speech. In Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 6311-6315, Florence, Italy, May 2014. [ bib | .pdf | Abstract ]
[12] J.P. Cabral, K. Richmond, J. Yamagishi, and S. Renals. Glottal spectral separation for speech synthesis. IEEE Journal of Selected Topics in Signal Processing, 8(2):195-208, April 2014. [ bib | DOI | .pdf | Abstract ]
[13] Maria Astrinaki, Alexis Moinet, Junichi Yamagishi, Korin Richmond, Zhen-Hua Ling, Simon King, and Thierry Dutoit. Mage - reactive articulatory feature control of HMM-based parametric speech synthesis. In 8th ISCA Workshop on Speech Synthesis, pages 227-231, Barcelona, Spain, August 2013. [ bib | .pdf ]
[14] Qiong Hu, Korin Richmond, Junichi Yamagishi, and Javier Latorre. An experimental comparison of multiple vocoder types. In 8th ISCA Workshop on Speech Synthesis, pages 155-160, Barcelona, Spain, August 2013. [ bib | .pdf | Abstract ]
[15] Korin Richmond, Zhen-Hua Ling, Junichi Yamagishi, and Benigno Uría. On the evaluation of inversion mapping performance in the acoustic domain. In Proc. Interspeech, pages 1012-1016, Lyon, France, August 2013. [ bib | .pdf | Abstract ]
[16] James Scobbie, Alice Turk, Christian Geng, Simon King, Robin Lickley, and Korin Richmond. The Edinburgh speech production facility DoubleTalk corpus. In Proc. Interspeech, Lyon, France, August 2013. [ bib | .pdf | Abstract ]
[17] Maria Astrinaki, Alexis Moinet, Junichi Yamagishi, Korin Richmond, Zhen-Hua Ling, Simon King, and Thierry Dutoit. Mage - HMM-based speech synthesis reactively controlled by the articulators. In 8th ISCA Workshop on Speech Synthesis, page 243, Barcelona, Spain, August 2013. [ bib | .pdf | Abstract ]
[18] Z. Ling, K. Richmond, and J. Yamagishi. Articulatory control of HMM-based parametric speech synthesis using feature-space-switched multiple regression. IEEE Transactions on Audio, Speech, and Language Processing, 21(1):207-219, 2013. [ bib | DOI | .pdf | Abstract ]
[19] Christian Geng, Alice Turk, James M. Scobbie, Cedric Macmartin, Philip Hoole, Korin Richmond, Alan Wrench, Marianne Pouplier, Ellen Gurman Bard, Ziggy Campbell, Catherine Dickie, Eddie Dubourg, William Hardcastle, Evia Kainada, Simon King, Robin Lickley, Satsuki Nakai, Steve Renals, Kevin White, and Ronny Wiegand. Recording speech articulation in dialogue: Evaluating a synchronized double electromagnetic articulography setup. Journal of Phonetics, 41(6):421-431, 2013. [ bib | DOI | http | .pdf | Abstract ]
[20] Ingmar Steiner, Korin Richmond, and Slim Ouni. Speech animation using electromagnetic articulography as motion capture data. In Proc. 12th International Conference on Auditory-Visual Speech Processing, pages 55-60, Annecy, France, 2013. [ bib | .pdf | Abstract ]
[21] Korin Richmond and Steve Renals. Ultrax: An animated midsagittal vocal tract display for speech therapy. In Proc. Interspeech, Portland, Oregon, USA, September 2012. [ bib | .pdf | Abstract ]
[22] Zhen-Hua Ling, Korin Richmond, and Junichi Yamagishi. Vowel creation by articulatory control in HMM-based parametric speech synthesis. In Proc. Interspeech, Portland, Oregon, USA, September 2012. [ bib | .pdf | Abstract ]
[23] Benigno Uria, Iain Murray, Steve Renals, and Korin Richmond. Deep architectures for articulatory inversion. In Proc. Interspeech, Portland, Oregon, USA, September 2012. [ bib | .pdf | Abstract ]
[24] Zhen-Hua Ling, Korin Richmond, and Junichi Yamagishi. Vowel creation by articulatory control in HMM-based parametric speech synthesis. In Proc. The Listening Talker Workshop, page 72, Edinburgh, UK, May 2012. [ bib | .pdf ]
[25] Ingmar Steiner, Korin Richmond, Ian Marshall, and Calum D. Gray. The magnetic resonance imaging subset of the mngu0 articulatory corpus. The Journal of the Acoustical Society of America, 131(2):EL106-EL111, January 2012. [ bib | DOI | .pdf | Abstract ]
[26] Ingmar Steiner, Korin Richmond, and Slim Ouni. Using multimodal speech production data to evaluate articulatory animation for audiovisual speech synthesis. In 3rd International Symposium on Facial Analysis and Animation, Vienna, Austria, 2012. [ bib | .pdf ]
[27] Benigno Uria, Steve Renals, and Korin Richmond. A deep neural network for acoustic-articulatory speech inversion. In Proc. NIPS 2011 Workshop on Deep Learning and Unsupervised Feature Learning, Sierra Nevada, Spain, December 2011. [ bib | .pdf | Abstract ]
[28] Korin Richmond, Phil Hoole, and Simon King. Announcing the electromagnetic articulography (day 1) subset of the mngu0 articulatory corpus. In Proc. Interspeech, pages 1505-1508, Florence, Italy, August 2011. [ bib | .pdf | Abstract ]
[29] Ming Lei, Junichi Yamagishi, Korin Richmond, Zhen-Hua Ling, Simon King, and Li-Rong Dai. Formant-controlled HMM-based speech synthesis. In Proc. Interspeech, pages 2777-2780, Florence, Italy, August 2011. [ bib | .pdf | Abstract ]
[30] Zhen-Hua Ling, Korin Richmond, and Junichi Yamagishi. Feature-space transform tying in unified acoustic-articulatory modelling of articulatory control of HMM-based speech synthesis. In Proc. Interspeech, pages 117-120, Florence, Italy, August 2011. [ bib | .pdf | Abstract ]
[31] J.P. Cabral, S. Renals, J. Yamagishi, and K. Richmond. HMM-based speech synthesiser using the LF-model of the glottal source. In Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 4704-4707, May 2011. [ bib | DOI | .pdf | Abstract ]
[32] Zhen-Hua Ling, Korin Richmond, and Junichi Yamagishi. An analysis of HMM-based prediction of articulatory movements. Speech Communication, 52(10):834-846, October 2010. [ bib | DOI | Abstract ]
[33] Zhen-Hua Ling, Korin Richmond, and Junichi Yamagishi. HMM-based text-to-articulatory-movement prediction and analysis of critical articulators. In Proc. Interspeech, pages 2194-2197, Makuhari, Japan, September 2010. [ bib | .pdf | Abstract ]
[34] Daniel Felps, Christian Geng, Michael Berger, Korin Richmond, and Ricardo Gutierrez-Osuna. Relying on critical articulators to estimate vocal tract spectra in an articulatory-acoustic database. In Proc. Interspeech, pages 1990-1993, September 2010. [ bib | .pdf | Abstract ]
[35] Korin Richmond, Robert Clark, and Sue Fitt. On generating Combilex pronunciations via morphological analysis. In Proc. Interspeech, pages 1974-1977, Makuhari, Japan, September 2010. [ bib | .pdf | Abstract ]
[36] João Cabral, Steve Renals, Korin Richmond, and Junichi Yamagishi. Transforming voice source parameters in a HMM-based speech synthesiser with glottal post-filtering. In Proc. 7th ISCA Speech Synthesis Workshop (SSW7), pages 365-370, NICT/ATR, Kyoto, Japan, September 2010. [ bib | .pdf | Abstract ]
[37] Gregor Hofer and Korin Richmond. Comparison of HMM and TMDN methods for lip synchronisation. In Proc. Interspeech, pages 454-457, Makuhari, Japan, September 2010. [ bib | .pdf | Abstract ]
[38] Alice Turk, James Scobbie, Christian Geng, Barry Campbell, Catherine Dickie, Eddie Dubourg, Ellen Gurman Bard, William Hardcastle, Mariam Hartinger, Simon King, Robin Lickley, Cedric Macmartin, Satsuki Nakai, Steve Renals, Korin Richmond, Sonja Schaeffler, Kevin White, Ronny Wiegand, and Alan Wrench. An Edinburgh speech production facility. Poster presented at the 12th Conference on Laboratory Phonology, Albuquerque, New Mexico., July 2010. [ bib | .pdf ]
[39] Gregor Hofer, Korin Richmond, and Michael Berger. Lip synchronization by acoustic inversion. Poster at Siggraph 2010, 2010. [ bib | .pdf ]
[40] Alice Turk, James Scobbie, Christian Geng, Cedric Macmartin, Ellen Bard, Barry Campbell, Catherine Dickie, Eddie Dubourg, Bill Hardcastle, Phil Hoole, Evia Kanaida, Robin Lickley, Satsuki Nakai, Marianne Pouplier, Simon King, Steve Renals, Korin Richmond, Sonja Schaeffler, Ronnie Wiegand, Kevin White, and Alan Wrench. The Edinburgh Speech Production Facility's articulatory corpus of spontaneous dialogue. The Journal of the Acoustical Society of America, 128(4):2429-2429, 2010. [ bib | DOI | Abstract ]
[41] K. Richmond. Preliminary inversion mapping results with a new EMA corpus. In Proc. Interspeech, pages 2835-2838, Brighton, UK, September 2009. [ bib | .pdf | Abstract ]
[42] K. Richmond, R. Clark, and S. Fitt. Robust LTS rules with the Combilex speech technology lexicon. In Proc. Interspeech, pages 1295-1298, Brighton, UK, September 2009. [ bib | .pdf | Abstract ]
[43] I. Steiner and K. Richmond. Towards unsupervised articulatory resynthesis of German utterances using EMA data. In Proc. Interspeech, pages 2055-2058, Brighton, UK, September 2009. [ bib | .pdf | Abstract ]
[44] Z. Ling, K. Richmond, J. Yamagishi, and R. Wang. Integrating articulatory features into HMM-based parametric speech synthesis. IEEE Transactions on Audio, Speech and Language Processing, 17(6):1171-1185, August 2009. IEEE SPS 2010 Young Author Best Paper Award. [ bib | DOI | Abstract ]
[45] J. Cabral, S. Renals, K. Richmond, and J. Yamagishi. HMM-based speech synthesis with an acoustic glottal source model. In Proc. The First Young Researchers Workshop in Speech Technology, April 2009. [ bib | .pdf | Abstract ]
[46] I. Steiner and K. Richmond. Generating gestural timing from EMA data using articulatory resynthesis. In Proc. 8th International Seminar on Speech Production, Strasbourg, France, December 2008. [ bib | Abstract ]
[47] Zhen-Hua Ling, Korin Richmond, Junichi Yamagishi, and Ren-Hua Wang. Articulatory control of HMM-based parametric speech synthesis driven by phonetic knowledge. In Proc. Interspeech, pages 573-576, Brisbane, Australia, September 2008. [ bib | .pdf | Abstract ]
[48] C. Qin, M. Carreira-Perpiñán, K. Richmond, A. Wrench, and S. Renals. Predicting tongue shapes from a few landmark locations. In Proc. Interspeech, pages 2306-2309, Brisbane, Australia, September 2008. [ bib | .pdf | Abstract ]
[49] J. Cabral, S. Renals, K. Richmond, and J. Yamagishi. Glottal spectral separation for parametric speech synthesis. In Proc. Interspeech, pages 1829-1832, Brisbane, Australia, September 2008. [ bib | .pdf | Abstract ]
[50] K. Richmond. Trajectory mixture density networks with multiple mixtures for acoustic-articulatory inversion. In M. Chetouani, A. Hussain, B. Gas, M. Milgram, and J.-L. Zarader, editors, Advances in Nonlinear Speech Processing, International Conference on Non-Linear Speech Processing, NOLISP 2007, volume 4885 of Lecture Notes in Computer Science, pages 263-272. Springer-Verlag Berlin Heidelberg, December 2007. [ bib | DOI | .pdf | Abstract ]
[51] K. Richmond. A multitask learning perspective on acoustic-articulatory inversion. In Proc. Interspeech, Antwerp, Belgium, August 2007. [ bib | .pdf | Abstract ]
[52] K. Richmond, V. Strom, R. Clark, J. Yamagishi, and S. Fitt. Festival multisyn voices for the 2007 Blizzard Challenge. In Proc. Blizzard Challenge Workshop (in Proc. SSW6), Bonn, Germany, August 2007. [ bib | .pdf | Abstract ]
[53] S. King, J. Frankel, K. Livescu, E. McDermott, K. Richmond, and M. Wester. Speech production knowledge in automatic speech recognition. Journal of the Acoustical Society of America, 121(2):723-742, February 2007. [ bib | .pdf | Abstract ]
[54] J. Cabral, S. Renals, K. Richmond, and J. Yamagishi. Towards an improved modeling of the glottal source in statistical parametric speech synthesis. In Proc. of the 6th ISCA Workshop on Speech Synthesis, Bonn, Germany, 2007. [ bib | .pdf | Abstract ]
[55] Robert A. J. Clark, Korin Richmond, and Simon King. Multisyn: Open-domain unit selection for the Festival speech synthesis system. Speech Communication, 49(4):317-330, 2007. [ bib | DOI | .pdf | Abstract ]
[56] Sue Fitt and Korin Richmond. Redundancy and productivity in the speech technology lexicon - can we do better? In Proc. Interspeech 2006, September 2006. [ bib | .pdf | Abstract ]
[57] R. Clark, K. Richmond, V. Strom, and S. King. Multisyn voices for the Blizzard Challenge 2006. In Proc. Blizzard Challenge Workshop (Interspeech Satellite), Pittsburgh, USA, September 2006. (http://festvox.org/blizzard/blizzard2006.html). [ bib | .pdf | Abstract ]
[58] K. Richmond. A trajectory mixture density network for the acoustic-articulatory inversion mapping. In Proc. Interspeech, Pittsburgh, USA, September 2006. [ bib | .pdf | Abstract ]
[59] Robert A.J. Clark, Korin Richmond, and Simon King. Multisyn voices from ARCTIC data for the Blizzard Challenge. In Proc. Interspeech 2005, September 2005. [ bib | .pdf | Abstract ]
[60] G. Hofer, K. Richmond, and R. Clark. Informed blending of databases for emotional speech synthesis. In Proc. Interspeech, September 2005. [ bib | .ps | .pdf | Abstract ]
[61] L. Onnis, P. Monaghan, K. Richmond, and N. Chater. Phonology impacts segmentation in speech processing. Journal of Memory and Language, 53(2):225-237, 2005. [ bib | .pdf | Abstract ]
[62] D. Toney, D. Feinberg, and K. Richmond. Acoustic features for profiling mobile users of conversational interfaces. In S. Brewster and M. Dunlop, editors, 6th International Symposium on Mobile Human-Computer Interaction - MobileHCI 2004, pages 394-398, Glasgow, Scotland, September 2004. Springer. [ bib | Abstract ]
[63] Robert A.J. Clark, Korin Richmond, and Simon King. Festival 2 - build your own general purpose unit selection speech synthesiser. In Proc. 5th ISCA workshop on speech synthesis, 2004. [ bib | .ps | .pdf | Abstract ]
[64] K. Richmond, S. King, and P. Taylor. Modelling the uncertainty in recovering articulation from acoustics. Computer Speech and Language, 17:153-172, 2003. [ bib | .pdf | Abstract ]
[65] K. Richmond. Estimating Articulatory Parameters from the Acoustic Speech Signal. PhD thesis, The Centre for Speech Technology Research, Edinburgh University, 2002. [ bib | .ps | Abstract ]
[66] K. Richmond. Mixture density networks, human articulatory data and acoustic-to-articulatory inversion of continuous speech. In Proc. Workshop on Innovation in Speech Processing, pages 259-276. Institute of Acoustics, April 2001. [ bib | .ps ]
[67] A. Wrench and K. Richmond. Continuous speech recognition using articulatory data. In Proc. ICSLP 2000, Beijing, China, 2000. [ bib | .ps | .pdf | Abstract ]
[68] J. Frankel, K. Richmond, S. King, and P. Taylor. An automatic speech recognition system using neural networks and linear dynamic models to recover and model articulatory traces. In Proc. ICSLP, 2000. [ bib | .ps | .pdf | Abstract ]
[69] S. King, P. Taylor, J. Frankel, and K. Richmond. Speech recognition via phonetically-featured syllables. In PHONUS, volume 5, pages 15-34, Institute of Phonetics, University of the Saarland, 2000. [ bib | .ps | .pdf | Abstract ]
[70] K. Richmond. Estimating velum height from acoustics during continuous speech. In Proc. Eurospeech, volume 1, pages 149-152, Budapest, Hungary, 1999. [ bib | .ps | .pdf | Abstract ]
[71] K. Richmond. A proposal for the compartmental modelling of stellate cells in the anteroventral cochlear nucleus, using realistic auditory nerve inputs. Master's thesis, Centre for Cognitive Science, University of Edinburgh, September 1997. [ bib ]
[72] K. Richmond, A. Smith, and E. Amitay. Detecting subject boundaries within text: A language-independent statistical approach. In Proc. The Second Conference on Empirical Methods in Natural Language Processing, pages 47-54, Brown University, Providence, USA, August 1997. [ bib | .ps | .pdf | Abstract ]