The Centre for Speech Technology Research, The University of Edinburgh

Publications by Zhizheng Wu

[1] Srikanth Ronanki, Gustav Eje Henter, Zhizheng Wu, and Simon King. A template-based approach for speech synthesis intonation generation using LSTMs. In Proc. Interspeech, San Francisco, USA, September 2016. [ bib | .pdf | Abstract ]
[2] Srikanth Ronanki, Zhizheng Wu, Oliver Watts, and Simon King. A Demonstration of the Merlin Open Source Neural Network Speech Synthesis System. In Proc. Speech Synthesis Workshop (SSW9), September 2016. [ bib | .pdf | Abstract ]
[3] Felipe Espic, Cassia Valentini-Botinhao, Zhizheng Wu, and Simon King. Waveform generation based on signal reshaping for statistical parametric speech synthesis. In Proc. Interspeech, pages 2263-2267, San Francisco, CA, USA, September 2016. [ bib | .PDF | Abstract ]
[4] Zhizheng Wu, Oliver Watts, and Simon King. Merlin: An open source neural network speech synthesis system. In 9th ISCA Speech Synthesis Workshop (2016), pages 218-223, September 2016. [ bib | .pdf | Abstract ]
[5] Gustav Eje Henter, Srikanth Ronanki, Oliver Watts, Mirjam Wester, Zhizheng Wu, and Simon King. Robust TTS duration modelling using DNNs. In Proc. ICASSP, volume 41, pages 5130-5134, Shanghai, China, March 2016. [ bib | http | .pdf | Abstract ]
[6] Oliver Watts, Gustav Eje Henter, Thomas Merritt, Zhizheng Wu, and Simon King. From HMMs to DNNs: where do the improvements come from? In Proc. ICASSP, volume 41, pages 5505-5509, Shanghai, China, March 2016. [ bib | http | .pdf | Abstract ]
[7] Tomoki Toda, Ling-Hui Chen, Daisuke Saito, Fernando Villavicencio, Mirjam Wester, Zhizheng Wu, and Junichi Yamagishi. The voice conversion challenge 2016. In Proc. Interspeech, 2016. [ bib | .pdf | Abstract ]
[8] Mirjam Wester, Zhizheng Wu, and Junichi Yamagishi. Analysis of the voice conversion challenge 2016 evaluation results. In Proc. Interspeech, 2016. [ bib | .pdf | Abstract ]
[9] Mirjam Wester, Zhizheng Wu, and Junichi Yamagishi. Multidimensional scaling of systems in the voice conversion challenge 2016. In Proc. Speech Synthesis Workshop 9, Sunnyvale, CA, 2016. [ bib | .pdf | Abstract ]
[10] Thomas Merritt, Robert A J Clark, Zhizheng Wu, Junichi Yamagishi, and Simon King. Deep neural network-guided unit selection synthesis. In Proc. ICASSP, 2016. [ bib | .pdf | Abstract ]
[11] Thomas Merritt, Srikanth Ronanki, Zhizheng Wu, and Oliver Watts. The CSTR entry to the Blizzard Challenge 2016. In Proc. Blizzard Challenge, 2016. [ bib | .pdf | Abstract ]
[12] C. Valentini-Botinhao, Z. Wu, and S. King. Towards minimum perceptual error training for DNN-based speech synthesis. In Proc. Interspeech, Dresden, Germany, September 2015. [ bib | .pdf | Abstract ]
[13] Thomas Merritt, Junichi Yamagishi, Zhizheng Wu, Oliver Watts, and Simon King. Deep neural network context embeddings for model selection in rich-context HMM synthesis. In Proc. Interspeech, Dresden, September 2015. [ bib | .pdf | Abstract ]
[14] Mirjam Wester, Zhizheng Wu, and Junichi Yamagishi. Human vs machine spoofing detection on wideband and narrowband data. In Proc. Interspeech, Dresden, September 2015. [ bib | .pdf | Abstract ]
[15] Qiong Hu, Zhizheng Wu, Korin Richmond, Junichi Yamagishi, Yannis Stylianou, and Ranniery Maia. Fusion of multiple parameterisations for DNN-based sinusoidal speech synthesis with multi-task learning. In Proc. Interspeech, Dresden, Germany, September 2015. [ bib | .pdf | Abstract ]
[16] Oliver Watts, Srikanth Ronanki, Zhizheng Wu, Tuomo Raitio, and Antti Suni. The NST-GlottHMM entry to the Blizzard Challenge 2015. In Proc. Blizzard Challenge Workshop (Interspeech Satellite), Berlin, Germany, September 2015. [ bib | .pdf | Abstract ]
[17] Oliver Watts, Srikanth Ronanki, Zhizheng Wu, Tuomo Raitio, and A. Suni. The NST-GlottHMM entry to the Blizzard Challenge 2015. In Proceedings of Blizzard Challenge 2015, September 2015. [ bib | .pdf | Abstract ]
[18] Oliver Watts, Zhizheng Wu, and Simon King. Sentence-level control vectors for deep neural network speech synthesis. In INTERSPEECH 2015 16th Annual Conference of the International Speech Communication Association, pages 2217-2221. International Speech Communication Association, September 2015. [ bib | .pdf | Abstract ]
[19] Z. Wu, C. Valentini-Botinhao, O. Watts, and S. King. Deep neural networks employing multi-task learning and stacked bottleneck features for speech synthesis. In Proc. ICASSP, pages 4460-4464, Brisbane, Australia, April 2015. [ bib | .pdf | Abstract ]
[20] Aleksandr Sizov, Elie Khoury, Tomi Kinnunen, Zhizheng Wu, and Sebastien Marcel. Joint speaker verification and antispoofing in the i-vector space. IEEE Transactions on Information Forensics and Security, 10(4):821-832, 2015. [ bib | .pdf ]
[21] Zhizheng Wu and Simon King. Minimum trajectory error training for deep neural networks, combined with stacked bottleneck features. In Interspeech, 2015. [ bib | .pdf ]
[22] Zhizheng Wu, Pawel Swietojanski, Christophe Veaux, Steve Renals, and Simon King. A study of speaker adaptation for DNN-based speech synthesis. In Interspeech, 2015. [ bib | .pdf ]
[23] Zhizheng Wu, Tomi Kinnunen, Nicholas Evans, Junichi Yamagishi, Cemal Hanilci, Md Sahidullah, and Aleksandr Sizov. ASVspoof 2015: the first automatic speaker verification spoofing and countermeasures challenge. In Interspeech, 2015. [ bib | .pdf ]
[24] Xiaohai Tian, Zhizheng Wu, Siu-Wa Lee, Quy Hy Nguyen, Minghui Dong, and Eng Siong Chng. System fusion for high-performance voice conversion. In Interspeech, 2015. [ bib | .pdf ]
[25] Zhizheng Wu, Cassia Valentini-Botinhao, Oliver Watts, and Simon King. Deep neural network employing multi-task learning and stacked bottleneck features for speech synthesis. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2015. [ bib | .pdf ]
[26] Zhizheng Wu, Ali Khodabakhsh, Cenk Demiroglu, Junichi Yamagishi, Daisuke Saito, Tomoki Toda, and Simon King. SAS: A speaker verification spoofing database containing diverse attacks. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2015. [ bib | .pdf ]
[27] Xiaohai Tian, Zhizheng Wu, Siu-Wa Lee, Quy Hy Nguyen, Eng Siong Chng, and Minghui Dong. Sparse representation for frequency warping based voice conversion. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2015. [ bib | .pdf ]
[28] Nicholas W D Evans, Tomi Kinnunen, Junichi Yamagishi, Zhizheng Wu, Federico Alegre, and Phillip De Leon. Speaker recognition anti-spoofing. Book chapter in "Handbook of Biometric Anti-spoofing", S. Marcel, S. Li and M. Nixon, Eds., Springer, June 2014. [ bib | DOI | .pdf | Abstract ]