Hiroshi Shimodaira and Mitsuru Nakai. Robust pitch detection by narrow band spectrum analysis. In Proc. ICSLP-92, pages 1597-1600, October 1992. [ bib | .pdf ]

This paper proposes a new technique for detecting pitch patterns, useful for automatic speech recognition, based on narrow band spectrum analysis. The motivation for this approach is that humans perceive some kind of pitch in whispers, where no fundamental frequency can be observed, while most pitch determination algorithms (PDAs) fail to detect such perceptual pitch. Narrow band spectrum analysis enables us to find pitch structure distributed locally in the frequency domain. The technique is incorporated into PDAs by applying it to the lag window based PDA. Experimental results show that pitch detection performance could be improved by 4% for voiced sounds and 8% for voiceless sounds.

P. C. Bagshaw. Criteria for labelling prosodic aspects of English speech. In Proc. 4th Australian International Conference on Speech Science and Technology, Brisbane, Australia, 1992. [ bib | .ps | .pdf ]

James L. Hieronymus and Briony J. Williams. A comparison of the prosody in read speech and directed monologue in British English. In Proceedings of the ESCA Workshop on the Phonetics and Phonology of Speaking Styles, Barcelona, Spain, 1992. [ bib ]

H. Bourlard, N. Morgan, and S. Renals. Neural nets and hidden Markov models: Review and generalizations. Speech Communication, 11:237-246, 1992. [ bib ]

S. Renals, N. Morgan, M. Cohen, and H. Franco. Connectionist probability estimation in the Decipher speech recognition system. In Proc. IEEE ICASSP, pages 601-604, San Francisco, 1992. [ bib | .ps.gz ]

Alan A. Wrench, L. Laver, M. A. Jack, A. G. Robertson, D. S. Soutar, and J. Mackenzie Beck. Objective speech quality assessment in patients with intra-oral cancers: Voiceless fricative. In International Conference on Spoken Language Processing, volume 2, pages 1071-1074, Banff, Canada, 1992. [ bib ]

S. Renals, H. Bourlard, N. Morgan, H. Franco, and M. Cohen. Connectionist optimisation of tied mixture hidden Markov models. In J. E. Moody, S. J. Hanson, and R. P. Lippmann, editors, Advances in Neural Information Processing Systems, volume 4, pages 167-174. Morgan Kaufmann, 1992. [ bib ]

Briony J. Williams. The design of a speech database for Welsh diphone extraction. In Proc. Institute of Acoustics, volume 14, 1992. [ bib ]

Paul A. Taylor. A phonetic model of English intonation. PhD thesis, University of Edinburgh, 1992. [ bib | .ps | .pdf ]

Paul A. Taylor and S. D. Isard. A new model of intonation for use with speech recognition and synthesis. In International Conference on Spoken Language Processing, Banff, Canada, 1992. [ bib | .ps | .pdf ]

Alan W. Black. A Situation Theoretic Approach to Computational Semantics. PhD thesis, University of Edinburgh, 1992. [ bib | .ps | .pdf ]

S. Renals, N. Morgan, M. Cohen, H. Franco, and H. Bourlard. Improving statistical speech recognition. In Proc. IJCNN, volume 2, pages 301-307, Baltimore, MD, 1992. [ bib | .ps.gz ]

P. C. Bagshaw and Briony J. Williams. Criteria for labelling prosodic aspects of English speech. In Proc. International Conference on Spoken Language Processing, volume 2, pages 859-862, Banff, Canada, 1992. [ bib | .ps | .pdf ]

H. Bourlard, N. Morgan, C. Wooters, and S. Renals. CDNN: A context-dependent neural network for continuous speech recognition. In Proc. IEEE ICASSP, pages 349-352, San Francisco, 1992. [ bib ]

Briony J. Williams. Welsh letter-to-sound rules for text-to-speech synthesis. In Proc. Institute of Acoustics, volume 14, 1992. [ bib ]

Mark E. Forsyth, A. M. Sutherland, J. A. Elliott, and M. A. Jack. HMM speaker verification with sparse training data on telephone quality speech. In Proceedings of the Fourth Australian International Conference on Speech Science and Technology, pages 67-72, Brisbane, Australia, 1992. [ bib ]