GSLT: Speech Technology 1

List of literature

Fall semester 2002

Literature

Acoustic and Auditory Phonetics, Keith Johnson, ISBN# 0-631-20094-0

An Introduction to Text-To-Speech Synthesis, Thierry Dutoit, ISBN# 0-7923-4498-7

Speech and Audio Signal Processing: Processing and Perception of Speech and Music, Ben Gold & Nelson Morgan, ISBN# 0-471-35154-7

Michael F McTear (2001) Spoken dialogue technology: enabling the conversational user interface. Submitted to ACM Computing Surveys.
http://www.infj.ulst.ac.uk/~cbdg23/interests.html

Acoustic Phonetics

Liljencrants, J.: "Speech signal processing", in W Hardcastle & J Laver (editors) The Handbook of Phonetic Sciences, Blackwell Publishers Ltd, Oxford 1997, 697-720

Yates, G.: "The ear as an acoustical transducer", Acoustics Australia, Vol. 21 1993, pp. 77-81

Lieberman, P., Blumstein, S. (1988): parts of chapter 7 in Speech physiology, speech perception, and acoustic phonetics, Cambridge University Press, pp. 148-161

Bruce, G., B. Granström, K. Gustafson, M. Horne, D. House, and P. Touati. 1997. ‘On the analysis of prosody in interaction.’ In Y. Sagisaka, N. Campbell and N. Higuchi (eds.) Computing Prosody: Computational Models for Processing Spontaneous Speech, 43-59, Springer-Verlag, New York.

Speech Synthesis

Carlson R., Granström B.: "Speech Synthesis", in Hardcastle & Laver (editors) The Handbook of Phonetic Sciences, Blackwell Publishers Ltd, Oxford 1997, 768-788

Granström, B.: "Multi-modal speech synthesis with applications", in G. Chollet, M. G. Di Benedetto, A. Esposito, M. Marinaro (Eds) Speech Processing, Recognition and Artificial Neural Networks, Proceedings of the 3rd International School on Neural Nets "Eduardo R. Caianiello", Springer London 1999, pp. 327-346

Klatt D.: "Review of text-to-speech conversion for English", Journal of the Acoustical Society of America, Vol. 82, pp. 737-793, 1987

van Santen, J., When will synthetic speech sound human: Role of rules and data, In Proc of ICSLP 2000, Beijing

Möbius Bernd (2000). Corpus-based speech synthesis: methods and challenges. Arbeitspapiere des Instituts für Maschinelle Sprachverarbeitung (Univ. Stuttgart), AIMS 6 (4), 87-116
Available at http://www.ims.uni-stuttgart.de/~moebius/pubs.shtml

A. W. Black, P. Taylor, and R. Caley. The Festival Speech Synthesis System, 1998. Available at http://www.cstr.ed.ac.uk/projects/festival/

Corpus-Based Techniques in the AT&T NextGen Synthesis System
http://www.research.att.com/projects/tts/pubs.html

Synthesis examples:
http://www.ims.uni-stuttgart.de/~moehler/synthspeech/examples.html
http://www.naturalvoices.att.com/

Speech Recognition

Mats Blomberg and Kjell Elenius (1997) Automatisk igenkänning av tal [Automatic speech recognition; in Swedish]

Lawrence R. Rabiner (1989) A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition, Proc of the IEEE, vol 77, no. 2, pp. 257-286.

F. Jelinek (1991) Up from trigrams! The struggle for improved language models, Eurospeech 91, Genoa, vol. 3, pp. 1037-40.

R.P. Lippmann (1997) Speech recognition by machines and humans, Speech Communication vol 22 no 1, pp 1-15

M Mohri, F Pereira, M Riley (2000) Weighted finite state transducers in speech recognition, ISCA ITRW ASR2000, Paris

Speaker Verification

Gish, H. and Schmidt, M.: "Text-independent speaker identification", IEEE Signal Processing Magazine Oct. 94, pp. 18-32

Melin, H.: "Speaker verification in telecommunication", contribution to Talteknologidagen (Speech Technology Day), KTH, 7 Nov 1996

G. Doddington: Speaker Recognition Evaluation Methodology - An Overview and Perspective, Proc. RLA2C, Avignon 1998

S. Furui: Recent Advances in Speaker Recognition, Proc. AVBPA 1997, pp 237-252.

C. Champod & D. Meuwly: The Inference of Identity in Forensic Speaker Recognition, Proc. RLA2C, Avignon 1998

Dialog Systems

Carlson, R., Granström, B.: "The WAXHOLM spoken dialogue system", Acta universitatis Carolinae philologica 1 1996, pp. 39-52

Gustafson, J., Lindberg, N., Lundeberg, M.: "The August spoken dialogue system", Proc. Eurospeech'99, Budapest, Hungary, 1999

Gustafson, J, Bell, L, Beskow, J, Boye, J, Carlson, R, Edlund, J, Granström, B, House, D & Wirén M (2000) AdApt - a multimodal conversational dialogue system in an apartment domain, In Proc of ICSLP 2000, Beijing, 2:134-137 http://www.speech.kth.se/ctt/publications/papers/icslp00_adapt.htm

J. Glass, "Challenges for Spoken Dialogue Systems," Proc. 1999 IEEE ASRU Workshop, Keystone, CO, December 1999. http://www.sls.lcs.mit.edu/sls/publications/index.shtml - Papers

Harald Aust, Martin Oerder, Frank Seide, Volker Steinbiss: The Philips automatic train timetable information system, Speech Communication 17 (1995)

Marilyn A. Walker, Candace A. Kamm, and Diane J. Litman. Towards Developing General Models of Usability with PARADISE. In Natural Language Engineering, to appear.

Chu-Carroll, J., "MIMIC: An adaptive mixed initiative spoken dialogue system for information queries," in Proceedings of the 6th ACL Conference on Applied Language Processing, (Seattle, WA, USA), May 2000.
http://acl.ldc.upenn.edu/A/A00/a00-1014.pdf

Reference Literature (Not part of the course)

Acoustic phonetics, Kenneth N. Stevens. ISBN 0-262-19404-X

Allmän och svensk fonetik. Norstedts. Elert, Claes-Christian. 1995.

Handbook of Phonetic Sciences (Ed WJ Hardcastle and J Laver) Blackwell, Oxford [ISBN 0-631-18848-7]

Speech Synthesis and Recognition, by John Holmes and Wendy Holmes ISBN 0-7484-0856-8 (hardback) ISBN 0-7484-0857-6 (paperback)

Lawrence Rabiner & Biing-Hwang Juang, Fundamentals of Speech Recognition

Spoken Language Processing: A Guide to Theory, Algorithm and System Development, Xuedong Huang, Alex Acero & Hsiao-Wuen Hon, ISBN: 0-13-022616-5

Survey of the State of the Art in Human Language Technology http://cslu.cse.ogi.edu/HLTsurvey/HLTsurvey.html