Fall semester 2005
Nordic Graduate School of Language Technology
A selection of papers and other publications will be used as additional reading material for each subtopic.
Most papers can be found on the web. Some papers will be distributed.
Acoustic Phonetics
Liljencrants, J.: "Speech signal processing," in W. Hardcastle & J. Laver (editors), The Handbook of Phonetic Sciences, Blackwell Publishers Ltd, Oxford 1997, pp. 697-720 (Will be distributed)
Yates, G.: "The ear as an acoustical transducer", Acoustics Australia, Vol. 21, 1993, pp. 77-81 (Will be distributed)
Lieberman, P., Blumstein, S. (1988): parts of chapter 7, Speech physiology, speech perception, and acoustic phonetics, Cambridge University Press, pp. 148-161 (Will be distributed)
Bruce, G., B. Granström, K. Gustafson, M. Horne, D. House, and P. Touati. 1997. ‘On the analysis of prosody in interaction.’ In Y. Sagisaka,
Hirschberg, J.: Communication and prosody: Functional aspects of prosody, Speech Communication, Volume 36, Issues 1-2, January 2002, pp. 31-43 (Will be distributed)
Speech Synthesis
Carlson, R., Granström, B.: "Speech Synthesis", in Hardcastle & Laver (editors), The Handbook of Phonetic Sciences, Blackwell Publishers Ltd,
Granström, B.: "Multi-modal speech synthesis with applications", in G. Chollet, M. G. Di Benedetto, A. Esposito, M. Marinaro (Eds), Speech Processing, Recognition and Artificial Neural Networks, Proceedings of the 3rd International School on Neural Nets "Eduardo R. Caianiello", Springer, London 1999, pp. 327-346 (Will be distributed)
Klatt, D.: "Review of text-to-speech conversion for English", Journal of the Acoustical Society of America, Vol. 82, pp. 737-793, 1987 (Will be distributed)
van Santen, J.: When will synthetic speech sound human: Role of rules and data, in Proc. of ICSLP 2000,
W. Black, P. Taylor, and R. Caley: The Festival Speech Synthesis System, 1998. http://www.cstr.ed.ac.uk/projects/festival/
Corpus-Based Techniques in the AT&T NextGen Synthesis System, ICSLP 2000, http://www.research.att.com/projects/tts/pubs.html
Synthesis examples:
http://www.ims.uni-stuttgart.de/~moehler/synthspeech/examples.html
http://www.naturalvoices.att.com/
Speech Recognition
Lawrence R. Rabiner (1989): A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition, Proceedings of the IEEE, vol. 77, no. 2, pp. 257-286. http://www.caip.rutgers.edu/%7Elrr/Reprints/tutorial on hmm and applications.pdf
S. Young (1996): "Large Vocabulary Continuous Speech Recognition", IEEE Signal Processing Magazine 13(5): 45-57. http://mi.eng.cam.ac.uk/~sjy/papers/youn96.ps.gz
Ronald Rosenfeld (2000): Two Decades of Statistical Language Modeling: Where Do We Go From Here? Proceedings of the IEEE, 88(8) (pdf)
Ingunn Amdal, Eric Fosler-Lussier (2003): "Pronunciation variation modeling in automatic speech recognition", Telektronikk, vol. 99, no. 2. http://www.telenor.com/telektronikk/volumes/pdf/2.2003/Side_70-82.pdf
R. P. Lippmann (1997): Speech recognition by machines and humans, Speech Communication, vol. 22, no. 1, pp. 1-15 (pdf)
M. Mohri, F. Pereira, M. Riley (2000): Weighted finite state transducers in speech recognition, ISCA ITRW ASR2000,
Speaker Verification
Gish, H. and Schmidt, M. (1994): "Text-independent speaker identification", IEEE Signal Processing Magazine, Oct. 1994, pp. 18-32 (pdf)
S. Furui (1997): "Recent Advances in Speaker Recognition", Pattern Recognition Letters, vol. 18, pp. 859-872 (pdf)
Douglas A. Reynolds, Thomas F. Quatieri, Robert B. Dunn (2000): "Speaker verification using adapted Gaussian mixture models", Digital Signal Processing, vol. 10, no. 1-3, Jan-July 2000 (pdf)
Bimbot, F., Bonastre, J.-F., Fredouille, C., Gravier, G., Magrin-Chagnolleau, I., Meignier, S., Merlin, T., Ortega-García, J., Petrovska-Delacrétaz, D., and Reynolds, D. (2004): "A Tutorial on Text-Independent Speaker Verification", EURASIP Journal on Applied Signal Processing, Hindawi Publishing Corporation, Vol. 2004, no. 4, pp. 432-451 (pdf)
Dialog Systems
James Allen, Donna Byron, Myroslava Dzikovska, George Ferguson, Lucian Galescu, and Amanda Stent: "Towards conversational human-computer interaction," AI Magazine, 22(4), Winter 2001, pp. 27-37. http://www.cs.rochester.edu/research/trips/
Joakim Gustafson (2002): Developing multimodal spoken dialogue systems. Empirical studies of spoken human-computer interaction. Doctoral Thesis. Department of Speech, Music and Hearing, KTH,
Harald Aust, Martin Oerder, Frank Seide, Volker Steinbiss: The Philips automatic train timetable information system, Speech Communication 17 (1995) (Will be distributed)
Chu-Carroll, J.: "MIMIC: An adaptive mixed initiative spoken dialogue system for information queries," in Proceedings of the 6th ACL Conference on Applied Language Processing. http://acl.ldc.upenn.edu/A/A00/A00-1014.pdf
Jim Glass: "Challenges for Spoken Dialogue Systems," Proc. 1999 IEEE ASRU Workshop, Keystone, CO, December 1999. http://www.sls.csail.mit.edu/sls/publications/
Marilyn A. Walker, Candace A. Kamm, and Diane J. Litman: Towards Developing General Models of Usability with
Some web pages on spoken dialogue systems
http://www.cs.cmu.edu/~dbohus/SDS/index.html
http://wwwhome.cs.utwente.nl/~schooten/vidiam/dialoguesystems/
http://www.cs.cmu.edu/~dgroup/
http://www.cs.cmu.edu/~dod/roundtable/
Reference Literature (Not part of the course)
Acoustic phonetics, Kenneth N. Stevens. ISBN 0-262-19404-X
Allmän och svensk fonetik. Norstedts. Elert, Claes-Christian. 1995.
Handbook of Phonetic Sciences (Ed. W. J. Hardcastle and J. Laver), Blackwell,
Spoken Language Processing: A Guide to Theory, Algorithm and System Development, Xuedong Huang, Alex Acero, and Hsiao-Wuen Hon. ISBN 0-13-022616-5
Survey of the State of the Art in Human Language Technology http://cslu.cse.ogi.edu/HLTsurvey/HLTsurvey.html
Speech Technology Magazine's NewsBlast http://www.speechtechmag.com/eletter/archives/
CTT - Selection of conferences/workshops http://www.speech.kth.se/conferences/