List of publications by Rolf Carlson, KTH, Speech, Music and Hearing

2022

Carlson, R., & Granström, B. (2022). Another half a century in speech research. In Unpublished. [pdf]

2011

Johnson-Roberson, M., Bohg, J., Skantze, G., Gustafson, J., Carlson, R., Rasolzadeh, B., & Kragic, D. (2011). Enhanced Visual Scene Understanding through Human-Robot Dialog. In IEEE/RSJ International Conference on Intelligent Robots and Systems. [pdf]

2010

Johnson-Roberson, M., Bohg, J., Kragic, D., Skantze, G., Gustafson, J., & Carlson, R. (2010). Enhanced Visual Scene Understanding through Human-Robot Dialog. In Proceedings of AAAI 2010 Fall Symposium: Dialog with Robots. Arlington, VA. [pdf]

2009

Andréasson, M., Borin, L., Forsberg, M., Beskow, J., Carlson, R., Edlund, J., Elenius, K., Hellmer, K., House, D., Merkel, M., Forsbom, E., Megyesi, B., Eriksson, A., & Strömqvist, S. (2009). Swedish CLARIN activities. In Domeij, R., Koskenniemi, K., Krauwer, S., Maegaard, B., Rögnvaldsson, E., & de Smedt, K. (Eds.), Proc. of the NODALIDA 2009 workshop Nordic Perspectives on the CLARIN Infrastructure of Language Resources (pp. 1-5). Northern European Association for Language Technology. [pdf]
Abstract: Although Sweden has yet to allocate funds specifically intended for CLARIN activities, there are some ongoing activities which are directly relevant to CLARIN, and which are explicitly linked to CLARIN. These activities have been funded by the Committee for Research Infrastructures and its subcommittee DISC (Database Infrastructure Committee) of the Swedish Research Council.
Beskow, J., Carlson, R., Edlund, J., Granström, B., Heldner, M., Hjalmarsson, A., & Skantze, G. (2009). Multimodal Interaction Control. In Waibel, A., & Stiefelhagen, R. (Eds.), Computers in the Human Interaction Loop (pp. 143-158). Berlin/Heidelberg: Springer. [pdf]
Boves, L., Carlson, R., Hinrichs, E., House, D., Krauwer, S., Lemnitzer, L., Vainio, M., & Wittenburg, P. (2009). Resources for Speech Research: Present and Future Infrastructure Needs. In Interspeech (pp. 1803-1806). Brighton, UK. [pdf]
Abstract: This paper introduces the EU-FP7 project CLARIN, a joint effort of over 150 institutions in Europe, aimed at the creation of a sustainable language resources and technology infrastructure for the humanities and social sciences research community. The paper briefly introduces the vision behind the project and how it relates to speech research with a focus on the contributions that CLARIN can and will make to research in spoken language processing.
Carlson, R., & Gustafson, K. (2009). Exploring Data Driven Parametric Synthesis. In Fonetik 2009. Stockholm, Sweden. [pdf]
Carlson, R., & Hirschberg, J. (2009). Cross-Cultural Perception of Discourse Phenomena. In Interspeech (pp. 1723-1726). Brighton, UK. [pdf]

2008

Biadsy, F., Rosenberg, A., Carlson, R., Hirschberg, J., & Strangert, E. (2008). A Cross-Cultural Comparison of American, Palestinian, and Swedish. In Speech Prosody 2008. Campinas, Brazil. [pdf]
Carlson, R., Gustafson, K., & Strangert, E. (2008). Synthesising disfluencies in a dialogue system. In Nordic Prosody. Helsinki, Finland.

2007

Carlson, R. (2007). Conflicting acoustic cues in stop perception. In Where Do Features Come From? - Phonological Primitives in the Brain, the Mouth, and the Ear (pp. 63-64). Paris, France. [pdf]
Carlson, R. (2007). Using acoustic cues in stop perception. Proceedings of Fonetik, TMH-QPSR, 50(1), 25-28. [pdf]
Carlson, R., & Granström, B. (2007). Rule-based Speech Synthesis. In Benesty, J., Sondhi, M. M., & Huang, Y. (Eds.), Springer Handbook of Speech Processing (pp. 429-436). Springer Berlin Heidelberg.
Carlson, R., & Hawkins, S. (2007). When is fine phonetic detail a detail? In ICPhS 2007 (pp. 211-214). Saarbrücken, Germany. [pdf]

2006

Carlson, R., Edlund, J., Heldner, M., Hjalmarsson, A., House, D., & Skantze, G. (2006). Towards human-like behaviour in spoken dialog systems. In Proceedings of Swedish Language Technology Conference (SLTC 2006). Gothenburg, Sweden. [pdf]
Carlson, R., Gustafson, K., & Strangert, E. (2006). Cues for Hesitation in Speech Synthesis. In Proceedings of Interspeech 06. Pittsburgh, USA. [pdf]
Carlson, R., Gustafson, K., & Strangert, E. (2006). Prosodic Cues for Hesitation. Dept. of Linguistics & Phonetics Working Papers, 52, 21–24.
Carlson, R., Gustafson, K., & Strangert, E. (2006). Modelling hesitation for synthesis of spontaneous speech. In Proceedings of Speech Prosody 2006. Dresden. [pdf]
Heldner, M., Edlund, J., & Carlson, R. (2006). Interruption impossible. In Bruce, G., & Horne, M. (Eds.), Nordic Prosody, Proceedings of the IXth Conference, Lund 2004 (pp. 97-105). Frankfurt am Main, Germany. [pdf]
Abstract: Most current work on spoken human-computer interaction has so far concentrated on interactions between a single user and a dialogue system. The advent of ideas of the computer or dialogue system as a conversational partner in a group of humans, for example within the CHIL-project and elsewhere (e.g. Kirchhoff & Ostendorf, 2003), introduces new requirements on the capabilities of the dialogue system. Among other things, the computer as a participant in a multi-part conversation has to appreciate the human turn-taking system, in order to time its own interjections appropriately. As the role of a conversational computer is likely to be to support human collaboration, rather than to guide or control it, it is particularly important that it does not interrupt or disturb the human participants. The ultimate goal of the work presented here is to predict suitable places for turn-takings, as well as positions where it is impossible for a conversational computer to interrupt without irritating the human interlocutors.
Skantze, G., Edlund, J., & Carlson, R. (2006). Talking with Higgins: Research challenges in a spoken dialogue system. In André, E., Dybkjaer, L., Minker, W., Neumann, H., & Weber, M. (Eds.), Perception and Interactive Technologies (pp. 193-196). Berlin/Heidelberg: Springer. [pdf]
Abstract: This paper presents the current status of the research in the Higgins project and provides background for a demonstration of the spoken dialogue system implemented within the project. The project represents the latest development in the ongoing dialogue systems research at KTH. The practical goal of the project is to build collaborative conversational dialogue systems in which research issues such as error handling techniques can be tested empirically.

2005

Carlson, R., Hirschberg, J., & Swerts, M. (Eds.). (2005). Special Issue on Error handling in spoken dialogue systems. Speech Communication, 45(3).
Carlson, R., & Granström, B. (2005). Data-driven multimodal synthesis. Speech Communication, 47(1-2), 182-193.
Carlson, R., Hirschberg, J., & Swerts, M. (2005). Cues to upcoming Swedish prosodic boundaries: Subjective judgment studies and acoustic correlates. Speech Communication, 46, 326-333.

2004

Carlson, R., Elenius, K., & Swerts, M. (2004). Perceptual judgments of pitch range. In Bel, B., & Marlin, I. (Eds.), Proc. of Intl Conference on Speech Prosody 2004 (pp. 689-692). Nara, Japan. [pdf]
Carlson, R., Hirschberg, J., & Swerts, M. (2004). Prediction of upcoming Swedish prosodic boundaries by Swedish and American listeners. In Bel, B., & Marlin, I. (Eds.), Proc of Intl Conference on Speech Prosody 2004 (pp. 329-332). Nara, Japan. [pdf]
Edlund, J., Skantze, G., & Carlson, R. (2004). Higgins - a spoken dialogue system for investigating error handling techniques. In Proceedings of the International Conference on Spoken Language Processing, ICSLP 04 (pp. 229-231). Jeju, Korea. [pdf]
Abstract: In this paper, an overview of the Higgins project and the research within the project is presented. The project incorporates studies of error handling for spoken dialogue systems on several levels, from processing to dialogue level. A domain in which a range of different error types can be studied has been chosen: pedestrian navigation and guiding. Several data collections within Higgins have been analysed along with data from Higgins' predecessor, the AdApt system. The error handling research issues in the project are presented in light of these analyses.
Lacerda, F., Sundberg, U., Carlson, R., & Holt, L. (2004). Modelling interactive language learning: a project presentation. In Proc of The XVIIth Swedish Phonetics Conference, Fonetik 2004 (pp. 60-63). Stockholm University. [pdf]
Strangert, E., & Carlson, R. (2004). On the modelling and synthesis of conversational speech. In Bruce, G., & Horne, M. (Eds.), Nordic Prosody. Proceedings of the IXth Conference (pp. 255-264). Lund: Peter Lang: Frankfurt am Main.
Öhlin, D., & Carlson, R. (2004). Data-driven formant synthesis. In Proc of the XVIIth Swedish Phonetics Conference, Fonetik 2004 (pp. 160-163). Stockholm University. [pdf]

2003

Carlson, R., & Swerts, M. (2003). Perceptually based prediction of upcoming prosodic breaks in spontaneous Swedish speech materials. In Proc of ICPhS, XV Intl Conference of Phonetic Sciences (pp. 79-82). Barcelona, Spain. [pdf]
Carlson, R., & Swerts, M. (2003). Relating perceptual judgments of upcoming prosodic breaks to F0 features. In Proc of Fonetik 2003, Umeå University, Dept of Philosophy and Linguistics PHONUM 9 (pp. 181-184). [pdf]

2002

Carlson, R., Granström, B., Heldner, M., House, D., Megyesi, B., Strangert, E., & Swerts, M. (2002). Boundaries and groupings - the structuring of speech in different communicative situations: a description of the GROG project. In Proc of Fonetik 2002 (pp. 65-68). Stockholm.
Carlson, R., Sigvardson, T., & Sjölander, A. (2002). Data-driven formant synthesis. Proceedings of Fonetik, TMH-QPSR, 44(1), 121-124. [pdf]
Megyesi, B., & Carlson, R. (2002). Data-driven methods for building a Swedish Treebank. In Proceedings of the Swedish Treebank Symposium. Växjö University, Sweden. [pdf]

2000

Carlson, R., & House, D. (2000). Prosodic aspects of Swedish question words in computer-directed spontaneous speech. In Nordic Prosody VIII.
Gustafson, J., Bell, L., Beskow, J., Boye, J., Carlson, R., Edlund, J., Granström, B., House, D., & Wirén, M. (2000). AdApt - a multimodal conversational dialogue system in an apartment domain. In Yuan, B., Huang, T., & Tang, X. (Eds.), Proc. of ICSLP 2000, 6th Intl Conf on Spoken Language Processing (pp. 134-137). Beijing: China Military Friendship Publish. [pdf]
Abstract: A general overview of the AdApt project and the research that is performed within the project is presented. In this project various aspects of human-computer interaction in a multimodal conversational dialogue system are investigated. The project will also include studies on the integration of user/system/dialogue dependent speech recognition and multimodal speech synthesis. A domain in which multimodal interaction is highly useful has been chosen, namely, finding available apartments in Stockholm. A Wizard-of-Oz data collection within this domain is also described.

1999

Gustafson, J., Sjölander, K., Beskow, J., Granström, B., & Carlson, R. (1999). Creating web-based exercises for spoken language technology. In Tutorial session in proceedings of IDS'99 (pp. 165-168). [pdf]
Hunnicutt, S., Carlson, R., Carlberger, A., & Rosengren, E. (1999). TIDE-ENABL-projektet: Sofistikerad design med hjälp av talförståelse. In Proc of Konferensen Människa-Handikapp-Livsvillkor Rendez-vous (pp. 161-162).
Sjölander, K., Gustafson, J., Beskow, J., Granström, B., & Carlson, R. (1999). Web-based educational tools for speech technology. In Proc MATISSE - ESCA/SOCRATES Workshop on Method and Tool Innovations for Speech Science Education. London.

1998

Bickley, C., Carlson, R., Cudd, P., Hunnicutt, S., Reimers, B., & Whiteside, S. (1998). Enabler for engineering software using language and speech. In Proc RESNA.
Carlson, R., Granström, B., Gustafson, J., Lewin, E., & Sjölander, K. (1998). Hands-on speech technology on the Web. In Proceedings of Elsnet in Wonderland (pp. 30-36).
Dybkjær, L., Bernsen, N-O., Carlson, R., Chase, L., Dahlbäck, N., Failenschmid, K., Heid, U., Heisterkamp, P., Jönsson, A., Kamp, H., Karlsson, I., van Kuppevelt, J., Lamel, L., Paroubek, P., & Williams, D. (1998). Comparative evaluation of letter-to-sound conversion techniques for English text-to-speech synthesis. In Proc of the First International Conference on Language Resources and Evaluation.
Gustafson, J., Elmberg, P., Carlson, R., & Jönsson, A. (1998). An educational dialogue system with a user controllable dialogue manager. In Proc of ICSLP98, Intl Conference on Spoken Language Processing (pp. 33-36). Sydney, Australia. [pdf]
Langlais, P., Öster, A-M., Carlson, R., & Granström, B. (1998). Automatic detection of mispronunciation in non-native Swedish speech. In Proc of STiLL98, ESCA-Workshop on Speech Technology in Language Learning (pp. 41-44). Marholmen, Sweden.
Lee, T., Carlson, R., & Granström, B. (1998). Context-dependent duration modeling for continuous speech recognition. In Proc of ICSLP98, Intl Conference on Spoken Language Processing (pp. 2955-2958). Sydney, Australia.
Sjölander, K., Beskow, J., Gustafson, J., Lewin, E., Carlson, R., & Granström, B. (1998). Web-based educational tools for speech technology. In Proc of ICSLP98, 5th Intl Conference on Spoken Language Processing (pp. 3217-3220). Sydney, Australia.

1997

Carlberger, A., Lewin, E., Nord, L., Rosengren, E., Ström, N., Carlson, R., & Hunnicutt, S. (1997). ENABL - access to design by speech recognition. In Bannert, R., Heldner, M., Sullivan, K., & Wretling, P. (Eds.), Proc of Fonetik -97, Dept of Phonetics, Umeå Univ., Phonum 4 (pp. 89-92).
Carlson, R., & Granström, B. (1997). Dialogue management and multi-modal output in the Waxholm spoken dialogue system. In Luperfoy, S. (Ed.), Automated Spoken Dialog Systems. MIT Press.
Carlson, R., & Granström, B. (1997). Speech synthesis. In Hardcastle, W. J., & Laver, J. (Eds.), The Handbook of Phonetic Science (pp. 768-788). Oxford: Blackwell Publ. Ltd.
Gustafson, J., Larsson, A., Carlson, R., & Hellman, K. (1997). How do system questions influence lexical choices in user answers? In Proc of Eurospeech '97, 5th European Conference on Speech Communication and Technology (pp. 2275-2278). Rhodes, Greece. [pdf]

1996

Carlson, R. (1996). The dialog component in the Waxholm system. In LuperFoy, S., Nijholt, A., & Veldhuijzen van Zanten, G. (Eds.), Proc of Twente Workshop on Language Technology. Dialogue Management in Natural Language Systems (TWLT 11) (pp. 209-218). [pdf]
Carlson, R. (1996). The dialog component in the Waxholm system. TMH-QPSR, 37(2), 109-112. [pdf]
Carlson, R., & Granström, B. (1996). The Waxholm spoken dialogue system. Phonetica Pragensia IX. Charisteria viro doctissimo Premysl Janota oblata. Acta Universitatis Carolinae Philologica 1, 1996. [pdf]
Carlson, R., & Hunnicutt, S. (1996). Generic and domain-specific aspects of the Waxholm NLP and Dialog modules. In Proc of ICSLP-96, 4th Intl Conference on Spoken Language Processing (pp. 677-680). Philadelphia, USA. [pdf]

1995

Bertenstam, J., Beskow, J., Blomberg, M., Carlson, R., Elenius, K., Granström, B., Gustafson, J., Hunnicutt, S., Högberg, J., Lindell, R., Neovius, L., Nord, L., de Serpa-Leitao, A., & Ström, N. (1995). The Waxholm system - a progress report. In Dalsgaard, P. (Ed.), Proc of ESCA Workshop on Spoken Dialogue Systems (pp. 281-284). Vigsø, Denmark. [pdf]
Bertenstam, J., Blomberg, M., Carlson, R., Elenius, K., Granström, B., Gustafson, J., Hunnicutt, S., Högberg, J., Lindell, R., Neovius, L., de Serpa-Leitao, A., Nord, L., & Ström, N. (1995). The Waxholm application data-base. In Pardo, J. (Ed.), Proceedings Eurospeech 1995 (pp. 833-836). Madrid. [pdf]
Bertenstam, J., Blomberg, M., Carlson, R., Elenius, K. O. E., Granström, B., Gustafson, J., Hunnicutt, S., Högberg, J., Lindell, R., Neovius, L., Nord, L., de Serpa-Leitao, A., & Ström, N. (1995). Spoken dialogue data collected in the Waxholm project. STL-QPSR, 36(1), 049-074. [pdf]
Carlson, R., & Hunnicutt, S. (1995). The natural language component - STINA. STL-QPSR, 36(1), 029-048. [pdf]
Carlson, R., Hunnicutt, S., & Gustafson, J. (1995). Dialogue management in the Waxholm system. In Dalsgaard, P. (Ed.), Proc of ESCA Workshop on Spoken Dialogue Systems (pp. 137-140). Vigsø, Denmark. [pdf]

1994

Carlson, R. (1994). Models of speech synthesis. In Roe, D. E., & Wilpon, J. G. (Eds.), Voice Communication between Humans and Machines (pp. 116-134). Washington, D.C.: National Academy Press. [pdf]
Carlson, R. (1994). Recent developments in the experimental WAXHOLM dialog system. In Proc. of the ARPA Human Language Technology Workshop (pp. 207-212). Princeton, New Jersey. [pdf]
Carlson, R., & Granström, B. (1994). An interactive technique for matching speaker identity. In FONETIK '94, Working papers from the 8th Swedish Phonetics Conference (pp. 42-45). Lund, Sweden. [pdf]
Carlson, R., & Hunnicutt, S. (1994). Dialog management in the Waxholm system. In FONETIK '94, Papers from the 8th Swedish Phonetics Conference (pp. 46-49). Lund, Sweden.

1993

Blomberg, M., & Carlson, R. (1993). Automatic labelling of speech given its text representation. In Papers from the Seventh Swedish Phonetics Conference, RUUL 23 (pp. 61-64). Dept. of Linguistics, Uppsala University, Uppsala. [pdf]
Blomberg, M., & Carlson, R. (1993). Labeling of speech given its text representation. In Eurospeech '93, Berlin (pp. 1775-1778).
Blomberg, M., Carlson, R., Elenius, K., Granström, B., Hunnicutt, S., Lindell, R., & Neovius, L. (1993). An experimental dialog system: WAXHOLM. In Proceedings of Seventh Swedish Phonetics Conference, RUUL 23 (pp. 49-52). Uppsala. [pdf]
Blomberg, M., Carlson, R., Elenius, K., Gustafson, J., Granström, B., Hunnicutt, S., Lindell, R., & Neovius, L. (1993). An experimental dialogue system: WAXHOLM. In Proceedings Eurospeech '93 (pp. 1867-1870). Berlin.
Blomberg, M., Carlson, R., Elenius, K. O. E., Granström, B., Gustafson, J., Hunnicutt, S., Lindell, R., Neovius, L., & Nord, L. (1993). An experimental dialogue system: Waxholm. STL-QPSR, 34(2-3), 015-020. [pdf]
Carlson, R. (1993). Models of speech synthesis. STL-QPSR, 34(1), 001-014. [pdf]
Carlson, R., Granström, B., & Nord, L. (1993). Synthesis experiments with mixed feelings - a progress report. In Proceedings Seventh Swedish Phonetics Conference, RUUL 23 (pp. 65-68). Uppsala.
Carlson, R., & Nord, L. (1993). Vowel dynamics in a text-to-speech system - some considerations. In Proceedings Eurospeech '93 (pp. 1911-1914). Berlin. [pdf]
Hunnicutt, S., Carlson, R., & Rilbe, L. (1993). A choice-based large-vocabulary predictive writing aid. In Proc. RESNA 16th Annual Conference (pp. 159-160). Las Vegas.

1992

Carlson, R. (1992). Synthesis: Modeling variability and constraints. Speech Comm, 11, 159-166. [pdf]
Carlson, R., & Glass, J. (1992). Vowel classification based on analysis by synthesis. In Huber, D. (Ed.), Fonetik '92, the Sixth Swedish Phonetics Conference held in Gothenburg, Technical Report no. 12 (pp. 59-62). Chalmers University of Technology, Göteborg.
Carlson, R., & Glass, J. (1992). Vowel classification based on analysis-by-synthesis. In Ohala, J. J., Nearey, T. M., Derwing, B. L., Hodge, M. M., & Wiebe, G. E. (Eds.), ICSLP 92 Proceedings (pp. 575-578). University of Alberta, Canada. [pdf]
Carlson, R., & Glass, J. (1992). Vowel classification based on analysis-by-synthesis. STL-QPSR, 33(4), 017-027. [pdf]
Carlson, R., Granström, B., Neovius, L., & Nord, L. (1992). The 'listening speed' paradigm for synthesis evaluation. In Huber, D. (Ed.), Fonetik '92, the Sixth Swedish Phonetics Conference held in Gothenburg, Technical Report no. 13 (pp. 63-66). Chalmers University of Technology, Göteborg. [pdf]
Carlson, R., Granström, B., & Nord, L. (1992). An experiment to synthesize emotions. In Huber, D. (Ed.), Fonetik '92, the Sixth Swedish Phonetics Conference held in Gothenburg, Technical Report no. 12 (pp. 111-114). Chalmers University of Technology, Göteborg.
Carlson, R., Granström, B., & Nord, L. (1992). Experiments with emotive speech - acted utterances and synthesized replicas. In Ohala, J. J., Nearey, T. M., Derwing, B. L., Hodge, M. M., & Wiebe, G. E. (Eds.), ICSLP 92 Proceedings (pp. 671-674). University of Alberta, Canada.
Carlson, R., Granström, B., & Nord, L. (1992). Segmental evaluation using the Esprit/SAM test procedures and mono-syllabic words. In Bailly, G., Benoit, C., & Sawallis, T. R. (Eds.), Talking Machines: Theories, Models, & Designs (pp. 443-453). Elsevier Science Publ.
Carlson, R., & Hunnicutt, S. (1992). STINA: A probabilistic parser for speech recognition. In Huber, D. (Ed.), Fonetik '92, the Sixth Swedish Phonetics Conference held in Gothenburg, Technical Report no. 12 (pp. 23-26). Chalmers University of Technology, Göteborg. [pdf]
Manuel, S. Y., Shattuck-Hufnagel, S., Huffman, M., Stevens, K. N., Carlson, R., & Hunnicutt, S. (1992). Studies of vowel and consonant reduction. In Ohala, J. J., Nearey, T. M., Derwing, B. L., Hodge, M. M., & Wiebe, G. E. (Eds.), ICSLP 92 Proceedings (pp. 943-946). University of Alberta, Canada. [pdf]

1991

Bickley, C., Stevens, K., & Carlson, R. (1991). Synthesis of manner and voicing continua based on speech production models. In Proceedings of the 141th Meeting of the Acoustical Society of America. USA.
Carlson, R. (1991). Duration models in use. In Proceedings of the XIIth ICPhS, XIIème Congress International des Sciences Phonetiques (pp. 278-281). Aix-en-Provence, France. [pdf]
Abstract: The main point in this paper is to describe how duration models actually are in use. Most obviously we find them applied in text-to-speech systems. We also find that such models are slowly introduced in speech understanding systems. We will also discuss the notion of local speech tempo and the need to connect linguistic factors to low-level models. We will also discuss speaker-dependent parameters such as vowel-consonant ratio.
Carlson, R. (1991). Some notes about research and development at KTH. In Proceedings of the fourth DARPA Speech and Natural Language Workshop (pp. 12-13). Pacific Grove, California, USA.
Carlson, R. (1991). Synthesis: modeling variability and constraints. STL-QPSR, 32(4), 001-009. [pdf]
Carlson, R. (1991). Synthesis: Modelling Variability and Constraints. In EUROSPEECH-1991 (pp. 1043-1048). Genova, Italy. [pdf]
Carlson, R., & Granström, B. (1991). Performance rules in a text-to-speech system. In Sundberg, J., Nord, L., & Carlson, R. (Eds.), Proceedings of the Music, Language, Speech and Brain, Wenner-Gren International Symposium Series (pp. 121-131). London.
Carlson, R., & Granström, B. (1991). Speech synthesis development and phonetic research - a personal introduction. Journal of Phonetics, 19, 2-8.
Carlson, R., Granström, B., & Hunnicutt, S. (1991). Multilingual text-to-speech development and applications. In Ainsworth, A. W. (Ed.), Advances in speech, hearing and language processing. London: JAI Press.
Carlson, R., & Nord, L. (1991). Positional variants of some Swedish sonorants in an analysis-synthesis scheme. Journal of Phonetics, 19, 49-60.

1990

Blomberg, M., Carlson, R., Elenius, R., Granström, B., & Hunnicutt, S. (1990). Word recognition using synthesized templates. In Journal of The American Voice I/O Society (AVIOS), Vol. 8 (pp. 43-57).
Carlson, R., Granström, B., Helgason, P., Jensen, P., & Thráinsson, H. (1990). An Icelandic text-to-speech system for the disabled. STL-QPSR, 31(4), 055-056. [pdf]
Carlson, R., Granström, B., & Karlsson, I. (1990). Experiments with voice modeling in speech synthesis. Journal of Speech Communication, 10, 481-489.
Carlson, R., Granström, B., & Karlsson, I. (1990). Experiments with voice modelling in speech synthesis. STL-QPSR, 31(2-3), 053-061. [pdf]
Carlson, R., Granström, B., & Lindström, A. (1990). Methods to generate the pronunciation of proper names in Swedish. In Proceedings of Fonetik-90, Reports from the Department of Phonetics University of Umeå, Phonum. Umeå.
Carlson, R., Granström, B., & Nord, L. (1990). Development and evaluation of a text-to-speech system. In Proceedings of Fonetik-90, Reports from the Department of Phonetics University of Umeå, Phonum (pp. 50-53). Umeå.
Carlson, R., Granström, B., & Nord, L. (1990). Evaluation and development of the KTH text-to-speech system on the segmental level. Journal of Speech Communication, 9, 271-277.
Carlson, R., Granström, B., & Nord, L. (1990). Evaluation and development of the KTH text-to-speech system on the segmental level. In Proceedings of the International Conference on Acoustics, Speech and Signal Processing (pp. 317-320). Albuquerque, New Mexico, USA.
Carlson, R., Granström, B., & Nord, L. (1990). Segmental evaluation using the Esprit/SAM test procedures and monosyllabic words. In Bailly, G., & Benoit, C. (Eds.), Talking machines: Theories, Models and Applications. North-Holland Publishing Co.
Carlson, R., Granström, B., & Nord, L. (1990). Segmental intelligibility of synthetic and natural speech in real and nonsense words. In Proceedings of the International Conference on Spoken Language Processing. Kobe, Japan.
Carlson, R., Granström, B., & Nord, L. (1990). The KTH speech database. Journal of Speech Communication, 9, 375-380.
Carlson, R., & Nord, L. (1990). Analysis and synthesis of Swedish sonorants. Part 2. In Proceedings of Fonetik-90, Reports from the Department of Phonetics University of Umeå, Phonum (pp. 70-73). Umeå.
Carlson, R., & Nord, L. (1990). Cluster realizations in rule synthesis. In Proceedings of the ESCA Workshop on Speech Synthesis (pp. 9-12). Autrans, France.
Magnuson, T., Granström, B., Carlson, R., & Karlsson, F. (1990). Phonetic transcription of a Swedish morphological analyzer. In Proceedings of Fonetik-90, Reports from the Department of Phonetics University of Umeå, Phonum. Umeå.
Nord, L., Magnusson, T., Hunnicutt, S., & Carlson, R. (1990). Vowel sequences in speech. In Proceedings of Fonetik-90, Reports from the Department of Phonetics University of Umeå, Phonum (pp. 13-16). Umeå.

1989

Barber, S., Carlson, R., Cosi, P., Di Benedetto, M. G., Granström, B., & Vagges, K. (1989). A rule based Italian text-to-speech system. In Proceedings of the European Conference on Speech Communication and Technology (pp. 517-520). Paris.
Carlson, R., & Elenius, K. (1989). Assigning parts-of-speech to words from their orthography using a connectionist model. In Proceedings of the European Conference on Speech Communication and Technology (pp. 534-537). Paris.
Carlson, R., Fant, G., Gobl, C., Granström, B., Karlsson, I., & Lin, Q. (1989). Voice source rules for text-to-speech synthesis. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing. Glasgow, Scotland.
Carlson, R., Friberg, A., Frydén, L., Granström, B., & Sundberg, J. (1989). Speech and music performance: parallels and contrasts. Contemporary Music Review, 4, 389-402. [pdf]
Carlson, R., & Granström, B. (1989). Duration models and segmental quality in a text-to-speech system. In Proceedings of the Meeting of the Acoustical Society of America. USA.
Carlson, R., & Granström, B. (1989). Modeling duration for different text materials. In Proceedings of the European Conference on Speech Communication and Technology (pp. 328-331). Paris.
Carlson, R., & Granström, B. (1989). Modelling duration for different text materials. STL-QPSR, 30(1), 023-026. [pdf]
Carlson, R., Granström, B., & Lindström, A. (1989). Predicting name pronunciation for a reverse directory service. In Proceedings of the European Conference on Speech Communication and Technology (pp. 113-116). Paris.
Carlson, R., & Nord, L. (1989). Positional variants of selected Swedish sonorants - part 1: analysis and synthesis of /r/. STL-QPSR, 30(1), 087-090. [pdf]
Carlson, R., & Nord, L. (1989). Positional variants of Swedish sonorants in an analysis-synthesis scheme. In Proceedings of the European Conference on Speech Communication and Technology (pp. 458-461). Paris.

1988

Blomberg, M., Carlson, R., Elenius, K., Granström, B., & Hunnicutt, S. (1988). Word recognition using synthesized reference templates. In Proc. Second Symposium on Advanced Man-Machine Interface Through Spoken Language (pp. 27-1 - 27-12). Hawaii, USA.
Blomberg, M., Carlson, R., Elenius, K., Granström, B., & Hunnicutt, S. (1988). Word recognition using synthesized templates. In Proceedings of SPEECH '88 (pp. 1171-1178). Edinburgh.
Blomberg, M., Carlson, R., Elenius, K. O. E., Granström, B., & Hunnicutt, S. (1988). Word recognition using synthesized templates. STL-QPSR, 29(2-3), 069-081. [pdf]
Carlson, R., & Granström, B. (1988). Durational rules from a speech database. In Proceedings of the Meeting of the Acoustical Society of America. USA.
Carlson, R., & Granström, B. (1988). Modeling different ways of speaking in a text-to-speech system. In Proceedings of the Working Papers 34, Dept. of Linguistics, Lund University (pp. 25-28). Lund.
Carlson, R., Granström, B., & Hunnicutt, S. (1988). Applications of speech technology in aids for the disabled. In Proceedings of the Second Australian International Conference on Speech Science and Technology (pp. 358-363). Sydney.
Carlson, R., Granström, B., & Hunnicutt, S. (1988). RULSYS - The Swedish multilingual text-to-speech approach. In Proceedings of the Second Australian International Conference on Speech Science and Technology (pp. 2-7). Sydney.

1987

Bladon, A., Carlson, R., Granström, B., Hunnicutt, S., & Karlsson, I. (1987). A text-to-speech system for British English, and issues of dialect and style. STL-QPSR, 28(2-3), 001-005. [pdf]
Blomberg, M., Carlson, R., Elenius, K., Granström, B., & Hunnicutt, S. (1987). Taligenkänning baserad på ett text-till-talsystem. In Proceedings of TLH-Lund (pp. 18-19). Lund.
Blomberg, M., Carlson, R., Elenius, K., Granström, B., Hunnicutt, S., Lindell, R., & Neovius, L. (1987). Speech recognition based on a text-to-speech synthesis system. In Laver, J., & Jack, M. A. (Eds.), European Conference on Speech Technology, Vol. II (pp. 369-372). Edinburgh.
Carlson, R., Friberg, A., Frydén, L., Granström, B., & Sundberg, J. (1987). Speech and music performance. Parallels and contrasts. STL-QPSR, 28(4), 007-023. [pdf]
Carlson, R., & Granström, B. (1987). Modeling Swedish segment duration. In Proceedings of the XIth ICPhS (pp. 111-114). Tallinn.
Carlson, R., & Granström, B. (1987). Swedish durational rules derived from a sentence database. In Proceedings of the Swedish Phonetics Conference, Dept. of Linguistics (pp. 115-120). Uppsala.

1986

Blomberg, M., Carlson, R., Elenius, K., & Granström, B. (1986). Auditory models as front ends in speech-recognition systems. In Perkell, J. S., & Klatt, D. H. (Eds.), Invariance and Variability in Speech Processes. Cambridge, MA, USA: Lawrence Erlbaum Ass. and Hillsdale, NJ.
Blomberg, M., Carlson, R., Elenius, K., Granström, B., & Hunnicutt, S. (1986). Some current projects at KTH related to speech recognition. In International Workshop on Recent Advances and Applications of Speech Recognition. Rome.
Blomberg, M., Carlson, R., Elenius, K. O. E., Galyas, K., Granström, B., Hunnicutt, S., & Neovius, L. (1986). Speech synthesis and recognition in technical aids. STL-QPSR, 27(4), 045-056. [pdf]
Blomberg, M., Carlson, R., Elenius, K. O. E., Granström, B., & Hunnicutt, S. (1986). Some current projects at KTH related to speech recognition. STL-QPSR, 27(1), 031-040. [pdf]
Carlson, R., Elenius, K., Granström, B., & Hunnicutt, S. (1986). Phonetic and orthographic properties of the basic vocabulary of five European languages. In Proc. ICASSP 86, Vol. 4 (pp. 2763-2766). Tokyo.
Carlson, R., & Granström, B. (1986). A search for durational rules in a real-speech data base. Phonetica, 43, 140-154.
Carlson, R., & Granström, B. (1986). Applications of a multi-lingual text-to-speech system for the visually impaired. In Proceedings of the Development of Electronic Aids for the Visually Impaired (pp. 87-96). Martinus Nijhoff/Dr. W. Junk Publ. and Dordrecht.
Carlson, R., & Granström, B. (1986). Linguistic processing in the KTH multi-lingual text-to-speech system. In Proceedings of the International Conference on Acoustics, Speech and Signal Processing (pp. 2403-2406). Tokyo.
Carlson, R., & Granström, B. (1986). Swedish durational rules derived from a sentence data base. STL-QPSR, 27(2-3), 013-025. [pdf]
Carlson, R., & Granström, B. (1986). Swedish durational rules derived from a sentence database. In Proceedings of the Nordisk Prosodi IV. Odense, Denmark.

1985

Blomberg, M., Carlson, R., Elenius, K., & Granström, B. (1985). Speech research at KTH - two projects and technology transfer. In Forsberg, H. G., & Peterson, A. (Eds.), Proc. Speech-based information systems, May 1984 (pp. 57-66). Stockholm: IVA.

Carlson, R., Elenius, K. O. E., Granström, B., & Hunnicutt, S. (1985). Phonetic and orthographic properties of the basic vocabulary of five European languages. STL-QPSR, 26(1), 063-094. [pdf]

Carlson, R., & Granström, B. (1985). Rule controlled data base search. STL-QPSR, 26(4), 029-042. [pdf]

Carlson, R., & Granström, B. (1985). Rule-controlled data base search. In Proceedings of the 110th Meeting of the Acoustical Society of America. USA.

Carlson, R., Granström, B., Elenius, K., & Hunnicutt, S. (1985). Fonetiska och ortografiska egenskaper hos fem europeiska språk. In Proceedings of the Tal Ljud Hörsel conference (p. 164). Göteborg.

Carlson, R., Granström, B., & Hunnicutt, S. (1985). A parallel speech analyzing system. STL-QPSR, 26(1), 047-062. [pdf]

Carlson, R., Granström, B., & Hunnicutt, S. (1985). Ett förslag till ett parallellt tal-analys-system. In Proceedings of the Tal Ljud Hörsel conference (p. 164). Göteborg.

1984

Blomberg, M., Carlson, R., Elenius, K., & Granström, B. (1984). Auditory models in isolated word recognition. In Proceedings ICASSP 84. San Diego.

Carlson, R., & Granström, B. (1984). Applications of a multi-lingual text-to-speech system for the visually impaired. In Proceedings of the Workshop on Rehabilitation of Visually Impaired. Florence, Italy.

Carlson, R., & Granström, B. (1984). Text-to-speech conversion in telecommunications. Behavior and Information Technology, 3, 73-78.

Magnusson, L., Blomberg, M., Carlson, R., Elenius, K., & Granström, B. (1984). Swedish Speech Researchers Team Up with Electronic Venture Capitalists. Speech Technology, 2(2), 15-24.

1983

Blomberg, M., Carlson, R., Elenius, K. O. E., & Granström, B. (1983). Auditory models and isolated word recognition. STL-QPSR, 24(4), 001-015. [pdf]

Carlson, R., & Granström, B. (1983). Speech analysis based on auditory models. In Proceedings of the Tenth International Congress of Phonetic Sciences (p. 402). Utrecht, The Netherlands.

Carlson, R., & Granström, B. (1983). Text-to-speech conversion in telecommunications. In Proceedings of the 10th International Symposium on Human Factors in Telecommunications (pp. 233-238). Helsinki.

Carlson, R., & Granström, B. (1983). Units in text-to-speech systems. In Proceedings of the Tenth International Congress of Phonetic Sciences (pp. 131-135). Utrecht, The Netherlands.

Carlson, R., Granström, B., & Hunnicutt, S. (1983). Bliss symbol to speech conversion: considerations of grammar. In Proceedings of the Tenth International Congress of Phonetic Sciences (p. 381). Utrecht, The Netherlands.

Carlson, R., Granström, B., & Hunnicutt, S. (1983). Bliss-to-speech: Grammatical considerations. In Proceedings of the Tal Ljud Hörsel conference (p. 103). Stockholm.

1982

Blomberg, M., Carlson, R., Elenius, K., & Granström, B. (1982). Experiments with auditory models in speech recognition. In Carlson, R., & Granström, B. (Eds.), The Representation of Speech in the Peripheral Auditory System (pp. 197-201). Amsterdam: Elsevier Biomedical.

Carlson, R., & Granström, B. (1982). Towards an auditory spectrograph. Representation of Speech in the Peripheral Auditory System, 109-114.

Carlson, R., Granström, B., & Hunnicutt, S. (1982). A multi-language text-to-speech module. In Proceedings of the International Conference on Acoustics, Speech and Signal Processing (pp. 1604-1607). Paris.

Carlson, R., Granström, B., & Hunnicutt, S. (1982). Bliss communication with speech or text output. In Proceedings of the International Conference on Acoustics, Speech and Signal Processing (pp. 747-750). Paris.

Hampshire, B., Rüden, J., Carlson, R., & Granström, B. (1982). Evaluation of centrally produced and distributed synthetic speech. STL-QPSR, 23(2-3), 018-023. [pdf]

1981

Carlson, R., Galyas, K., Granström, B., Hunnicutt, S., Larsson, B., & Neovius, L. (1981). A multi-language, portable text-to-speech system for the disabled. In Proceedings of the International Conf on Technology for Blissymbolics, J. of Biomedical Eng (pp. 285-288).

Carlson, R., Galyas, K., Granström, B., Hunnicutt, S., Larsson, B., & Neovius, L. (1981). A multi-language, portable text-to-speech system for the disabled. STL-QPSR, 22(2-3), 008-016. [pdf]

Carlson, R., Granström, B., & Hunnicutt, S. (1981). A multi-language text-to-speech module. STL-QPSR, 22(4), 018-028. [pdf]

Carlson, R., Granström, B., & Hunnicutt, S. (1981). Bliss communication with speech or text output. STL-QPSR, 22(4), 029-038. [pdf]

1980

Carlson, R., Galyas, K., Granström, B., Pettersson, M., & Zachrisson, G. (1980). Speech synthesis for the non-vocal in training and communication. STL-QPSR, 21(1), 013-027. [pdf]

Carlson, R., & Granström, B. (1980). Speech research in China - impressions from a visit. STL-QPSR, 21(4), 001-013. [pdf]

Carlson, R., & Granström, B. (1980). Talanalys och talsyntes: Talande maskiner. Elteknik, 16.

1979

Allen, J., Hunnicutt, S., Carlson, R., & Granström, B. (1979). MITalk-79: The 1979 MIT text-to-speech system. In Proceedings of the 97th Meeting of the Acoustical Society of America (pp. 507-510). USA.

Carlson, R., & Granström, B. (1979). Model predictions of vowel dissimilarity. STL-QPSR, 20(3-4), 084-104. [pdf]

Carlson, R., Granström, B., & Klatt, D. (1979). Vowel perception: The relative perceptual salience of selected acoustic manipulations. STL-QPSR, 20(3-4), 073-083. [pdf]

Carlson, R., Granström, B., & Klatt, D. H. (1979). Some notes on the perception of temporal patterns in speech. In Lindblom, B., & Öhman, S. (Eds.), Frontiers of Speech Communication Research (pp. 223-243). Academic Press, London.

Carlson, R., Granström, B., & Klatt, D. H. (1979). Vowel perception: The relative perceptual salience of selected acoustic manipulations. In Proceedings of the 97th Meeting of the Acoustical Society of America (pp. 9-14). USA.

Watanabe, A., Felicetti, S., Hedström, B., Surjadi, G., Tannergård, G., Tegerstedt, I., Wejnebring, B., Wetterling, M-B., Andersson, L., Hallsten, L., Kaunisto, M., Murray, T., Eriksson, H., Haapakorpi, M., Karlsson, I., Nord, L., Stålhammar, U., Elenius, K., Blomberg, M., Liljencrants, J., Carlson, R., Granström, B., Risberg, A., Spens, K-E., Agelförs, E., Boberg, G., Mártony, J., Tunblad, T., Öster, A-M., Galyas, K., Gauffin, J., de Serpa-Leitão, A., Askenfelt, A., Jansson, E., & Sundberg, J. (1979). Gunnar Fant 60 years. TMH-QPSR, 20(2), 1-45. [pdf]

1978

Carlson, R., & Granström, B. (1978). Experimental text-to-speech system for the handicapped. In Proceedings of the 96th Meeting of the Acoustical Society of America. USA.

Carlson, R., & Granström, B. (1978). Talsyntes - ny teknik för handikappade. Information om rehabilitering, 3, 8-11.

1977

Carlson, R., & Granström, B. (1977). Perception and synthesis of speech. Doctoral dissertation, KTH/TMH.

Carlson, R., & Granström, B. (1977). Perception and synthesis of speech. STL-QPSR, 18(1), 001-016. [pdf]

1976

Carlson, R., & Granström, B. (1976). A text-to-speech system based entirely on rules. In Proceedings of the IEEE International-Conference on ASSP. Philadelphia, PA, USA.

Carlson, R., & Granström, B. (1976). Context sensitive rules for speech synthesis. In Proceedings of the 91st Meeting of the Acoustical Society of America. USA.

Carlson, R., & Granström, B. (1976). Detectability of changes of level and spectral slope in vowels. In Proceedings of the "doktorandsymposium". Umeå.

Carlson, R., & Granström, B. (1976). Detectability of changes of level and spectral slope in vowels. STL-QPSR, 17(2-3), 001-004. [pdf]

Carlson, R., & Granström, B. (1976). Perception av segmentlängder. In Proceedings of the Dansk-svenska fonetiksymposiet, Uppsala universitet. Uppsala.

Carlson, R., & Granström, B. (1976). Regelbeskrivning av svenskans uttal i ett text-till-talsystem. In Proceedings of the Nionde sammankomsten för svenskans beskrivning, in Nysvenska Studier. Göteborg.

Carlson, R., & Granström, B. (1976). Temporal interaction in speech perception. In Proceedings of the 91st Meeting of the Acoustical Society of America. USA.

Carlson, R., Granström, B., & Larsson, K. (1976). Evaluation of a text-to-speech system as a reading machine for the blind. STL-QPSR, 17(2-3), 009-013. [pdf]

1975

Carlson, R., Erikson, Y., Granström, B., Lindblom, B., & Rapp, K. (1975). Neutral and emphatic stress patterns in Swedish. In Proceedings of the Speech Communication Seminar. Stockholm: Almqvist and Wiksell.

Carlson, R., Fant, G., & Granström, B. (1975). Two-formant models, pitch, and vowel perception. In Auditory Analysis and Perception of Speech. London: Academic Press. [pdf]

Carlson, R., & Granström, B. (1975). A phonetically oriented programming language for rule description of speech. In Proceedings of the Speech Communication Seminar. Stockholm: Almqvist and Wiksell.

Carlson, R., & Granström, B. (1975). A text-to-speech system based on a phonetically oriented programming language. STL-QPSR, 16(1), 017-026. [pdf]

Carlson, R., & Granström, B. (1975). Ett regelbaserat text-till-talsystem. In Swedish Phonetics Conference, Stockholms universitet, Pilus. Stockholm.

Carlson, R., & Granström, B. (1975). Perception of segmental duration. Structure and Process in Speech Perception, 90-106.

Carlson, R., & Granström, B. (1975). Perception of segmental duration. STL-QPSR, 16(1), 001-016. [pdf]

Carlson, R., & Granström, B. (1975). Syntetiskt tal som alternativ till text. Teknisk Tidskrift, 1-2, 13-16.

Carlson, R., & Granström, B. (1975). Text-to-speech conversion by ordered rules. In Proceedings of the 8th International Congress of Phonetic Sciences. Leeds, UK.

Fant, G., Carlson, R., & Granström, B. (1975). The /e/-/o/ ambiguity. In Proceedings of the Speech Communication Seminar. Stockholm: Almqvist and Wiksell.

Rothenberg, M., Carlson, R., Granström, B., & Lindqvist-Gauffin, J. (1975). A three-parameter voice source for speech synthesis. In Proceedings of the Speech Communication Seminar. Stockholm: Almqvist and Wiksell.

1974

Carlson, R., Granström, B., Fant, G., & Jismalm, G. (1974). Att tala med datorer: igenkänning och syntetiskt tal. In Proceedings of the Nordiska Ergonomisällskapets konf. and Människor och datorer. Stockholm.

1973

Carlson, R., & Granström, B. (1973). Perceptual evaluation of prosodic rules. In Swedish Phonetics Conference, Working Papers (p. 71). Lund.

Carlson, R., & Granström, B. (1973). Word accent, emphatic stress, and syntax in a synthesis rule scheme for Swedish. STL-QPSR, 14(2-3), 031-036. [pdf]

1972

Carlson, R., Granström, B., Lindblom, B., & Rapp-Holmgren, K. (1972). Some timing and fundamental frequency characteristics of Swedish sentences: Data, rules, and a perceptual evaluation. STL-QPSR, 13(4), 011-019. [pdf]

Carlson, R., Granström, B., & Pauli, S. (1972). Perceptive evaluation of segmental cues. In Proceedings of the Conference on Speech Communication and Processing (pp. 206-209). Bedford, MA, USA.

Carlson, R., Granström, B., & Pauli, S. (1972). Perceptive evaluation of segmental cues. STL-QPSR, 13(1), 018-024. [pdf]

1970

Carlson, R., Granström, B., & Fant, G. (1970). Some studies concerning perception of isolated vowels. STL-QPSR, 11(2-3), 019-035. [pdf]