August

The August system was a conversational spoken dialogue system featuring an animated agent called August, whose persona was inspired by August Strindberg, the famous Swedish 19th-century author.

The August project was initiated as a way to promote speech technology and KTH in connection with Stockholm being the Cultural Capital of Europe in 1998. The spoken dialogue system and the animated character were developed during the first half of 1998, and the system was available to the general public at the Culture Center in Stockholm, daily from August 1998 to March 1999. The construction of the system was partly motivated by the need to collect genuine speech data from people with little or no previous experience of spoken dialogue systems. A corpus of more than 10,000 utterances of spontaneous computer-directed speech was collected and empirical linguistic analyses were carried out. Acoustical, lexical and syntactical aspects of this data were examined.

The research purpose of the project was to expose speech technology to the general public, thereby gaining practical experience of moving a research system outside the lab environment, while also collecting data on how people interact with animated agents. August could answer questions on a number of topics, for example giving the location of restaurants in Stockholm or sharing facts about the author August Strindberg, and could also exchange social utterances. The dialogues were quite shallow, since the system primarily answered questions and only occasionally initiated one-level clarification sub-dialogues. The dialogues were therefore user-driven, which influenced the dialogue data collected.

August used facial gestures for a number of purposes in the dialogue (Lundeberg and Beskow 1999). He would typically raise his eyebrows early in a sentence, followed by a small nod, and he would mark focal words and stressed syllables with eyebrow movements. To enhance the perceived reactivity of the system, a set of listening gestures and thinking gestures was used. When the user pressed the push-to-talk button, the agent immediately started a randomly selected listening gesture, e.g. raising the eyebrows. When the button was released, the agent changed to a randomly selected thinking gesture, e.g. looking away from the user. To make the synthetic face appear less artificial, and to make the agent appear aware of the user's actions, the agent changed the direction of its head and eyes according to the detected movements of an approaching user. This was accomplished by using a desktop video camera together with image analysis software (Öhman 1999).
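
The push-to-talk behaviour described above can be pictured as a small event-driven controller. The following is only an illustrative sketch, not the original August implementation: the class name, method names, and gesture inventories are hypothetical, and only "raise eyebrows" and "look away" are gestures actually named in the text.

```python
import random

# Hypothetical gesture inventories. Only "raise_eyebrows" and "look_away"
# correspond to gestures named in the text; the others are placeholders.
LISTENING_GESTURES = ["raise_eyebrows", "tilt_head", "lean_forward"]
THINKING_GESTURES = ["look_away", "frown", "gaze_up"]


class GestureController:
    """Maps push-to-talk events to a randomly selected feedback gesture."""

    def __init__(self, rng=None):
        # An injectable random generator makes the behaviour reproducible.
        self.rng = rng or random.Random()
        self.state = "idle"
        self.current_gesture = None

    def on_button_press(self):
        # The user starts speaking: show a randomly selected listening gesture.
        self.state = "listening"
        self.current_gesture = self.rng.choice(LISTENING_GESTURES)
        return self.current_gesture

    def on_button_release(self):
        # The user has finished: show a randomly selected thinking gesture
        # while the system prepares its answer.
        self.state = "thinking"
        self.current_gesture = self.rng.choice(THINKING_GESTURES)
        return self.current_gesture
```

A caller would invoke `on_button_press()` and `on_button_release()` from the button event handlers and pass the returned gesture name to whatever component animates the face.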

Group: Speech Communication and Technology

Staff:
Joakim Gustafson
Linda Bell
Jonas Beskow
Rolf Carlson
Björn Granström
Erland Levin
Nikolaj Lindberg
Johan Liljencrants
Magnus Lundeberg
Kåre Sjölander
Nikko Ström
Eva-Lena Svensson
Tobias Öhman

Funding: NUTEK/VINNOVA

Duration: 1998-01-01 - 2000-12-01

Website: http://www.speech.kth.se/august/

Keywords: multimodal dialogue, public system, animated agent

Related publications:

2003

Bell, L. (2003). Linguistic adaptations in spoken human-computer dialogues. Empirical studies of user behavior. Doctoral dissertation. [pdf]

Beskow, J. (2003). Talking heads - Models and applications for multimodal speech synthesis. Doctoral dissertation, KTH.

2002

Beskow, J., Granström, B., & House, D. (2002). A multimodal speech synthesis tool applied to audio-visual prosody. In Keller, E., Bailly, G., Monaghan, A., Terken, J., & Huckvale, M. (Eds.), Improvements in Speech Synthesis (pp. 372-382). New York: John Wiley & Sons, Inc.

Granström, B., House, D., & Beskow, J. (2002). Speech and gestures for talking faces in conversational dialogue systems. In Granström, B., House, D., & Karlsson, I. (Eds.), Multimodality in language and speech systems (pp. 209-241). Dordrecht: Kluwer Academic Publishers.

Gustafson, J. (2002). Developing multimodal spoken dialogue systems. Empirical studies of spoken human-computer interaction. Doctoral dissertation, KTH. [pdf]

2000

Bell, L. (2000). Linguistic adaptations in spoken and multimodal dialogue systems. Licentiate dissertation, KTH.

Carlson, R., & House, D. (2000). Prosodic aspects of Swedish question words in computer-directed spontaneous speech. In Nordic Prosody VIII.

Gustafson, J., & Bell, L. (2000). Speech Technology on Trial: Experiences from the August System. Natural Language Engineering, 6(Special issue on Best Practice in Spoken Dialogue Systems). [pdf]

Lindberg, N. (2000). Data driven methods in natural language processing - Two applications. Licentiate dissertation, KTH/TMH.

Lindberg, N., & Gustafson, J. (2000). Example based shallow semantic analysis in the August spoken dialogue system. TMH-QPSR, 41(1), 039-044. [pdf]

Öhman, T. (2000). Vision in speech technology. Automatic measurements of visual speech and audiovisual intelligibility of synthetic and natural faces. Licentiate dissertation.

1999

Bell, L., & Gustafson, J. (1999). Interaction with an animated agent: an analysis of a Swedish database of spontaneous computer directed speech. In Proc of Eurospeech '99 (pp. 1143-1146). Budapest, Hungary. [pdf]

Bell, L., & Gustafson, J. (1999). Repetition and its phonetic realizations: Investigating a Swedish database of spontaneous computer directed speech. In Proceedings of ICPhS-99 (pp. 1221-1224). [pdf]

Bell, L., & Gustafson, J. (1999). Repetition in a Swedish database of spontaneous computer-directed speech. In Andersson, R., Abelin, Å., Allwood, J., & Lindblad, P. (Eds.), Proc of Fonetik 99 (pp. 15-18).

Bell, L., & Gustafson, J. (1999). Utterance types in the August database. In The Third Swedish Symposium on Multimodal Communication.

Bell, L., & Gustafson, J. (1999). Utterance types in the August System. In Proc from IDS '99. [pdf]

Granström, B., House, D., & Lundeberg, M. (1999). Eyebrow movements as a cue to prominence. In The Third Swedish Symposium on Multimodal Communication.

Granström, B., House, D., & Lundeberg, M. (1999). Visual prominence in multimodal speech perception. In Proc of Fonetik 99 (pp. 61-64).

Gustafson, J., Lindberg, N., & Lundeberg, M. (1999). The August spoken dialogue system. In Proc of Eurospeech 99 (pp. 1151-1154). [pdf]

Gustafson, J., Lindberg, N., & Lundeberg, M. (1999). The August spoken dialogue system. In The Third Swedish Symposium on Multimodal Communication.

Gustafson, J., Lundeberg, M., & Liljencrants, J. (1999). Experiences from the development of August - a multimodal spoken dialogue system. In Proc from IDS '99 (pp. 61-64). [pdf]

Lundeberg, M., & Beskow, J. (1999). Developing a 3D-agent for the August dialogue system. In Proc of AVSP 99. [pdf]

Öhman, T. (1999). A visual input module used in the August spoken dialogue system. TMH-QPSR, 40(1-2), 039-044. [html]







Published by: TMH, Speech, Music and Hearing

Last updated: 2012-11-09