Seminar at Speech, Music and Hearing:

X-job seminar:

Voice User Interface as an alternative access point to web-based mobile services - An investigation using an end-to-end design approach

Daniel Åkerberg

Opponent: Miray Merkes


This thesis investigates the potential benefits and challenges of using a Voice User Interface as an additional access point to a service otherwise available only as a web-based cell phone client with a Graphical User Interface. The service used in this Master’s thesis is a furniture advertisement service. Two user groups are considered as potential end users of this Spoken Dialogue System. The first consists of people in rural areas of developing countries, for whom a Spoken Dialogue System could be the only access point to a web-based service. The second consists of more advanced cell phone users, for whom a Voice User Interface could have specific advantages over a Graphical User Interface.

An end-to-end design life-cycle approach is used as the overall method of investigation, covering all steps from designing a dialogue structure and conducting user tests to creating a working prototype. The main emphasis of the Master’s thesis is on the two user tests: one conducted at an early stage of the design process using the Wizard of Oz technique, and the other conducted on a working prototype.

Besides investigating the benefits and challenges of using a Voice User Interface, the thesis uses the open source speech recognition engine Sphinx4 for the final prototype and gives it a cursory evaluation.

Analysis of the data from both tests yields interesting and useful results regarding the strengths and weaknesses of both the specific Spoken Dialogue System developed in this Master’s thesis and Spoken Dialogue Systems in general. The most prominent strengths are ease of use, ease of navigation and the speed of talking. The most prominent weaknesses are information overflow, sub-standard recognition accuracy and the dialogue being too slow. The two user tests had transaction success rates of 78% and 90%, from which the author concludes that both the designed dialogue structure and the final Spoken Dialogue System have potential. It is also the author’s opinion that, given the limitations of this Master’s project, the results of the user tests suggest that such a system could be of interest to both user groups.

15:15 - 17:00
Friday June 5, 2009

The seminar is held in Fantum.

