We asked human subjects to detect and correct errors in (poorly) recognised utterances. We systematically varied the amount of dialogue context and the recogniser information (5-best list, sentence and word confidence scores) given to the subjects. The aim was to investigate which factors are important for error detection and correction.
The study indicates that the presence of speech recogniser confidence at word level and the 5-best list help the subjects to detect and correct recognition errors, and that a short context of about one utterance significantly increases their ability to detect, but not to correct, errors. A longer context, however, is not more helpful than a short one.
A round of tests took place in February and March 2003. Eight subjects each judged 4 × 3 × 15 (180) utterances. The data will probably be made available here in due course (we have to make it legible first), and the results have been published in the proceedings of Robust 2004 (see the Higgins Publications section). A PostScript version of a poster on the Guess experiment, presented at the CTT meeting on 2003-04-02, is available, as well as the PowerPoint teaser (in Swedish) that introduced the poster.
The recognised utterances presented to the subjects were chosen from a corpus of Map-Task-like dialogues collected by Gabriel Skantze in a previous test, reported in "Exploring Human Error Handling Strategies: Implications for Spoken Dialogue Systems" (abstract).
Copyright © 2002-2004 Jens Edlund, Gabriel Skantze and the members of the Center for Speech Technology