The teacher responsible for this lab is Gabriel Skantze. If you need help with the lab, send an email to gabriel@speech.kth.se, or come by my office with your laptop.
You will experiment with and extend an existing VoiceXML application. The application is a very simple travel booking system. The application is not connected to any database, so the reservation is never stored. You will use a commercial platform called Voxeo Evolution.
Download booking.xml (the VoiceXML application) and grammar.xml (the grammar of what can be said) and place them on a web server, for example where you have your personal web site or in your public Dropbox folder. Check them and see if you understand their structure.
Go to Voxeo Evolution and register for an account.
Create a new application (Development, Europe, VoiceXML, Premium ASR/TTS, LON - Prophecy 11 VoiceXML). Let the Voice URL point to the web address where you put booking.xml.
When you have created the application, you can see how to call the application under Contact Methods. You can either call the landline number (starting with 08), or use a SIP client, such as Blink.
Call a few times and say different things. Note that initially, only trips between Paris and London can be booked. Try to fill one slot at a time or several slots at once. Check what can be said in the grammar (which is initially very limited). Examples:
If the voice keeps interrupting itself, it is probably picking up its own voice or some noise. Try using a headset and/or lowering the volume of the speaker and/or microphone.
Use a text editor of your choice, preferably an XML-aware editor such as EditiX Lite, Eclipse or Microsoft Visual Studio. The files are linked to DTD files which can be used to check that the syntax is correct. If you use an XML editor that does syntax checking (i.e. checks that the document is both well-formed and valid against the DTD), you will avoid many of the most common errors. This matters because the Voxeo platform is not very helpful when it comes to debugging.
If everything works, you are ready for some exercises!
If the system doesn’t understand, it just says “Sorry, I don’t understand”. If the user is silent, it says “I’m sorry I can’t hear you”.
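One possible way to make these error messages more helpful is to escalate them with the count attribute of the VoiceXML nomatch and noinput handlers. The following is only a minimal sketch: the form id, the field name destination and all prompt wordings are assumptions for illustration and will not match booking.xml exactly.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <!-- Hypothetical form and field names; adapt them to booking.xml. -->
  <form id="booking_sketch">
    <field name="destination">
      <grammar src="grammar.xml" type="application/srgs+xml"/>
      <prompt>Where do you want to go?</prompt>

      <!-- First recognition failure: short apology, then repeat the question. -->
      <nomatch count="1">
        Sorry, I didn't catch that. Where do you want to go?
      </nomatch>
      <!-- Second and later failures: explain what the user can say. -->
      <nomatch count="2">
        I still didn't understand. Please say the name of a city, for example London.
      </nomatch>

      <!-- First silence: repeat the question. -->
      <noinput count="1">
        I didn't hear anything. Where do you want to go?
      </noinput>
      <!-- Second and later silences: give more explicit guidance. -->
      <noinput count="2">
        I still can't hear you. Please say the name of the city you want to travel to.
      </noinput>
    </field>
  </form>
</vxml>
```

The handler with the highest count that does not exceed the number of times the event has occurred is the one that runs, so the count="2" handlers also cover the third and later failures. Each handler repeats the question itself, because the field prompt is not replayed automatically after a catch handler unless a reprompt element is used.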
The grammar is limited. The user can only travel between Paris and London. Time can only be stated in hours, like “five o’clock”.
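As a rough illustration of how such a grammar could be extended, the sketch below uses the W3C SRGS XML format with more cities and with times such as “half past five”. It assumes the semantics/1.0 tag format; the rule names, the city list and the tag format are assumptions and may differ from what grammar.xml actually uses.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://www.w3.org/2001/06/grammar" version="1.0"
         xml:lang="en-US" mode="voice" root="city"
         tag-format="semantics/1.0">

  <!-- Hypothetical city rule: one alternative per city. -->
  <rule id="city" scope="public">
    <one-of>
      <item>Paris <tag>out = "Paris";</tag></item>
      <item>London <tag>out = "London";</tag></item>
      <item>Stockholm <tag>out = "Stockholm";</tag></item>
      <item>Berlin <tag>out = "Berlin";</tag></item>
    </one-of>
  </rule>

  <!-- Helper rule for the hour; further hours follow the same pattern. -->
  <rule id="hour">
    <one-of>
      <item>four <tag>out = "4";</tag></item>
      <item>five <tag>out = "5";</tag></item>
      <item>six <tag>out = "6";</tag></item>
    </one-of>
  </rule>

  <!-- Hypothetical time rule: whole hours and half hours. -->
  <rule id="time" scope="public">
    <one-of>
      <item>
        <ruleref uri="#hour"/> o'clock
        <tag>out = rules.hour + ":00";</tag>
      </item>
      <item>
        half past <ruleref uri="#hour"/>
        <tag>out = rules.hour + ":30";</tag>
      </item>
    </one-of>
  </rule>
</grammar>
```

With this layout, a field could reference a specific public rule by adding a fragment to the grammar URL, for example grammar.xml#time. Check how the existing grammar returns its values before copying the tag contents, since the application reads the slots from them.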
After the booking is done, it is confirmed by the system, but the user has no chance of correcting it.
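One way to let the user correct the booking is to add a yes/no confirmation field and clear the filled fields when the answer is no. This is a minimal sketch under assumptions: the field names destination and confirm are hypothetical, and it relies on the platform supporting the built-in boolean field type.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <!-- Hypothetical form; only one slot is shown to keep the sketch short. -->
  <form id="booking_sketch">
    <field name="destination">
      <grammar src="grammar.xml" type="application/srgs+xml"/>
      <prompt>Where do you want to go?</prompt>
    </field>

    <!-- Built-in yes/no field: the value is true for yes and false for no. -->
    <field name="confirm" type="boolean">
      <prompt>
        You want to travel to <value expr="destination"/>. Is that correct?
      </prompt>
      <filled>
        <if cond="confirm">
          <prompt>Your trip is booked.</prompt>
        <else/>
          <!-- Reset both fields so the form asks for the destination again. -->
          <clear namelist="destination confirm"/>
        </if>
      </filled>
    </field>
  </form>
</vxml>
```

Clearing every slot is the simplest strategy but forces the user to start over. An alternative design would ask which slot was wrong and clear only that one.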
The speech synthesis quality could possibly be improved.
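Prompts in VoiceXML may contain SSML markup, which is one place to start when experimenting with synthesis quality. The sketch below shows pauses, emphasis and a slower speaking rate. The prompt wording is made up, and how much each element changes the output depends on the synthesizer.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <form id="ssml_sketch">
    <block>
      <prompt>
        <!-- Emphasize the slot values so they stand out. -->
        Your trip from <emphasis>Paris</emphasis>
        to <emphasis>London</emphasis>
        <!-- Short pause before the departure time. -->
        <break time="300ms"/>
        leaves at
        <!-- Speak the time more slowly than the rest of the sentence. -->
        <prosody rate="slow">five o'clock</prosody>.
      </prompt>
    </block>
  </form>
</vxml>
```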
Write a report about the lab. Include the final vxml and grammar files. Discuss the following issues:
In your discussion, relate your observations to the content of the lectures and, where relevant, to the literature.
The report should be at least 2 pages (normal formatting), excluding the XML code.