Generation of speech and facial animation with controllable articulatory effort for amusing conversational characters

Joakim Gustafson, Éva Székely and Jonas Beskow

The annual conference on Intelligent Virtual Agents (IVA'23), Würzburg, Germany

Paper in PDF

An example flight reservation dialogue generated with GPT-4 and read by the Furhat robot using the KTH multimodal TTS with controllable effort

10 word puns and cynical comments generated with GPT-4 and read by the Furhat robot
using either the KTH conversational TTS with style and prosody control and the KTH lipsync with controllable articulatory effort,
or the Amazon Polly Matthew neural TTS and the built-in Furhat lipsync.


Word | Conversational TTS | Commercial TTS

Fail-forward

Crisitunity

Snarktastic

Ejectocouch

Procrastinatron 3000

Awkward-Silence Filler

Schrödinger's Socks

Hummus Sapien

ChronoComical Cat

Philosopher's Stoned

AcciDelight Cake