Java-based Incremental Dialog Framework

Note: Jindigo has not been further developed for a while. The reason for this is that Gabriel Skantze is working on another dialogue system framework called IrisTK. It is designed to be much easier to use and has much better support for multi-modal face-to-face interaction.


An incremental dialog system processes its input and output word by word instead of utterance by utterance. Among other things, this allows more natural, rapid turn-taking.

Jindigo is a framework for developing and experimenting with incremental spoken dialog systems, developed at the Department of Speech, Music and Hearing, KTH. Some features:

  • Java-based, platform independent
  • Open source
  • Modular and extensible. Comes with a set of built-in modules (all pure Java) for building a complete spoken dialog system:
    • Speech recognizer (CMU Sphinx 4)
    • Parser & Semantic interpreter
    • Context modeler
    • Speech synthesizer (MaryTTS)
    • Inspector for monitoring and understanding how information flows between modules.
  • Incremental!
    • All modules and data types are built on a general model of incremental processing.
    • Revision is fully supported in all modules, e.g. the speech recognizer might output a hypothesis and then later revoke it.
    • Of course, non-incremental systems can be implemented as well.
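
The add-then-revoke behaviour described above can be illustrated with a minimal sketch. This is not Jindigo's actual API: the names IncrementalSketch, Edit, EditType and WordBuffer are hypothetical and chosen only for this example. It assumes that a module sends word-level edits downstream and that the receiving module updates its state as each edit arrives.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: these classes are hypothetical and not part of Jindigo.
public class IncrementalSketch {

    /** Kind of update a module sends downstream. */
    public enum EditType { ADD, REVOKE }

    /** One word-level update: a new hypothesis or the retraction of an earlier one. */
    public static final class Edit {
        final EditType type;
        final String word;

        public Edit(EditType type, String word) {
            this.type = type;
            this.word = word;
        }
    }

    /** A hypothetical downstream module that tracks the current word hypotheses. */
    public static final class WordBuffer {
        private final List<String> words = new ArrayList<>();

        /** Applies one edit as soon as it arrives, without waiting for the utterance to end. */
        public void apply(Edit edit) {
            if (edit.type == EditType.ADD) {
                words.add(edit.word);
            } else {
                // Revoke the most recent matching hypothesis, if there is one.
                int index = words.lastIndexOf(edit.word);
                if (index >= 0) {
                    words.remove(index);
                }
            }
        }

        /** Returns the words currently believed to have been spoken. */
        public String current() {
            return String.join(" ", words);
        }
    }

    public static void main(String[] args) {
        WordBuffer buffer = new WordBuffer();
        Edit[] edits = {
            new Edit(EditType.ADD, "move"),
            new Edit(EditType.ADD, "the"),
            new Edit(EditType.ADD, "night"),    // early, mistaken hypothesis
            new Edit(EditType.REVOKE, "night"), // the recognizer changes its mind
            new Edit(EditType.ADD, "knight"),
        };
        for (Edit edit : edits) {
            buffer.apply(edit);
            System.out.println(edit.type + " " + edit.word + " -> " + buffer.current());
        }
    }
}
```

Running main prints the buffer after every edit, so the mistaken word appears briefly and then disappears; the final state is "move the knight". A non-incremental system corresponds to the special case where all edits are delivered at once at the end of the utterance.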

Video Example

This is an example video of a Jindigo application running:


You can test the Chess example application (shown above) yourself with Java Web Start:

  • Best: If you have an English Windows OS (Vista or 7, with the latest .NET version), you can try the native speech recognizer. [RUN Windows version]
  • Otherwise: You can run the system with the CMU Sphinx recognizer. This will be slower to start and the performance may vary a lot depending on your microphone and accent. [RUN Sphinx version]


Jindigo is still at a very early stage of development, but you can download a package with the latest binaries and source code from SourceForge.

Note: Although Jindigo should in principle be platform independent, it has so far been tested mostly on Windows.

The distribution comes with the speech-controlled chess board shown above. To run it, you need a running MaryTTS version 4.1 server (by default on the same machine, but this can easily be configured).


Documentation has yet to be written. In the meantime, you can check out the (as yet poorly annotated) Javadoc.


Source code can be checked out from the SVN repository at SourceForge.


The Jindigo project is maintained by Gabriel Skantze. Email: gabriel@speech.kth.se


Schlangen, D., & Skantze, G. (2011). A General, Abstract Model of Incremental Dialogue Processing. Dialogue & Discourse, 2(1), 83-111. [pdf]

Schlangen, D., Baumann, T., Buschmeier, H., Buss, O., Kopp, S., Skantze, G., & Yaghoubzadeh, R. (2010). Middleware for Incremental Processing in Conversational Agents. In Proceedings of SIGdial. Tokyo, Japan. [pdf]

Skantze, G. (2010). Jindigo: a Java-based Framework for Incremental Dialogue Systems. Technical Report, KTH, Stockholm, Sweden. [pdf]

Skantze, G., & Hjalmarsson, A. (2010). Towards Incremental Speech Generation in Dialogue Systems. In Proceedings of SIGdial (pp. 1-8). Tokyo, Japan. (*) [abstract] [pdf]

(*) Best Paper Award at SIGdial 2010

Schlangen, D., & Skantze, G. (2009). A general, abstract model of incremental dialogue processing. In Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics (EACL-09). Athens, Greece. [abstract] [pdf]