Chosen Architecture

We chose to develop our own, very simple architecture to get a lightweight, easy-to-use system that was not limited by commercial licensing constraints. Some aspects of the architecture were inspired by SRI's Open Agent Architecture (OAA).

Simplicity was a guiding principle of the development. It minimizes development time, makes the system easy to use, and should improve robustness through lower complexity.

Communication uses only text (no binary data), both to achieve platform independence and to facilitate debugging.

We also wanted servers to be started automatically when needed, and restarted if they had crashed.

This last requirement led to the use of a central server, named the Broker, which is similar to the OAA blackboard. The Broker routes calls between modules. It knows which servers exist and how to start them.
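The start-and-restart bookkeeping described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the Broker's actual implementation: the class name, method names, and the use of child processes are all hypothetical, and call routing is left out entirely.

```python
import subprocess
import sys


class Broker:
    """Toy broker: knows which servers exist and how to start them.

    Hypothetical sketch; names and structure are assumptions.
    """

    def __init__(self):
        self.start_commands = {}  # server name -> command line that launches it
        self.running = {}         # server name -> subprocess.Popen handle

    def register(self, name, command):
        """Record how to start a server, without starting it yet."""
        self.start_commands[name] = command

    def ensure_running(self, name):
        """Start the server if it was never started or has exited (crashed)."""
        proc = self.running.get(name)
        if proc is None or proc.poll() is not None:
            proc = subprocess.Popen(self.start_commands[name])
            self.running[name] = proc
        return proc

    def shutdown(self):
        """Stop every server that is still alive."""
        for proc in self.running.values():
            if proc.poll() is None:
                proc.kill()
            proc.wait()


def demo():
    """Register a placeholder server, start it on demand, then stop it."""
    broker = Broker()
    # The "server" here is just a Python process that sleeps.
    broker.register("sleeper", [sys.executable, "-c", "import time; time.sleep(30)"])
    proc = broker.ensure_running("sleeper")   # started on first use
    still_alive = proc.poll() is None
    broker.shutdown()
    return still_alive
```

In this sketch a broker would call ensure_running before routing a call to a server, so a crashed server is relaunched the next time it is needed.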

All other programs connect to this server using the Transmission Control Protocol (TCP). The use of text communication and TCP has made it very easy to implement Broker components in the scripting language Tcl. It is possible for a person to connect to the Broker with Telnet and interact manually with the Broker modules.
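To illustrate why line-oriented text over TCP is easy to work with, here is a small sketch of a server that answers each text line with a text line, and a client that sends one line. The reply format ("received ...") and all function names are placeholders invented for this example; they are not the Broker's protocol.

```python
import socket
import socketserver
import threading


class LineHandler(socketserver.StreamRequestHandler):
    """Answers each incoming text line with one text line (placeholder behaviour)."""

    def handle(self):
        for raw in self.rfile:  # one request per line
            text = raw.decode("ascii", "replace").strip()
            self.wfile.write(f"received {text}\n".encode("ascii", "replace"))


def start_server():
    """Listen on a free localhost port and serve in a background thread."""
    server = socketserver.ThreadingTCPServer(("127.0.0.1", 0), LineHandler)
    server.daemon_threads = True
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def send_line(port, text):
    """Connect, send one line of text, and return the one-line reply."""
    with socket.create_connection(("127.0.0.1", port)) as conn, \
            conn.makefile("r", encoding="ascii") as reader:
        conn.sendall((text + "\n").encode("ascii"))
        return reader.readline().rstrip("\n")
```

Because every message is a line of plain text, the same server could be exercised by hand with Telnet against the chosen port, typing a line and reading the reply.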


The main disadvantage of the chosen architecture is that it has a central server that all communication must go through. For example, if two modules run in the United States but communicate through a Broker in Europe, every message between them must cross the Atlantic twice.

We do not expect this to be a problem for our implementations, since in most systems all but one module (the speech technology servers) will run at one location (our department), and remote clients will need to communicate only with our department, not with each other.

It is also possible for modules to agree to communicate directly with each other, bypassing the Broker. This is done, for instance, with binary data such as the speech input to the recognizer. The speech recognizer opens a server socket every time an utterance is to be recognized, and the client sends the speech data directly to the recognizer. This is necessary since the Broker does not handle binary data and is only intended for small messages.
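The per-utterance direct connection can be sketched roughly as below. This is an assumption-laden illustration: the function names are hypothetical, the audio is treated as an opaque byte string, and how the client learns the port number is not shown (one plausible option would be a small text message via the Broker, but the source does not say).

```python
import socket
import threading


def open_utterance_socket(result):
    """Recognizer side: open a one-shot server socket for a single utterance.

    Returns the port to advertise and the thread that receives the audio.
    The received bytes are stored in result["audio"].
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    def receive():
        conn, _ = listener.accept()
        with conn, listener:  # both sockets are closed after this utterance
            chunks = []
            while True:
                chunk = conn.recv(4096)
                if not chunk:  # client closed the connection: utterance complete
                    break
                chunks.append(chunk)
        result["audio"] = b"".join(chunks)

    worker = threading.Thread(target=receive, daemon=True)
    worker.start()
    return port, worker


def send_utterance(port, audio):
    """Client side: send the raw audio bytes straight to the recognizer."""
    with socket.create_connection(("127.0.0.1", port)) as conn:
        conn.sendall(audio)
```

The listening socket lives only for one utterance, which mirrors the description above: a new server socket is opened each time something is to be recognized.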

Having one central server also influences the robustness of the system. If the Broker or the network connection to the Broker fails, the entire system will go down. However, we expect the simplicity of the Broker to keep it stable over long periods of time. Moreover, since all our servers are located in one place, a network failure affecting our servers would render the system inoperable even if the modules communicated directly with each other in a more distributed fashion.

The system is vulnerable to security attacks: all communication is unencrypted and unsigned. Anyone can observe the traffic between the modules through the debugging facilities.

Because the system is used in a research setting, we see no motive for attacking it.

Servers requiring higher security may implement security services (digital signatures, encryption, etc.) themselves, without involving the Broker. Such peer-to-peer security is often preferable to link-level security, as the servers do not need to trust the Broker.


© Copyright 1997 Erland Lewin and the members of the Center for Speech Technology