An Infrastructure for Swedish language technology

Research and development on language technology needs an infrastructure of publicly available and standardized basic resources.

Such resources can be data or programs to process and use the data. A collection of such basic resources is called a "Basic Language Resource Kit", or BLARK. This project is part of a national venture to develop an infrastructure for Swedish language technology, which is strongly supported by the language technology community in Sweden.

A BLARK has to be created for each language separately. For Swedish, there are several resources, but it is unclear what type they are and whether they are available. Therefore, we need to make an inventory of the existing language resources and describe them. It is also necessary to inventory the need for such resources for future development. The goal of our ongoing work is to prepare for the creation of an infrastructure for Swedish language technology. To make the Swedish BLARK as useful as possible, it is of great importance that everybody working with Swedish language technology participates in the inventory process. In the future, we plan to survey Finnish, Yiddish, Meänkieli, Romani chib, and Sami, which are official minority languages in Sweden.

The project is carried out in three phases. In the first phase, we have collected information about language resources from existing institutions, industry and individuals who use language technology in their work, and investigated their needs for Swedish language resources. In the next phase, we have defined which resources should be part of a Swedish BLARK, and listed the specific resources as well as the needs. Lastly, we apply for support from funding agencies to build up these language resources.

This project is part of the larger project called "An Infrastructure for Swedish language technology". It is carried out in co-operation with KTH, Göteborg University and Linköping University, and co-ordinated by Lars Borin at Göteborg University.

Group: Speech Communication and Technology

Kjell Elenius (Project leader)
Rolf Carlson

Funding: VR

Duration: 2007-01-01 - 2008-12-31


Related publications:


Elenius, K., Forsbom, E., & Megyesi, B. (2008). Survey on Swedish Language Resources. Technical Report, Speech, Music and Hearing, KTH and Department of Linguistics and Philology, Uppsala University. [pdf]

