Wikispeech is an open source project that will draw on crowdsourced contributions to make Wikipedia more accessible. It will add text-to-speech synthesis so that users of the online encyclopedia can have portions of the text read out to them.

We will build an open framework into which any open source speech synthesizer can be plugged. Because the framework is built from open source modules, it will also be possible to add or substitute individual modules in the text-to-speech (TTS) system.
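
The plug-in idea can be illustrated with a minimal sketch. It is not the project's actual design: the names `Synthesizer`, `SynthesizerRegistry` and `dummy_backend` are hypothetical and chosen only for illustration, and the placeholder backend returns encoded text rather than real audio.

```python
from typing import Callable, Dict

# Hypothetical type: a synthesizer backend maps input text to audio bytes.
Synthesizer = Callable[[str], bytes]


class SynthesizerRegistry:
    """Keeps named synthesizer backends so they can be added or swapped."""

    def __init__(self) -> None:
        self._backends: Dict[str, Synthesizer] = {}

    def register(self, name: str, backend: Synthesizer) -> None:
        # Registering under an existing name substitutes the old backend.
        self._backends[name] = backend

    def synthesize(self, name: str, text: str) -> bytes:
        if name not in self._backends:
            raise KeyError(f"no synthesizer registered as {name!r}")
        return self._backends[name](text)


def dummy_backend(text: str) -> bytes:
    # Placeholder standing in for a real engine (assumption):
    # it returns the UTF-8 encoded text, not audio.
    return text.encode("utf-8")
```

In a sketch like this, substituting one engine for another amounts to calling `register` again under the same name, so code that calls `synthesize` does not need to change.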

The crowdsourcing element will let wiki users either report sentences that sound wrong or correct the sentences themselves. Making corrections will require some linguistic knowledge, because it involves editing the pronunciation dictionary using phonetic transcription. The text-to-speech platform will be developed in Swedish first, followed by a “basic English voice” and, finally, a planned “proof of concept” version in Arabic.

The Wikispeech pilot project is a collaboration between KTH, the Swedish Post and Telecom Authority, Wikimedia Sweden and STTS. The group has already conducted a pilot study. Wikimedia Sweden, which initiated the project, estimates that a quarter of all Wikipedia users, or nearly 125 million people per month, “need or prefer” text in spoken form, whether for reasons of literacy or visual impairment.


Joakim Gustafson (Project leader)
Jonas Beskow
Jens Edlund
Zofia Malisz

Funding: PTS - Post och Telestyrelsen

Duration: 2016 - 2018

Published by: TMH, Speech, Music and Hearing

Last updated: 2012-11-09