It is becoming more common to interact with technology by voice. For example, in the car we ask the device to call someone and it places the call automatically, or we ask our mobile phone for directions or the weather. The technology that converts spoken words into text is known as speech recognition, and it is becoming increasingly important.
There are already several popular voice technology tools: Google Assistant, Apple Siri, Microsoft Cortana, Amazon Alexa, etc. However, these tools can only be used in the major languages. Big corporations decide which languages to add based on their business interests, and most of the world's languages are not profitable for them. As a result, speakers of those languages cannot use these tools in their own language.
More broadly, commercial voice technologies have several problems:
- They can only be used in the most widely spoken languages. The big companies behind them look only at the economic benefits, and small languages are not viable for them. If there is no alternative, Basques will have to speak Spanish, French or English to televisions, mobile phones and other devices. The same will happen to the hundreds of millions of speakers of other small languages around the world.
- They do not take into account the diversity of voices. Speaking a major language does not ensure that the device understands your voice. If you speak with an accent or if you are a woman, you will have more difficulty, since historically most of the voices used to train these engines have been those of white middle-class men.
- Privacy problems. What we say to the devices is sent to servers. Although the companies all denied it, staff leaks have revealed that in some cases people have listened to what we tell voice assistants, including very private conversations such as those with a doctor.
- They are not free software. We cannot adapt them to our needs, and the voices and resources used to create these technologies cannot be reused, for example, to develop Basque technology.
Common Voice wants to do things differently
Common Voice is a project created by the Mozilla Foundation. Mozilla develops several popular projects, including the Firefox browser and the Thunderbird email client. Mozilla’s goal is to “ensure that the Internet is a public, open and accessible resource for all.” They also created the Common Voice project to “help make speech recognition accessible to all”.
Common Voice has several advantages when compared to proprietary speech recognition technologies:
- It aims to expand into as many languages as possible. All the languages of the world can participate. Of course, Basque is also involved.
- It takes into account the diversity of voices. The goal is to achieve balance: different accents, male, female, young, old...
- Privacy is a priority. No one knows who recorded the audio. If you wish, you can participate anonymously or provide some basic information.
- The completed text and voice collections are available under free licenses. As a result, the sector will be democratized: anyone can use them to develop their own tools, even small businesses that would otherwise not have the opportunity to do so.
Common Voice in Basque
Mozilla started the Common Voice project in 2017. Seeing it as a strategic project for Basque, the Librezale team took on the responsibility of adding our language to it, and the following year they carried out two tasks to launch Common Voice in Basque. First, they translated the project website so that participants would feel comfortable in their own language. Second, 5,000 sentences had to be written. It should be noted that these sentences must be in the public domain or under the equivalent Creative Commons Zero license.
The members of Librezale wrote more than 2,000 sentences by hand. Seeing that they were far from the required amount, they asked Argia (a Basque media outlet) for help, and the collection was completed with almost 4,000 sentences released into the public domain for this project. In this way we Basques cleared the minimum of 5,000 sentences required by Mozilla.
Once Basque had been added to the Common Voice project, Librezale issued a call for digital community work. In the years since, thousands of people have taken part, either on their own or in one of the several recording marathons that have been held. Having many recordings of the same sentences does not improve the result, so more written sentences were needed, and Librezale obtained them from the Basque Wikipedia.
In the autumn of 2023, the Basque Government's Vice-Ministry of Language Policy decided to give impetus to the strategic project Gaitu.eus, which aims to develop free Basque voice technology. Since then, the number of hours recorded in Basque has increased significantly. EITB also provided written sentences to the Basque Government, and there are currently about 160,000 sentences available for the public to read.
Common Voice in Basque: data
At the moment there are more than 660 hours of recordings, but much more is needed to build good-quality speech recognition. Recordings must be validated after they have been made, but in the case of Basque only half of them have been validated by volunteers. It is therefore necessary to promote the work of checking that the existing recordings have been read correctly, since only validated recordings are later used for speech recognition. To date, more than 10,800 Basques have given their voice to advance Common Voice.
In terms of hours recorded, Basque ranks thirteenth in Common Voice. Considering the number of Basque speakers, we are doing well, but we are sure that together we can do better.
Strong minority languages in Common Voice
At the time of writing this article, in January 2025, Common Voice recordings are being made in 223 languages and another 97 languages are in preparation. The table shows that first place in the ranking by recorded hours goes to a stateless language: Catalan. English is in second place and Kinyarwanda in third. Although Kinyarwanda is an official language of Rwanda, corporations leave it out of their tools and services.
In fact, the top twenty places of the ranking include many languages that, like Kinyarwanda, are disadvantaged or marginalized: Esperanto, Bengali, Swahili, Kabyle (an Amazigh language of North Africa), Luganda, Persian, Tamil, Thai and Uyghur. It is clear that several linguistic communities want to take advantage of an opportunity that is not normally given to them.
How to participate
Taking part is very easy. Go to the website gaitu.eus and click on the “Click here and give your voice” button. The Common Voice website will appear. There are two main things you can do there: speak, to record your voice, and listen, to help validate others’ recordings.
Making recordings is very simple. Press the record button before you start reading aloud, and stop the recording when you are done. When you have completed five sentences, click the Send button.
There is no mystery to validating recordings made by others either. Press the play button and listen to the sentence. If the voice reads exactly what the text says, press the Yes button; otherwise, press the No button.
Frequently Asked Questions
We will conclude by answering some of the questions that participants often have.
Do I need a studio or special infrastructure to make recordings?
No, nothing extra is needed. To make recordings, a mobile phone or a computer with an ordinary microphone is enough.
Should the sound quality be high?
No. We need ordinary voice recordings made with everyday devices. Absolute silence is not required, but avoid excessive background noise.
Can I read in dialect?
No. For the moment, it has to be in standard (unified) Basque. Natural accents, intonations and styles are welcome, but you have to read what is written in the text.
Phonological variants are allowed: for example, the word for “geology” can be read with either a “g” or a “j” sound (“jeoloji”), and the word for “stick” likewise has accepted pronunciation variants. No other changes are allowed, however. For example, people tend to say a colloquial form of “I have” instead of the form written in the text. This should be avoided, because it confuses the artificial intelligence trained on this data. Therefore, when validating others’ recordings, such readings should be marked as incorrect.
The Importance of Media in Common Voice
As explained, launching the project required thousands of written sentences in the public domain. Argia wanted to push the project forward and made its contribution. More recently, EITB has made a significant contribution by releasing a number of sentences into the public domain. If other media outlets were also encouraged to contribute written sentences, they would make an invaluable contribution to this beautiful Common Voice project.
Together, we Basques are capable of doing amazing things. We have shown it more than once, and we will show it again in the future. We Basques deserve speech recognition in our own language, not only in foreign ones. Thank you, and let us each, in our own way, help promote Common Voice in Basque.