Ten days after announcing the integration of ten new African languages into its automatic translator, Google Translate, the American company has put them online. Wolof and Fula are not among them.
Google has added 24 new languages to Google Translate. Ten of them are African. This is an important development, because it will allow entire websites to be translated into Lingala, Twi, Tigrinya, Sepedi, Oromo, Bambara, Luganda, Krio, Ewe and Tsonga. Or, more simply, it will let people from several African countries learn local languages.
According to Google project manager Isaac Caswell, one of the goals of these additions is "to support native languages, often overlooked by technology." "Until a few years ago, it was not technologically possible to add languages like these, which are called 'low-resource', meaning that there aren't a lot of textual resources," says the engineer.
Indeed, it is thanks to a new artificial intelligence technique, Zero-shot Neural Machine Translation (NMT), that translation software can now support languages that are little used on the Internet and little documented in general. As Google Translate itself shows, however, translations improve mainly through user participation. Linguists note many faults in the translations for the 24 new languages, of course. But even for Romance languages, automatic translation is still far from perfect. Still, Google Translate went online in 2006, and translation has improved a lot since then.
Fula and Wolof, Google's forgotten greats
But Google's selection of African languages poses a problem. Since Google's invention of Zero-shot NMT, the scarcity of written sources in a language is no longer an obstacle. Lingala, Luganda, Bambara, Oromo, Ewe and Twi are widely spoken in Africa and widely used on the internet. Tsonga and Krio, on the other hand, have far fewer speakers than Fula or Wolof.
More broadly, native West African languages are among the few languages for which there is no automatic translator. If the objective criteria Google uses to choose a language are demographics and connectivity, then Fula and Wolof seem to have been ignored by linguists and developers.
Beyond the African additions, some of the Asian and Latin American languages that Google Translate supports have only a few thousand speakers. Sanskrit, for example, is spoken by only 20 people in one part of India. Dhivehi, the language of the Maldives, is spoken by only 000 people. Other languages chosen by Google have only 300-000 million native speakers. Most are not written languages, and they lack as many resources as Fula and Wolof, probably more. Among the 2 languages added by Google Translate, 3 languages represent, cumulatively, 24 million speakers, against 15 million for Fula alone.
As for connectivity, a Harvard University database shows that 62% of Fula speakers use the internet. For Wolof, this rate rises to 80%, representing 16.5 million people.