Sber’s unique neural network can create texts in 61 languages

The SberDevices team, part of the SberBank ecosystem, has announced the launch of a multilingual version of the GPT-3 neural network. The model, called mGPT, can generate texts in 61 languages, including languages of the peoples of Russia and the CIS.

As the press service notes, mGPT is the world’s first generative model to support this many languages. It is available in two versions: a basic one, with 1.3 billion parameters, publicly available in the SberDisk cloud storage, and an extended one, with 13 billion parameters, which will soon be available on the ML Space machine learning platform from SberCloud.

The mGPT model can be used for plain text generation as well as for a variety of natural language processing tasks in any of the supported languages, either through fine-tuning or as part of model ensembles.

For example, an automated system can be taught to answer questions, determine the sentiment of a text, or extract first names, surnames, company names, and the like. The model can also be used as a component of various speech technologies – for example, to improve speech recognition quality or to generate scripts for dialog systems.

SberDevices head Denis Filippov said:

“In 2020, we introduced a Russian-language version of the GPT-3 neural network, which is used in two virtual assistants of Sber’s Salyut family, Joy and Athena. We have continued to develop our NLP technologies and have now introduced the mGPT model, which supports more than 60 languages; for many of them, no generative models existed before. Among other things, this will be our contribution to the preservation and development of the languages of the peoples of Russia: mGPT can generate texts in Tatar or Yakut, for example.”

Full list of languages available in the mGPT model: Azerbaijani, English, Arabic, Armenian, Afrikaans, Basque, Bashkir, Belarusian, Bengali, Burmese, Bulgarian, Buryat, Hungarian, Dutch, Greek, Georgian, Danish, Hebrew, Indonesian, Spanish, Italian, Yoruba, Kazakh, Kalmyk, Kyrgyz, Chinese, Korean, Latvian, Vietnamese, Lithuanian, Malay, Malayalam, Marathi, Moldovan, Mongolian, German, Ossetian, Persian, Polish, Portuguese, Romanian, Russian, Swahili, Tajik, Thai, Tamil, Tatar, Telugu, Tuvan, Turkish, Turkmen, Uzbek, Ukrainian, Urdu, Finnish, French, Hindi, Chuvash, Swedish, Yakut, Japanese.
