Google's Universal Speech Model paves the way for an AI language model supporting 1,000 languages
Google's Bard Chatbot Competes with OpenAI's ChatGPT and Microsoft's Bing
7 March 2023 | Kunal Tyagi
Google has announced the Universal Speech Model (USM), a step toward an AI model that supports up to 1,000 distinct languages.
USM can adapt to new languages and data because it has been pre-trained on 12 million hours of speech and 28 billion sentences of text covering more than 300 languages.
The response to Google's debut of Bard, a chatbot that competes with OpenAI's ChatGPT, has been muted.
Google has shared further details about the Universal Speech Model (USM), a crucial stage in building an AI language model that can support up to 1,000 distinct languages. USM is a family of speech models with 2 billion parameters, trained on 12 million hours of speech and 28 billion sentences of text in more than 300 languages. USM currently supports over 100 languages and will serve as the building block for the larger system.
USM is used mainly to generate subtitles for YouTube videos and was built to understand hundreds of spoken languages. Designing tools for languages with few available written examples is difficult because machine translation models need large amounts of data for training. Because USM was pre-trained on a large corpus of data, it can adapt to new languages and data.
Some of these languages are spoken by fewer than 20 million people worldwide, which, according to Google, makes it "extremely difficult to find the necessary training data." In a recent paper, Google Research reports that fine-tuning the model's encoder on a smaller set of labeled data enabled the model to recognize under-represented languages.
As part of its ambition to develop an AI model that can handle 1,000 spoken languages, Google called the new model an "important first step" that offers greater inclusion to billions of people in underserved regions around the world. The Google researchers stated, "We think USM's base model architecture and training pipeline represent a foundation on which we may build to expand speech modeling to the next 1,000 languages."
Employees have complained that the rollout of Google's Bard, a chatbot that competes with OpenAI's ChatGPT, was "botched," and the debut has drawn lukewarm reviews. In a company-wide email, Sundar Pichai asked staff members to spend time testing Bard. According to Google, LaMDA, the technology underlying Bard, will be integrated into Google Search. It will let users search with questions and get results based on search engine optimization, with a separate chat box providing conversational answers drawn from web results.
Last year, Meta said that it had created the first AI model able to translate 200 distinct languages. Google has renewed its focus on AI since ChatGPT upended the market in recent months. Google unveiled its Bard chatbot at about the same time that Microsoft announced it would use ChatGPT to improve its Bing search engine.