Machine Learning Times
Introducing Speech-to-Text, Text-to-Speech, and More for 1,100+ Languages

 
Originally published in Meta AI, May 22, 2023.

Equipping machines with the ability to recognize and produce speech can make information accessible to many more people, including those who rely entirely on voice to access information. However, producing good-quality machine learning models for these tasks requires large amounts of labeled data — in this case, many thousands of hours of audio, along with transcriptions. For most languages, this data simply does not exist. For example, existing speech recognition models only cover approximately 100 languages — a fraction of the 7,000+ known languages spoken on the planet. Even more concerning, nearly half of these languages are in danger of disappearing in our lifetime.

In the Massively Multilingual Speech (MMS) project, we overcome some of these challenges by combining wav2vec 2.0, our pioneering work in self-supervised learning, and a new dataset that provides labeled data for over 1,100 languages and unlabeled data for nearly 4,000 languages. Some of these, such as the Tatuyo language, have only a few hundred speakers, and for most of these languages, no prior speech technology exists. Our results show that the Massively Multilingual Speech models outperform existing models and cover 10 times as many languages. Meta is focused on multilinguality in general: For text, the NLLB project scaled multilingual translation to 200 languages, and the Massively Multilingual Speech project scales speech technology to many more languages.

