If the name data2vec sounds familiar, that’s probably because it made quite a splash on social and even traditional media when it came out, about two months ago. It’s an important entry in what is now a growing list of strategies that are focused on creating individual machine learning architectures that handle many different data types, like text, image and speech.
Most self-supervised learning techniques involve getting a model to take some input data (say, an image or a piece of text) and mask out certain components of those inputs (say by blacking out pixels or words) in order to get the models to predict those masked out components.
That “filling in the blanks” task is hard enough to force AIs to learn facts about their data that generalize well, but it also means training models to perform tasks that are very different depending on the input data type. Filling in blacked out pixels is quite different from filling in blanks in a sentence, for example.
So what if there was a way to come up with one task that we could use to train machine learning models on any kind of data? That’s where data2vec comes in.
For this episode of the podcast, I’m joined by Alexei Baevski, a researcher at Meta AI one of the creators of data2vec. In addition to data2vec, Alexei has been involved in quite a bit of pioneering work on text and speech models, including wav2vec, Facebook’s widely publicized unsupervised speech model. Alexei joined me to talk about how data2vec works and what’s next for that research direction, as well as the future of multi-modal learning.
***
Intro music:
- Artist: Ron Gelinas
- Track Title: Daybreak Chill Blend (original mix)
- Link to Track: https://youtu.be/d8Y2sKIgFWc
***
Stuff You Should Know
If you've ever wanted to know about champagne, satanism, the Stonewall Uprising, chaos theory, LSD, El Nino, true crime and Rosa Parks, then look no further. Josh and Chuck have you covered.
Dateline NBC
Current and classic episodes, featuring compelling true-crime mysteries, powerful documentaries and in-depth investigations. Follow now to get the latest episodes of Dateline NBC completely free, or subscribe to Dateline Premium for ad-free listening and exclusive bonus content: DatelinePremium.com
Betrayal: Weekly
Betrayal Weekly is back for a brand new season. Every Thursday, Betrayal Weekly shares first-hand accounts of broken trust, shocking deceptions, and the trail of destruction they leave behind. Hosted by Andrea Gunning, this weekly ongoing series digs into real-life stories of betrayal and the aftermath. From stories of double lives to dark discoveries, these are cautionary tales and accounts of resilience against all odds. From the producers of the critically acclaimed Betrayal series, Betrayal Weekly drops new episodes every Thursday. Please join our Substack for additional exclusive content, curated book recommendations and community discussions. Sign up FREE by clicking this link Beyond Betrayal Substack. Join our community dedicated to truth, resilience and healing. Your voice matters! Be a part of our Betrayal journey on Substack. And make sure to check out Seasons 1-4 of Betrayal, along with Betrayal Weekly Season 1.