All Episodes

May 18, 2022 51 mins

There’s a website called thispersondoesnotexist.com. When you visit it, you’re confronted by a high-resolution, photorealistic AI-generated picture of a human face. As the website’s name suggests, there’s no human being on the face of the earth who looks quite like the person staring back at you on the page.

Each of those generated pictures are a piece of data that captures so much of the essence of what it means to look like a human being. And yet they do so without telling you anything whatsoever about any particular person. In that sense, it’s fully anonymous human face data.

That’s impressive enough, and it speaks to how far generative image models have come over the last decade. But what if we could do the same for any kind of data?

What if I could generate an anonymized set of medical records or financial transaction data that captures all of the latent relationships buried  in a private dataset, without the risk of leaking sensitive information about real people? That’s the mission of Alex Watson, the Chief Product Officer and co-founder of Gretel AI, where he works on unlocking value hidden in sensitive datasets in ways that preserve privacy.

What I realized talking to Alex was that synthetic data is about much more than ensuring privacy. As you’ll see over the course of the conversation, we may well be heading for a world where most data can benefit from augmentation via data synthesis — where synthetic data brings privacy value almost as a side-effect of enriching ground truth data with context imported from the wider world.

Alex joined me to talk about data privacy, data synthesis, and what could be the very strange future of the data lifecycle on this episode of the TDS podcast.

***

Intro music:

- Artist: Ron Gelinas

- Track Title: Daybreak Chill Blend (original mix)

- Link to Track: https://youtu.be/d8Y2sKIgFWc

***

Chapters:

  • 2:40 What is synthetic data?
  • 6:45 Large language models
  • 11:30 Preventing data leakage
  • 18:00 Generative versus downstream models
  • 24:10 De-biasing and fairness
  • 30:45 Using synthetic data
  • 35:00 People consuming the data
  • 41:00 Spotting correlations in the data
  • 47:45 Generalization of different ML algorithms
  • 51:15 Wrap-up
Mark as Played

Advertise With Us

Popular Podcasts

Stuff You Should Know
Dateline NBC

Dateline NBC

Current and classic episodes, featuring compelling true-crime mysteries, powerful documentaries and in-depth investigations. Follow now to get the latest episodes of Dateline NBC completely free, or subscribe to Dateline Premium for ad-free listening and exclusive bonus content: DatelinePremium.com

Betrayal: Weekly

Betrayal: Weekly

Betrayal Weekly is back for a brand new season. Every Thursday, Betrayal Weekly shares first-hand accounts of broken trust, shocking deceptions, and the trail of destruction they leave behind. Hosted by Andrea Gunning, this weekly ongoing series digs into real-life stories of betrayal and the aftermath. From stories of double lives to dark discoveries, these are cautionary tales and accounts of resilience against all odds. From the producers of the critically acclaimed Betrayal series, Betrayal Weekly drops new episodes every Thursday. Please join our Substack for additional exclusive content, curated book recommendations and community discussions. Sign up FREE by clicking this link Beyond Betrayal Substack. Join our community dedicated to truth, resilience and healing. Your voice matters! Be a part of our Betrayal journey on Substack. And make sure to check out Seasons 1-4 of Betrayal, along with Betrayal Weekly Season 1.

Music, radio and podcasts, all free. Listen online or download the iHeart App.

Connect

© 2025 iHeartMedia, Inc.