Data Engineering Podcast

Data Engineering Podcast

This show goes behind the scenes for the tools, techniques, and difficulties associated with the discipline of data engineering. Databases, workflows, automation, and data manipulation are just some of the topics that you will find here.

Episodes

May 13, 2025 46 mins
Summary
In this episode of the Data Engineering Podcast Tulika Bhatt, a senior software engineer at Netflix, talks about her experiences with large-scale data processing and the future of data engineering technologies. Tulika shares her journey into the data engineering field, discussing her work at BlackRock and Verizon before joining Netflix, and explains the challenges and innovations involved in managing Netflix's impression dat...
  • Mark as Played
    Summary
    In this episode of the Data Engineering Podcast Sida Shen, product manager at CelerData, talks about StarRocks, a high-performance analytical database. Sida discusses the inception of StarRocks, which was forked from Apache Doris in 2020 and evolved into a high-performance Lakehouse query engine. He explains the architectural design of StarRocks, highlighting its capabilities in handling high concurrency and low latency quer...
  • Summary
    In this episode of the Data Engineering Podcast Derek Collison, creator of NATS and CEO of Synadia, talks about the evolution and capabilities of NATS as a multi-paradigm connectivity layer for distributed applications. Derek discusses the challenges and solutions in building distributed systems, and highlights the unique features of NATS that differentiate it from other messaging systems. He delves into the architectural de...
  • Mark as Played
    Summary
    In this episode of the Data Engineering Podcast Viktor Kessler, co-founder of Vakmo, talks about the architectural patterns in the lake house enabled by a fast and feature-rich Iceberg catalog. Viktor shares his journey from data warehouses to developing the open-source project, Lakekeeper, an Apache Iceberg REST catalog written in Rust that facilitates building lake houses with essential components like storage, compute, an...
  • Mark as Played
    Summary
    In this episode of the Data Engineering Podcast Jeremy Edberg, CEO of DBOS, about durable execution and its impact on designing and implementing business logic for data systems. Jeremy explains how DBOS's serverless platform and orchestrator provide local resilience and reduce operational overhead, ensuring exactly-once execution in distributed systems through the use of the Transact library. He discusses the importance of v...
  • Mark as Played
    Summary
    In this episode of the Data Engineering Podcast Roman Gershman, CTO and founder of Dragonfly DB, explores the development and impact of high-speed in-memory databases. Roman shares his experience creating a more efficient alternative to Redis, focusing on performance gains, scalability, and cost efficiency, while addressing limitations such as high throughput and low latency scenarios. He explains how Dragonfly DB solves ope...
    Mark as Played
    Summary
    In this episode of the Data Engineering Podcast Sean Knapp, CEO of Ascend.io, explores the intersection of AI and data engineering. He discusses the evolution of data engineering and the role of AI in automating processes, alleviating burdens on data engineers, and enabling them to focus on complex tasks and innovation. The conversation covers the challenges and opportunities presented by AI, including the need for intellige...
  • Mark as Played
    Summary
    In this episode of the Data Engineering Podcast Pete DeJoy, co-founder and product lead at Astronomer, talks about building and managing Airflow pipelines on Astronomer and the upcoming improvements in Airflow 3. Pete shares his journey into data engineering, discusses Astronomer's contributions to the Airflow project, and highlights the critical role of Airflow in powering operational data products. He covers the evolution ...
  • Summary
    In this episode of the Data Engineering Podcast Rajan Goyal, CEO and co-founder of Datapelago, talks about improving efficiencies in data processing by reimagining system architecture. Rajan explains the shift from hyperconverged to disaggregated and composable infrastructure, highlighting the importance of accelerated computing in modern data centers. He discusses the evolution from proprietary to open, composable stacks, e...
  • Mark as Played
    Summary
    In this episode of the Data Engineering Podcast Gleb Mezhanskiy, CEO and co-founder of DataFold, talks about the intersection of AI and data engineering. He discusses the challenges and opportunities of integrating AI into data engineering, particularly using large language models (LLMs) to enhance productivity and reduce manual toil. The conversation covers the potential of AI to transform data engineering tasks, such as te...
    Mark as Played
    Summary
    In this episode of the Data Engineering Podcast Bartosz Mikulski talks about preparing data for AI applications. Bartosz shares his journey from data engineering to MLOps and emphasizes the importance of data testing over software development in AI contexts. He discusses the types of data assets required for AI applications, including extensive test datasets, especially in generative AI, and explains the differences in data ...
  • Mark as Played
    Summary
    In this episode of the Data Engineering Podcast Andrew Luo, CEO of OneSchema, talks about handling CSV data in business operations. Andrew shares his background in data engineering and CRM migration, which led to the creation of OneSchema, a platform designed to automate CSV imports and improve data validation processes. He discusses the challenges of working with CSVs, including inconsistent type representation, lack of sch...
  • Mark as Played
    Summary
    In this episode of the Data Engineering Podcast Dan Bruckner, co-founder and CTO of Tamr, talks about the application of machine learning (ML) and artificial intelligence (AI) in master data management (MDM). Dan shares his journey from working at CERN to becoming a data expert and discusses the challenges of reconciling large-scale organizational data. He explains how data silos arise from independent teams and highlights t...
  • Mark as Played
    Summary
    In this episode of the Data Engineering Podcast Lior Barak shares his insights on developing a three-year strategic vision for data management. He discusses the importance of having a strategic plan for data, highlighting the need for data teams to focus on impact rather than just enablement. He introduces the concept of a "data vision board" and explains how it can help organizations outline their strategic vision by consid...
  • Mark as Played
    Summary
    The core task of data engineering is managing the flows of data through an organization. In order to ensure those flows are executing on schedule and without error is the role of the data orchestrator. Which orchestration engine you choose impacts the ways that you architect the rest of your data platform. In this episode Hugo Lu shares his thoughts as the founder of an orchestration company on how to think about data orches...
  • Mark as Played
    Summary
    In this episode of the Data Engineering Podcast the inimitable Max Beauchemin talks about reusability in data pipelines. The conversation explores the "write everything twice" problem, where similar pipelines are built without code reuse, and discusses the challenges of managing different SQL dialects and relational databases. Max also touches on the evolving role of data engineers, drawing parallels with front-end engineeri...
  • Mark as Played
    Summary
    In this episode of the Data Engineering Podcast Sam Kleinman talks about the pivotal role of databases in software engineering. Sam shares his journey into the world of data and discusses the complexities of database selection, highlighting the trade-offs between different database architectures and how these choices affect system design, query performance, and the need for ETL processes. He emphasizes the importance of unde...
  • Mark as Played
    Summary
    In this episode of the Data Engineering Podcast, Anna Geller talks about the integration of code and UI-driven interfaces for data orchestration. Anna defines data orchestration as automating the coordination of workflow nodes that interact with data across various business functions, discussing how it goes beyond ETL and analytics to enable real-time data processing across different internal systems. She explores the challe...
  • Mark as Played

    In this episode, I had the pleasure of speaking with Ken Pickering, VP of Engineering at Going, about the intricacies of streaming data into a Trino and Iceberg lakehouse. Ken shared his journey from product engineering to becoming deeply involved in data-centric roles, highlighting his experiences in ecommerce and InsurTech. At Going, Ken leads the data platform team, focusing on finding travel deals for consumers, a task that inv...

    Mark as Played
    Summary
    The challenges of integrating all of the tools in the modern data stack has led to a new generation of tools that focus on a fully integrated workflow. At the same time, there have been many approaches to how much of the workflow is driven by code vs. not. Burak Karakan is of the opinion that a fully integrated workflow that is driven entirely by code offers a beneficial and productive means of generating useful analytical o...
  • Mark as Played

    Popular Podcasts

      In 1997, actress Kristin Davis’ life was forever changed when she took on the role of Charlotte York in Sex and the City. As we watched Carrie, Samantha, Miranda and Charlotte navigate relationships in NYC, the show helped push once unacceptable conversation topics out of the shadows and altered the narrative around women and sex. We all saw ourselves in them as they searched for fulfillment in life, sex and friendships. Now, Kristin Davis wants to connect with you, the fans, and share untold stories and all the behind the scenes. Together, with Kristin and special guests, what will begin with Sex and the City will evolve into talks about themes that are still so relevant today. "Are you a Charlotte?" is much more than just rewatching this beloved show, it brings the past and the present together as we talk with heart, humor and of course some optimism.

      Dateline NBC

      Current and classic episodes, featuring compelling true-crime mysteries, powerful documentaries and in-depth investigations. Follow now to get the latest episodes of Dateline NBC completely free, or subscribe to Dateline Premium for ad-free listening and exclusive bonus content: DatelinePremium.com

      Stuff You Should Know

      If you've ever wanted to know about champagne, satanism, the Stonewall Uprising, chaos theory, LSD, El Nino, true crime and Rosa Parks, then look no further. Josh and Chuck have you covered.

      Good Hang with Amy Poehler

      Come hang with Amy Poehler. Each week on her podcast, she'll welcome celebrities and fun people to her studio. They'll share stories about their careers, mutual friends, shared enthusiasms, and most importantly, what's been making them laugh. This podcast is not about trying to make you better or giving advice. Amy just wants to have a good time.

      The Clay Travis and Buck Sexton Show

      The Clay Travis and Buck Sexton Show. Clay Travis and Buck Sexton tackle the biggest stories in news, politics and current events with intelligence and humor. From the border crisis, to the madness of cancel culture and far-left missteps, Clay and Buck guide listeners through the latest headlines and hot topics with fun and entertaining conversations and opinions.

    Advertise With Us
    Music, radio and podcasts, all free. Listen online or download the iHeart App.

    Connect

    © 2025 iHeartMedia, Inc.