Google SRE Prodcast

Prodcast is Google's podcast about Site Reliability Engineering and production software.

Episodes

June 16, 2026 7 mins

Courtney Nash of The VOID discusses the role of human expertise in managing complex systems, and how SREs continue to bring critical value even as technology and AI evolve.

John Allspaw discusses reliability with Prodcast host Steve McGhee at SREcon Americas 2026.

June 9, 2026 9 mins

John Allspaw joins Prodcast hosts Matt Siegler and Florian Rathgeber for a candid discussion of reliability topics at SREcon Americas 2026.

June 2, 2026 7 mins

We speak with Ricard Bejarano about being an SRE at home and running home lab systems.

May 26, 2026 23 mins

We sit down with Matt Zelesko, VP of SRE at Google, for a candid talk about how AI is changing SRE — and how it's not. 

May 21, 2026 10 mins

Sam Anderson shares his experiences with burnout and how to support yourself as a reliable system. He offers guidance on dealing with burnout and suggestions for avoiding it by understanding yourself and finding the help and support you need.

Crisis Engineer Mikey Dickerson joins us to talk about what constitutes a crisis. Mikey draws on his broad experience across industry and the public sector, as well as on work with his team of systems fixers.

What's happening in the world of SRE and resilience engineering? Join us as we catch up with fellow podcast hosts Colette Alexander and Clint Byrum of the This Is Fine! podcast at SREcon in Seattle. 

How do you introduce Site Reliability Engineering to an AI research lab, bringing concepts of scale to engineers who are at the leading edge of AI systems?

In the latest episode of The Prodcast, hosts Steve McGhee and Florian Rathgeber chat with Damion Yates, who helped establish the reliability engineering culture at Google DeepMind. Damion shares his journey of bringing scalable infrastructure to DeepMind, supporting massive mach...

Join us for a discussion with Carla Geisser of Layer Aleph, a company focused on "crisis engineering". Carla distinguishes a crisis from a standard incident by noting that a crisis is novel and lacks a playbook. She outlines five criteria for a true crisis: fundamental surprise, broken critical functions, high visibility, a rigid deadline (unlike internal tech deadlines), and perception breakdown. Crises often arise in organization...

This episode of the Prodcast tackles the challenges of maintaining AI safety and alignment in production. Guests Felipe Tiengo Ferreira and Parker Barnes join hosts Matt Siegler and Steve McGhee to discuss AI model safety, from examining content to emerging security risks. The discussion emphasizes the vital role of SREs in managing safety at scale, detailing multi-layered defenses, including system instructions, LLM classifiers, a...

In this episode of the Prodcast, guest Shannon Brady speaks with hosts Jordan Greenberg and Florian Rathgeber about managing Google's vast fleet of internal devices. Shannon explains how Google's Linux platform uses core SRE principles (specifically testing, canarying, and monitoring) for weekly staged rollouts of its Debian-based distribution. Configuration is efficiently managed using Puppet to ensure the right setup for...

January 21, 2026 29 mins

Curious about the real impact of AI on Site Reliability Engineering? In this episode of The Prodcast, Google SRE Denia del Cid breaks down how her team is leveraging AI to transform production workflows.

Denia details practical applications like early outage detection, incident similarity analysis, and toil reduction. She explains the critical importance of validating against "golden data sets" and keeping humans in the loop to bui...

Join us on The Prodcast as we host Heather Adkins, leader of Google's Office of Cybersecurity Resilience, for a critical look at the future of digital defenses. We explore the intersection of SRE and security, unpacking the "Secure by Design" philosophy and the shared DNA of incident management.

Heather candidly discusses the rise of "Agentic AI hackers" and polymorphic malware, revealing how defenders c...

January 7, 2026 38 mins

In this episode, we welcome Alex Hidalgo and Brian Singer of Nobl9 to discuss Service Level Objectives (SLOs). Alex and Brian talk about how SLOs can establish a vernacular across industry verticals, leading to constructive conversations and a shared understanding of how to implement SRE practices. Join us for a lively discussion that ranges across SLO topics!

December 16, 2025 33 mins

In this episode, Steph Hippo, Platform Engineering Director at Honeycomb, joins The Prodcast to discuss AI and SRE. 

Steph explains how observability helps us understand complex systems from their outputs, and provides a foundation for SRE to respond to system problems. This episode explains how AI and observability build a self-reinforcing loop. We also discuss how AI can detect and respond to certain classes of inc...

In this special episode, hosts Steve McGhee from the Google SRE Prodcast and Kaslin Fields from the Google Kubernetes Podcast welcome Google Cloud Solutions Architect Ben Good to discuss platform engineering. Listeners can look forward to hearing about the role of Kubernetes as a tool for building platforms, how to create "golden paths" for developers, and the importance of observability and self-service in platform design. The con...

Google Staff SRE Ramón Llamas and Google Software Engineer Swapnil Haria join our hosts to explore how AI agents are revolutionizing production management, from summarizing alerts and finding hidden errors to proactively preventing outages. Learn about the challenges of evaluating non-deterministic systems and the fascinating interplay between human expertise and emerging AI capabilities in ensuring robust and reliable infrastructur...

This episode features Google Technical Program Manager (TPM) Karanveer Anand, who joins our hosts to discuss the unique role of TPMs in Site Reliability Engineering (SRE). The conversation highlights how SRE TPMs bridge the gap between technical details and business impact, managing complex projects with inter-team dependencies and ensuring system reliability, particularly in the rapidly evolving AI landscape.

This episode discusses Systems Theoretic Process Analysis (STPA), a method for analyzing complex systems. Theo Klein, a Google SRE, and Jeffrey Snover, a Distinguished Engineer at Google, explain that STPA focuses on identifying how system accidents and losses occur due to a loss of control, rather than component failures. STPA helps identify design flaws early, even before code is written! The discussion highlights that STPA is a ...
