Google SRE Prodcast

Google SRE Prodcast

Prodcast is Google's podcast about Site Reliability Engineering and production software.

Episodes

June 25, 2025 41 mins

In this episode, hosts Steve McGhee and Matt Siegler are joined by guest, Adam Fletcher, CEO and Co-Founder of MarketStreet. They discuss the current state of web development with LLMs, managing technical debt in startups, the evolution of infrastructure and reliability engineering, the role of community in technology, and the future of software engineering with AI.

Mark as Played

In this episode, Sal Furino, Customer Reliability Engineer at Bloomberg, discusses all things Service Level Objectives (SLOs) with hosts Steve McGhee and Matt Siegler. Together, they dig into what successful SLOs look like, how it relates to users, and how SLOs provide an effective framework for joint decisions about system reliability across product, engineering, and leadership teams.

Mark as Played

Matt Zelesko, the head of Site Reliability Engineering at Google, discusses the evolution of SRE, highlighting the shift from traditional operations to a model that balances velocity and reliability to better serve the rapid advancements in AI and ML. He emphasizes that SRE's core mission is to enable partners to move quickly while meeting reliability goals, and that the sheer scale of Google's infrastructure necessitates the SRE m...

Mark as Played
June 4, 2025 43 mins

In this Google Prodcast episode, Todd Underwood, a reliability expert from Anthropic with experience at Google and OpenAI, discusses the current state and future of AI in SRE. Todd and the hosts focus on the current state and future of AI and ML in production, particularly for SREs. Topics discussed include the challenges of AI-Ops, limitations of current anomaly detection, the potential for AI in config authoring and troubleshooti...

Mark as Played

This episode features guest, Peter Pellerzi (Distinguished Engineer, Google). Peter and the hosts, Matt Siegler and Steve McGhee, focus on the physical infrastructure side of SRE, discussing topics such as the scale of Google's data centers, handling incidents like power outages, testing and preparedness strategies, the use of AI for optimizing cooling plants, and more. Peter also emphasizes the importance of community support, pro...

Mark as Played

Jessica Theodat (Senior SRE & Security Tech Lead, Google) joins hosts Jordan Greenberg and Steve McGhee to discuss the intersection of security and site reliability engineering at Google. Jessica touches on risk management, the unique nature of security incident responses, and the shared goals between security and SRE. The crew also delves into the balance between security and SRE, acknowledging the tension and the need for collabo...

Mark as Played
April 16, 2025 15 mins

In this "bumpisode", hosts and producers of Prodcast (including our new co-host, Matt Siegler!) reflect on the previous season and introduce the new season's focus on upcoming trends in Site Reliability Engineering (SRE) and AI, and the friends we make along the way. They also introduce new elements we are bringing in with Season 4, such as a video format and a feedback form.

Mark as Played

This episode features Javi Beltran, a Google engineering lead who created the "Telebot" theme song. With our beloved hosts, Steve McGhee and Jordan Greenberg, Beltran discusses the origins of the song, created in 2012 for Google's paging system. The song was meant to add a touch of levity to what could be a stressful situation for engineers on-call. Beltran also unveils a new, more modern remix of “Telebot” (created in collaboratio...

Mark as Played

In this episode of the Prodcast, guests Dominic Hutton (Staff SRE, HashiCorp) and Niccolo' Cascarano (Senior Staff SRE at Google) join hosts Steve McGhee and Jordan Greenberg to dive into configurations. They discuss the differences between imperative and declarative configuration, explore the benefits and challenges of each approach, and the need for careful consideration when choosing between the two. Ultimately, the goal is to a...

Mark as Played

This episode features Casey Rosenthal (Founder, Cirrusly.ai) and John Allspaw (Founder and Principal, Adaptive Capacity Labs), joining our hosts Steve McGhee and Jordan Greenberg. Together they discuss how resilience appears in Software Engineering and SRE and explore the importance of understanding the human factors involved in adapting to system failures—highlighting the need for a more qualitative and holistic approach to unders...

Mark as Played

In this episode of the Prodcast, we are joined by guests Christina Schulman (Staff SRE, Google) and Dr. Laura Maguire (Principal Engineer, Trace Cognitive Engineering). They emphasize the human element of SRE and the importance of fostering a culture of collaboration, learning, and resilience in managing complex systems. They touch upon topics such as the need for diverse perspectives and collaboration in incident response, the nec...

Mark as Played

In this episode, Cody Smith (CTO and Co-founder, Camus Energy) & Trisha Weir (SRE Department Lead, Google) join hosts Steve McGhee and Jordan Greenberg, to discuss their experience developing Maglev, a highly available and distributed network load balancer (NLB) that is an integral part of the cloud architecture that manages traffic that comes in to a datacenter. Starting with Maglev’s humble beginnings as a skunkworks effort, Cody...

Mark as Played

In this episode, guests Narayan Desai (Principal SRE, Google) and Pat Somaru (Senior Production Engineer, Meta) join hosts Steve McGhee and Florian Rathgeber to discuss the challenges of observability and working with profiling data. The discussion covers intriguing topics like noise reduction, workload modeling, and the need for better tools and techniques to handle high-cardinality data.

Mark as Played

This episode features Google engineers Wilmer van der Gaast (Production on-tall) and Andy Sykes (Senior Staff Systems Engineer, SRE), joining hosts Steve McGhee and Jordan Greenberg, to discuss the development and maintenance of Google Public DNS (8.8.8.8). They highlight the initial motivations for creating the service, technical challenges like cache poisoning and load balancing, as well as the collaborative effort between SRE an...

Mark as Played

Guests Jordan Chernev (Senior Technology Executive) and Scott Bowers (SRE, Gearbox Software) who hail from the retail and gaming industries, respectively, join hosts Steve McGhee and Jordan Greenberg  to discuss the unique challenges of Site Reliability Engineering in their industries. They sha...

Mark as Played

Sarah Butt (Principal Engineer, Centralized Incident Response, Salesforce) and Vrai Stacey (Staff Software Engineer, Google) join hosts Steve McGhee and Jordan Greenberg to dive into incident response—particularly tooling and software for reliability incidents. Tune in for an in-depth discussion on topics such as the importance of communication and collaboration during incidents, and the role of tooling in supporting incident respo...

Mark as Played

Silvia Botros (SRE Architect, Twilio | Author of "High Performance MySQL, 4th edition”) and Niall Murphy (Co-founder & CEO, Stanza) join hosts Steve McGhee and Jordan Greenberg, to discuss cultural shifts in database engineering, rate limiting, load shedding, holistic approaches to reliability, proactive measures to build customer trust, and much more!

Mark as Played

Liz Fong-Jones (former Google SRE and current Field CTO at honeycomb.io) joins hosts Steve McGhee and Jordan Greenberg for a lively discussion centered around observability, its evolution from monitoring, and its role in modern software development. Tune in for more on the importance of observability as a spectrum, the evolving role of SREs, and advice to aspiring software engineers.

Mark as Played

Ben Treynor Sloss (VP of Engineering, Google) joins hosts Steve McGhee and Dr. Jennifer Petoff (Director of Technical Infrastructure Education, Google) to share the evolution of SRE and its impact on software development, how AI and ML significantly impacts SRE practices, and the future of SRE.

Ben coined the term "Site Reliability Engineering" for his team of (no...

Mark as Played

In this episode, Healfdene Goguen (Principal Engineer, Google) joins hosts Steve McGhee and Jordan Greenberg to discuss the vast amount of work to be done by SREs, and the fascinating challenges to tackle with clear real-world implications. It's a truly exciting time to be an SRE at Google!

Mark as Played

Popular Podcasts

    Welcome to Bookmarked by Reese’s Book Club — the podcast where great stories, bold women, and irresistible conversations collide! Hosted by award-winning journalist Danielle Robay, each week new episodes balance thoughtful literary insight with the fervor of buzzy book trends, pop culture and more. Bookmarked brings together celebrities, tastemakers, influencers and authors from Reese's Book Club and beyond to share stories that transcend the page. Pull up a chair. You’re not just listening — you’re part of the conversation.

    On Purpose with Jay Shetty

    I’m Jay Shetty host of On Purpose the worlds #1 Mental Health podcast and I’m so grateful you found us. I started this podcast 5 years ago to invite you into conversations and workshops that are designed to help make you happier, healthier and more healed. I believe that when you (yes you) feel seen, heard and understood you’re able to deal with relationship struggles, work challenges and life’s ups and downs with more ease and grace. I interview experts, celebrities, thought leaders and athletes so that we can grow our mindset, build better habits and uncover a side of them we’ve never seen before. New episodes every Monday and Friday. Your support means the world to me and I don’t take it for granted — click the follow button and leave a review to help us spread the love with On Purpose. I can’t wait for you to listen to your first or 500th episode!

    Dateline NBC

    Current and classic episodes, featuring compelling true-crime mysteries, powerful documentaries and in-depth investigations. Follow now to get the latest episodes of Dateline NBC completely free, or subscribe to Dateline Premium for ad-free listening and exclusive bonus content: DatelinePremium.com

    24/7 News: The Latest

    The latest news in 4 minutes updated every hour, every day.

    The Bobby Bones Show

    Listen to 'The Bobby Bones Show' by downloading the daily full replay.

Advertise With Us
Music, radio and podcasts, all free. Listen online or download the iHeart App.

Connect

© 2025 iHeartMedia, Inc.