
August 16, 2025 108 mins
Today, Lukas Petersson and Axel Backlund of Andon Labs join The Cognitive Revolution to discuss their experiments deploying autonomous AI agents to run real-world vending machines, exploring the safety challenges and unexpected behaviors that emerge when frontier models like Claude and Grok operate without human oversight.

Read transcript of the episode here.

Check out our sponsors: Oracle Cloud Infrastructure, Shopify.

Shownotes below brought to you by Notion AI Meeting Notes - try one month for free at https://notion.com/lp/nathan

Autonomous Organization Philosophy: Andon Labs believes that AI models will improve to the point where human oversight becomes impractical due to efficiency constraints, leading them to pursue fully autonomous systems rather than gradual automation.
Vending Bench as a Testing Ground: They created "Vending Bench" as a benchmark for testing long-term coherence of autonomous agents, using vending machines as a practical business case for experimentation.
Domain-Specific vs General AI: There's a notable difference between optimizing AI for narrow domains (like vending machines) versus general-purpose AI, with domain-specific applications potentially being more manageable regarding reward hacking.
Frontier Model Race: Major companies like OpenAI and Google are advancing rapidly in general reasoning capabilities (e.g., IMO Gold achievements) independent of narrow application research.
Insurance and Liability: The insurance industry may play a significant role in AI adoption, with premiums potentially being much higher for general models that could be misused versus narrow-domain models with limited capabilities.
For-profit AI Safety: The case for for-profit companies in AI safety has been historically neglected but is becoming clearer, with accelerators like Seldon Labs supporting this approach.
Sponsors:

Oracle Cloud Infrastructure: Oracle Cloud Infrastructure (OCI) is the next-generation cloud that delivers better performance, faster speeds, and significantly lower costs, including up to 50% less for compute, 70% for storage, and 80% for networking. Run any workload, from infrastructure to AI, in a high-availability environment and try OCI for free with zero commitment at https://oracle.com/cognitive

Shopify: Shopify powers millions of businesses worldwide, handling 10% of U.S. e-commerce. With hundreds of templates, AI tools for product descriptions, and seamless marketing campaign creation, it's like having a design studio and marketing team in one. Start your $1/month trial today at https://shopify.com/cognitive

PRODUCED BY: https://aipodcast.ing

CHAPTERS:
(00:00) About the Episode
(04:49) Company Vision Overview
(12:24) Vending Benchmark Design (Part 1)
(20:12) Sponsor: Oracle Cloud Infrastructure
(21:21) Vending Benchmark Design (Part 2)
(24:41) Model Performance Results (Part 1)
(35:03) Sponsor: Shopify
(37:00) Model Performance Results (Part 2)
(43:06) Real World Deployment
(59:41) Wild Stories Incidents
(01:19:59) Business Safety Strategy
(01:38:20) Future Directions Discussion
(01:47:09) Outro