Most AI safety conversations centre on alignment: ensuring AI systems share our values and goals. But despite progress, we’re unlikely to know we’ve solved the problem before the arrival of human-level and superhuman systems in as little as three years.
So some are developing a backup plan to safely deploy models we fear are actively scheming to harm us — so-called “AI control.” While this may sound mad, given the reluctance of AI companies to delay deploying anything they train, not developing such techniques is probably even crazier.
Today’s guest — Buck Shlegeris, CEO of Redwood Research — has spent the last few years developing control mechanisms, and for human-level systems they’re more plausible than you might think. He argues that given companies’ unwillingness to incur large costs for security, accepting the possibility of misalignment and designing robust safeguards might be one of our best remaining options.
Links to learn more, highlights, video, and full transcript.
As Buck puts it: "Five years ago I thought of misalignment risk from AIs as a really hard problem that you’d need some really galaxy-brained fundamental insights to resolve. Whereas now, to me the situation feels a lot more like we just really know a list of 40 things where, if you did them — none of which seem that hard — you’d probably be able to not have very much of your problem."
Of course, even if Buck is right, we still need to do those 40 things — which he points out we’re not on track for. And AI control agendas have their limitations: they aren’t likely to work once AI systems are much more capable than humans, since greatly superhuman AIs can probably work around whatever limitations we impose.
Still, AI control agendas seem to be gaining traction within AI safety. Buck and host Rob Wiblin discuss all of the above, plus:
Chapters:
This episode was originally recorded on February 21, 2025.
Video: Simon Monsour and Luke Monsour
Audio engineering: Ben Cordell, Milo McGuire, and Dominic Armstrong
Transcriptions and web: Katy Moore
CrimeLess: Hillbilly Heist
It’s 1996 in rural North Carolina, and an oddball crew makes history when they pull off America’s third largest cash heist. But it’s all downhill from there. Join host Johnny Knoxville as he unspools a wild and woolly tale about a group of regular ‘ol folks who risked it all for a chance at a better life. CrimeLess: Hillbilly Heist answers the question: what would you do with 17.3 million dollars? The answer includes diamond rings, mansions, velvet Elvis paintings, plus a run for the border, murder-for-hire-plots, and FBI busts.
Crime Junkie
Does hearing about a true crime case always leave you scouring the internet for the truth behind the story? Dive into your next mystery with Crime Junkie. Every Monday, join your host Ashley Flowers as she unravels all the details of infamous and underreported true crime cases with her best friend Brit Prawat. From cold cases to missing persons and heroes in our community who seek justice, Crime Junkie is your destination for theories and stories you won’t hear anywhere else. Whether you're a seasoned true crime enthusiast or new to the genre, you'll find yourself on the edge of your seat awaiting a new episode every Monday. If you can never get enough true crime... Congratulations, you’ve found your people. Follow to join a community of Crime Junkies! Crime Junkie is presented by audiochuck Media Company.
Stuff You Should Know
If you've ever wanted to know about champagne, satanism, the Stonewall Uprising, chaos theory, LSD, El Nino, true crime and Rosa Parks, then look no further. Josh and Chuck have you covered.