October 30, 2025 16 mins
This is the latest in a series of essays on AI Scaling.
You can find the others on my site.

Summary: RL training for LLMs scales surprisingly poorly. Most of its gains come from allowing LLMs to productively use longer chains of thought, so they can think longer about a problem. There is some improvement for a fixed length of answer, but not enough to drive AI progress. Given that the scaling up of pre-training compute has also stalled, we'll see less AI progress via compute scaling than you might have thought, and more of it will come from inference scaling (which has different effects on the world). That lengthens timelines and affects strategies for AI governance and safety.



The current era of improving AI capabilities using reinforcement learning (from verifiable rewards) involves two key types of scaling:

  1. Scaling the amount of compute used for RL during training
  2. Scaling [...]


---

Outline:

(09:46) How do these compare to pre-training scaling?

(14:16) Conclusion

---

First published:
October 22nd, 2025

Source:
https://www.lesswrong.com/posts/xpj6KhDM9bJybdnEe/how-well-does-rl-scale

---



Narrated by TYPE III AUDIO.

---

Images from the article:

Bar graph titled
Graph comparing GPT-5 and OpenAI o3 accuracy on PhD science questions.
Graph comparing GPT-5 and OpenAI o3 software engineering performance across token lengths.
Arc AGI-1 leaderboard showing AI model performance versus cost per task.