All Episodes

August 6, 2025 27 mins

In this episode, we discuss OpenAI's shift to open source with the release of a 120 billion and a 20 billion parameter model for local use. Ejaaz and Josh highlight the democratization of AI access, enhanced privacy, and customization opportunities.

We analyze the competitive landscape against major Chinese models and hint at the anticipated GPT-5 release. Tune in for insights into this transformative moment in AI!

------
🌌 LIMITLESS HQ: LISTEN & FOLLOW HERE ⬇️
https://limitless.bankless.com/
https://x.com/LimitlessFT

------
TIMESTAMPS

0:00 OpenAI's Surprising Release
1:30 The Power of Open Source Models
3:28 Local Computing Revolution
5:33 Privacy and Personalization
6:54 The Impact on Industries
9:32 Testing the New Models
17:43 Competing with Chinese Models
24:06 The Future of AI Technology
26:29 Anticipating GPT-5

------
RESOURCES

Josh: https://x.com/Josh_Kale

Ejaaz: https://x.com/cryptopunk7213

------
Not financial or tax advice. See our investment disclosures here:
https://www.bankless.com/disclosures


Episode Transcript

Available transcripts are automatically generated. Complete accuracy is not guaranteed.
Josh (00:03): The unthinkable has just happened: OpenAI has released an open-source model. OpenAI has been "closed AI" for as long as I've known them. They named themselves OpenAI, but they were not open source. Now they have finally released an open-source model, and surprise, surprise, it's actually really great. I think the downstream implications of an open-source model this good, from a company like this, are a really big deal. I think this really matters. Just yesterday they announced the release of GPT-OSS. There are two models: a 120 billion parameter model and a 20 billion parameter model. We're going to get into benchmarks and how good they are. But the idea is that OpenAI has actually released an open-source model, and it can stand up against the Chinese models, because we recently had DeepSeek and we've had Kimi, and those were very good. This is the first really solid American open-source model. So Ejaaz, I know you've been digging in the weeds on how this works. Can you explain exactly why this is a big deal, why this happened, and what's going on here?

Ejaaz (01:00): Yeah, it's pretty huge. Here are the hot highlights. As you mentioned, two models came out: the 20 billion parameter model, which is actually small enough to run on your mobile phone right now, and the 120 billion parameter model, which is big but still small enough to run on a high-performance laptop. So if you have a MacBook out there, jump in, go for it. It's fully customizable. Remember, open source means you literally have access to the design of the entire model. It's like OpenAI giving away the secret recipe to how their frontier models work, and you can kind of recreate it at home. That means you can customize it for any use case you want, give it access to all your personal hard drives, tools, and data, and it can do wonderful stuff. But Josh, here's the amazing part. On paper, these models are about as good as o4-mini, which is pretty impressive, right? In practice, and I've been playing around with it for the last few hours, they're as good, in my opinion, and actually quicker than o3, which is their frontier model. And I mean this across everything. On reasoning, it spits out answers super quickly, and I can see its reasoning; it happens in a couple of seconds, and I'm so used to waiting 30 seconds to a couple of minutes on o3, Josh. So it's pretty impressive and an insane unlock. On coding it's as good, and on creativity as well. My mind's pretty blown by all of this. Josh, what do you think?

Josh (02:32): Yeah, here's why it's impressive to me. A lot of the time I don't really care to use the outer bands of what a model is capable of. I'm not doing deep PhD-level research; I'm not solving math Olympiad questions. I'm just trying to ask a few normal questions and get some answers, and these models do an excellent job of serving that need. They're not going to go out and solve the world's hardest problems, but neither am I. I just want the information I want, whether it's a normal Google-type search or some miscellaneous question about work I'm doing. It's really good at answering that. So my initial impression, because they did let you test it publicly through their website, is that it's just really good at the things I want. And the fact that I can run one of these models on a local device, on my iPhone, means it feels like we're reaching a place where AI is starting to become really interesting, because for so long compute has been handled fully in the cloud, and this is the first time compute can really happen on your own computer or laptop. I could download the model and actually store the 120 billion parameter model on a 56 gigabyte USB drive. You can take the collective knowledge of the world and put it on a tiny little USB drive. Granted, you need a bigger machine to actually run those parameters, but you can install all the weights; it's 56 gigabytes. It's this incredibly powerful package, and, I don't know if this is true, but it's probably the most condensed knowledge base in the history of humanity. They've managed to take a tremendous number of tokens, smush them into this little parameter set, and publish it for people to use. So I'm really excited. I like having my own mini portable models. I'm excited to download it and run it on my MacBook. I'm not sure I could run the 120 billion parameter model, but at least the 20B, and give it a shot and see how it works.

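As a rough sanity check on that 56-gigabyte figure, here is some back-of-the-envelope arithmetic (mine, not from the episode; the assumption that the released weights are roughly 4-bit quantized is an inference, not a confirmed detail):

```python
# Rough on-disk size of model weights at different precisions.
# Assumption (mine, not from the episode): the open-weights release
# uses roughly 4-bit quantized weights, which is what would make the
# 120B model fit on a ~56 GiB drive.

def weights_size_gib(num_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights in GiB."""
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / 2**30

# 120B parameters at 4 bits per weight is about 56 GiB...
size_120b_4bit = weights_size_gib(120e9, 4)    # ~55.9 GiB
# ...while full 16-bit weights would need roughly 4x that.
size_120b_16bit = weights_size_gib(120e9, 16)  # ~223.5 GiB
# The 20B model at 4 bits fits comfortably under 10 GiB.
size_20b_4bit = weights_size_gib(20e9, 4)      # ~9.3 GiB

print(f"{size_120b_4bit:.1f} {size_120b_16bit:.1f} {size_20b_4bit:.1f}")
```

The arithmetic lines up with the "fits on a 56 GB drive" claim only at aggressive quantization, which is part of why these releases are notable.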
Ejaaz (04:14): You need to get the latest MacBook, Josh. I know, I've got to upgrade. We can test that out. What I also love about it is that it's fully private, right? You can give it access to your personal hard drive, your Apple Notes, whatever you store on your computer, and you can instruct the model to use those different tools. One review I keep seeing from people who have been testing it is that it's incredibly good and intuitive at tool use. The reason that's such a big deal is that a lot of the frontier models right now are kind of clunky when they let you give access to different tools; the model doesn't actually know when to use a specific tool and when not to. But these models are super intuitive, which is great. The privacy point is also big, because you kind of don't want to be giving all your personal information away to Sam Altman, but you do want a highly personalized model. If I had to condense this entire model release into a single sentence, Josh, I would say it is the epitome of privacy and personalization in an AI model so far. It is that good. It is swift, it is cheap, and I'm going to completely replace all my GPT-4o queries with it. As you said earlier, who needs the basic models anymore when you have access to this?

Josh (05:34): It's funny you say you're going to swap, because I don't think I will. I'm still not sure I personally have a use case right now, because I love the context. I want the memory. I like having it all server-side, where it kind of knows everything about me. If I wanted a more intimate model experience, syncing it up with journal entries or my camera roll or whatever interesting personal things, this would be a really cool use case. For the people curious why this matters to them, we can talk briefly about the second-order effects of having open-source models this powerful, because what they allow you to do is serve queries from a local machine. Say you're an app developer and your app is serving millions of requests because it's a GPT wrapper. Instead of paying for API calls to the OpenAI servers, you can now run your own local server with this model and serve all of that for the cost of the electricity. That's a really big unlock for the amount of compute that's going to be available, not only for developers but for the cost to users of a lot of these applications. For the applications that aren't doing crazy moon math and are just serving basic queries all day long, this significantly drops the cost. It increases privacy, like you mentioned. There are a ton of really important upsides to open-source models that we just haven't seen until now, and that I'm very excited to see come forward.

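To make the "cost of electricity" point concrete, here is a toy comparison. Every number below is a hypothetical placeholder for illustration; none of these prices are quoted in the episode:

```python
# Toy comparison of hosted-API billing vs. local serving cost for a
# high-volume app. All prices and power figures here are hypothetical
# assumptions; real API pricing and hardware power draw vary widely.

def monthly_api_cost(queries: int, tokens_per_query: int,
                     usd_per_million_tokens: float) -> float:
    """What a hosted API would bill for the month."""
    total_tokens = queries * tokens_per_query
    return total_tokens / 1_000_000 * usd_per_million_tokens

def monthly_electricity_cost(watts: float, hours: float,
                             usd_per_kwh: float) -> float:
    """What running a local inference box costs in power."""
    return watts / 1000 * hours * usd_per_kwh

# Assumed: 1M queries/month, 1k tokens each, $1 per million tokens.
api = monthly_api_cost(1_000_000, 1000, 1.0)          # $1000/month
# Assumed: a 300 W machine running 24/7 at $0.15/kWh.
local = monthly_electricity_cost(300, 24 * 30, 0.15)  # ~$32/month
print(round(api), round(local))
```

Even with generous assumptions, the per-query marginal cost of local serving is dominated by electricity, which is the unlock Josh describes.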
Ejaaz (06:58): Well, Josh, here's the thing with most of these open-source models. We actually spoke about two major Chinese open-source models that were released last week. They're not accessible to everyone. You and I aren't necessarily going to go to Hugging Face, a completely separate website, download these models, and run a command-line interface. Most of the listeners of this show don't even know what that means; I don't even know if I know what that means, right? But here you have a lovely website where you can just log on and play around with these open-source models, and that's exactly what I've been doing. I actually have a few demo queries that I ran yesterday, Josh.

Josh (07:34): Yeah, walk us through. Let's see.

Ejaaz (07:35): Okay, so there's an incredibly complex test that a lot of these AI models, which cost hundreds of billions of dollars to train, can't quite answer, and that is: how many R's, the letter R, are there in the word "strawberry"? Most say two.

Josh (07:52): The bar's on the floor.

Ejaaz (07:53): Huh? Yeah, if we go with most models, they say two. They're convinced there are only two. I ran that test yesterday with these open-source models, and it correctly answered three, Josh. So we're one for one right now.

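For the record, settling the strawberry test programmatically is a one-liner; models tend to get it wrong because they see subword tokens rather than individual letters:

```python
# Count letter occurrences the way the "strawberry" test asks.
# LLMs often miscount because they process subword tokens, not
# individual characters, so they never "see" the letters directly.
def count_letter(word: str, letter: str) -> int:
    return word.lower().count(letter.lower())

print(count_letter("strawberry", "r"))  # 3
```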
Josh (08:07): We're on our way.

Ejaaz (08:08): But then I was like, okay, we live in New York City, I love this place, I'm feeling a little poetic today. Can you write me a sonnet? My goal here wasn't to test whether it could just write a poem; it was to test how quickly it could figure it out. And as you can see, it thought for a couple of seconds; it literally spat this out in two seconds. It was structured really well, it kind of flowed. Would I be reciting it out loud in public? No, but I was pretty impressed. And then, Josh, I was thinking about what's so unique about open-source models. You just went through a really good list of why open-source models work, but I was curious why these specific open-source models are better than other open-source models, or maybe even than centralized models. So I wrote a query and decided to ask it: tell me some things you can do that the larger centralized models can't. It spat out a really good list. I won't go through all of it, but it covers some things we've highlighted so far: you can fine-tune it, there's privacy. And I really liked this point it made, Josh, which shows AI is probably getting smarter than us: you can inject your own custom data into these models. Without digging too deep, when you use a centralized model, it's already pre-trained on a bunch of data that companies like Anthropic and Google have fed it, so it's kind of formed its own personality, and you can't change that personality on a centralized model. But with an open model you have free rein to do whatever you want, so if you're feeling adventurous, you can use your own data and make it super personal and customizable. I thought that was a really cool and fun demo. Josh, have you been playing around with this?

Josh (09:46): Yeah, it's smart, it's fun. I wouldn't say it's anything novel; the query results I get are on par with everything else. I don't notice the difference, which is good, because it means they're performing very well. It's not like I feel I'm getting degraded performance because I'm using a smaller model. But it's nothing too different, I would say. The differences, again, all boil down to it being open source versus being run on a server.

Ejaaz (10:10): Well, let me challenge you on that. Okay, so you're saying it's good but nothing novel. Would you say it's as good as GPT-4o, minus the memory? Let's put memory aside for a second. Would you use it if it had memory capability?

Josh (10:26): Actually, no, probably not. I still wouldn't, because I love my desktop application too much, I love my mobile app too much, and I like that the conversations are shared in the cloud, so I can use them on my phone, or start on my laptop and go back and forth. So even in that case I'm probably still not a user, because of the convenience factor. But there are a lot of people and a lot of industries that would be, and this is actually worth surfacing: the new industries that are now able to benefit from this. A lot of industries have a tough time using these AI models because of data privacy concerns. Think about healthcare: people dealing with patient data find it very challenging to fork it over to OpenAI and just trust that they'll keep it safe. What this does is allow companies in healthcare, in finance dealing with high-touch personal finances, in legal dealing with a lot of legality, in government and defense, a lot of industries that previously could not use these popular AI models, to run a pretty good model locally on their own machines, with no possibility of leaking customer data, financials, healthcare data, or legal documents. That feels like a super powerful unlock. For them it seems like a no-brainer: get the 120B model running on a local machine inside your office and load it up with all this context. That seems to be who this would impact most, right?

Ejaaz (11:51): But still, to that point, I wonder how many of these companies can be bothered to do that themselves and run their own internal infrastructure. I'm thinking about OpenAI, who cracked, I think, $10 billion in annual recurring revenue this week, which is a major milestone, and a good chunk of that, I think 33%, is from enterprise customers. And to your point, these enterprise customers don't want to be giving OpenAI their entire data set, which could be used to train other AI models. So their fix right now is to use private cloud instances, which I think are supplied by Microsoft's Azure cloud service or something like that. And I wonder if they chose that, one, because there weren't any open-source models available, or because they just want to offload it to Microsoft to deal with. My gut tells me they'll go with the latter: hand it to a cloud provider to deal with, and trust Microsoft because it's a big brand name. But yeah, I don't really know how that will materialize. I still think, and maybe this is because of my experience in crypto, Josh, that the open-source models are still for the people at the fringe who are really experimenting with these things but maybe don't have billions of dollars.

Josh (13:07): Yeah, that could be right. It'll be interesting to see how it plays out across businesses of every scale, because I think of a lot of the indie devs I follow on Twitter, and I see them running local servers all the time. If they had this local model running on their own machine, and it takes the cost per query down from, say, a penny to zero, that's a big zero-to-one change.

Josh (13:28): So what makes this model special? There were also a number of breakthroughs that made this possible, that condensed this knowledge so tightly. Here's a tweet from the professor talking about the cool technical tweaks in this new model and what OpenAI was able to achieve. Some of these I believe are novel; some have been seen before. If you look at point two, mixture of experts, we're familiar with that; we've seen other companies like Kimi and DeepSeek use it. Basically, instead of one brain doing everything, the AI has a team of experts, kind of like mini-brains that specialize in different tasks. It picks the right expert for the job, and that makes it faster. So instead of having the entire 120 billion parameter model work on one question, maybe you just use the subset of those parameters that are really good at solving math problems. That's what brings the compute down.

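The routing idea Josh describes can be sketched in a few lines. This is a generic top-k mixture-of-experts router, a toy illustration of the technique, not OpenAI's actual implementation:

```python
import numpy as np

# Toy top-k mixture-of-experts routing: only k of the experts run
# per token, so the active compute is a fraction of the total
# parameter count. Generic sketch, not the GPT-OSS architecture.

rng = np.random.default_rng(0)
NUM_EXPERTS, TOP_K, DIM = 8, 2, 16

# Each "expert" here is just a small weight matrix.
experts = [rng.standard_normal((DIM, DIM)) for _ in range(NUM_EXPERTS)]
router_w = rng.standard_normal((DIM, NUM_EXPERTS))

def moe_forward(x: np.ndarray) -> np.ndarray:
    """Route token vector x to its top-k experts and mix their outputs."""
    logits = x @ router_w                 # score every expert
    top = np.argsort(logits)[-TOP_K:]     # pick the k highest-scoring
    weights = np.exp(logits[top])
    weights /= weights.sum()              # softmax over the chosen experts
    # Only the chosen experts do any work:
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

y = moe_forward(rng.standard_normal(DIM))
print(y.shape)  # (16,)
```

Here 2 of 8 experts fire per token, so per-token compute is roughly a quarter of running everything, which is the "mini-brains" speedup in miniature.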
Josh (14:15): The first point is this thing called sliding window attention. Imagine the AI is reading a really long book; it can only focus on a few pages at a time, and this trick lets it slide its focus window along the text. When you think of a context window, generally it's fixed: you can see a fixed span of data. Sliding window attention lets the model move that span of attention along as the text grows, so you effectively get more context out of what would normally be a narrow attention span, which is great for a smaller model. Again, remember that all of this is optimized for a microscopic scale that can literally run on your phone.

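The "few pages at a time" idea boils down to a masking pattern. Here is a toy sliding-window attention mask, a generic sketch of the technique rather than the actual GPT-OSS code:

```python
import numpy as np

# Toy sliding-window attention mask: each token may only attend to
# itself and the previous (window - 1) tokens instead of the whole
# sequence, so per-token attention cost stays constant as the text
# grows. Generic sketch of the technique, not the GPT-OSS code.

def sliding_window_mask(seq_len: int, window: int) -> np.ndarray:
    """True where query position i may attend to key position j."""
    i = np.arange(seq_len)[:, None]
    j = np.arange(seq_len)[None, :]
    causal = j <= i              # no looking at future tokens
    recent = j > i - window      # only the last `window` positions
    return causal & recent

mask = sliding_window_mask(seq_len=6, window=3)
print(mask.astype(int))
# Each row has at most 3 ones: the token itself plus 2 predecessors.
```

Stacking several such layers lets information still propagate across long distances, window by window, which is how a narrow mask yields long effective context.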
Josh: can literally run on your phone and then the third point is this (14:50):
undefined

Josh: thing called rope with yarn which sounds like a cat toy but this (14:52):
undefined

Josh: is how the ai keeps track of the order of words so like the position (14:56):
undefined

Josh: of the words in a sentence um so rope (14:59):
undefined

Josh: you could imagine it like like the twisty math way to do (15:02):
undefined

Josh: it and yarn makes it stretch further for really long stuff (15:05):
undefined

Josh: so we have the context window that is (15:07):
undefined

Josh: sliding we have this rope with yarn that allows you (15:10):
undefined

Josh: to just kind of like stretch the words a little bit further and (15:13):
undefined

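The "twisty math" is literal: rotary position embeddings (RoPE) encode a token's position by rotating pairs of vector components by position-dependent angles. A minimal sketch of the idea (illustrative only; real implementations vectorize this and apply it inside attention):

```python
import numpy as np

# Toy rotary position embedding (RoPE): encode position by rotating
# consecutive pairs of vector components by angles that depend on
# the position. Rotations preserve vector length, which is part of
# what makes the scheme well behaved. Sketch of the idea only.

def rope(x: np.ndarray, position: int, base: float = 10000.0) -> np.ndarray:
    """Rotate consecutive pairs of x by position-dependent angles."""
    out = x.astype(float).copy()
    for k in range(len(x) // 2):
        theta = position / base ** (2 * k / len(x))  # per-pair frequency
        c, s = np.cos(theta), np.sin(theta)
        a, b = x[2 * k], x[2 * k + 1]
        out[2 * k] = a * c - b * s      # standard 2D rotation
        out[2 * k + 1] = a * s + b * c
    return out

v = np.ones(8)
rotated = rope(v, position=5)
# The rotation changes direction but not length:
print(np.allclose(np.linalg.norm(rotated), np.linalg.norm(v)))  # True
```

YaRN-style tricks then rescale those frequencies so positions beyond the training length still map to sensible angles, which is the "stretch further" part.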
Josh (15:16): And then we have attention sinks, which is the last one. There's a problem when AI deals with these endless chats, and this lets it ignore the boring old info so it can pay attention to the new stuff. Basically, if you're having a long chat with it and it determines some of the content is boring and doesn't need remembering, it will throw that away and free up a bit of the context window. Again, it's hyper-optimizing for the small context window it has. Those are the four key breakthroughs that make this special. Again, I'm not sure any of them is particularly novel, but combined, that's what lets you get these o4-mini results, or even o3 results on the larger model, on something that can run locally on your laptop. So it's a pretty interesting set of breakthroughs. A lot of the time we talk about OpenAI because of their feature breakthroughs, not their technical breakthroughs; the technical breakthroughs are usually reserved for the Kimi models or the DeepSeek models, where they really break open the barrier of what's possible. But I don't want to discredit OpenAI, because these are pretty interesting things they've managed to combine into one cohesive, tiny little model, and then they just gave it away.

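The usual attention-sink trick in streaming inference keeps a handful of the earliest tokens plus a recent window and evicts everything in between. A toy cache-eviction policy looks like this (my paraphrase of the general technique, not the show's or OpenAI's exact mechanism):

```python
# Toy streaming KV-cache eviction with attention sinks: keep the
# first few "sink" tokens plus the most recent window, and drop the
# middle. A paraphrase of the general technique, not OpenAI's code.

def evict(cache: list, num_sinks: int, window: int) -> list:
    """Keep sink tokens and the recent window; drop the middle."""
    if len(cache) <= num_sinks + window:
        return cache
    return cache[:num_sinks] + cache[-window:]

tokens = list(range(12))              # a long-running chat, token ids 0..11
kept = evict(tokens, num_sinks=2, window=4)
print(kept)  # [0, 1, 8, 9, 10, 11]
```

Keeping the first tokens matters because attention scores need somewhere stable to "sink" when old context is dropped; evicting them tends to destabilize generation.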
Ejaaz (16:22): Yeah. I mean, they actually have a history of front-running open-source frontier breakthroughs. If you remember when DeepSeek got deployed, Josh, one of their primary training methods was reinforcement learning, which was pioneered by an OpenAI researcher who probably works at Meta now. And I was looking at the breakthrough you mentioned, sliding window attention, and you mentioned that it can basically toggle reasoning. I was pleasantly surprised to notice that on the actual interface of the models here, Josh. Can you see, over here? You can toggle between reasoning levels of high, medium, and low. So it depends on what your prompt or query is: if it's a low-level query, like "hey, just record this grocery list," that's probably a medium or low query. It's pretty cool to see that surfaced to the user and actively being used.

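With open weights, that reasoning toggle is reportedly just a line in the system prompt rather than a separate API parameter. A minimal sketch of building such a prompt; the exact "Reasoning: <level>" convention is my assumption here, so check the model's official prompting guide before relying on it:

```python
# Minimal sketch of selecting reasoning effort for a local
# open-weights chat model via the system prompt. The exact
# "Reasoning: <level>" line is an assumption in this sketch;
# consult the model's official prompting guide for the real format.

VALID_LEVELS = ("low", "medium", "high")

def build_system_prompt(instructions: str, reasoning: str = "medium") -> str:
    if reasoning not in VALID_LEVELS:
        raise ValueError(f"reasoning must be one of {VALID_LEVELS}")
    return f"{instructions}\nReasoning: {reasoning}"

prompt = build_system_prompt("You are a helpful assistant.", reasoning="low")
print(prompt)
```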
Josh (17:15): Yeah, super cool. I like the fine-tuning of it, letting you choose your intelligence level, because I imagine a lot of average queries just don't need that much compute. If you can toggle it to the low reasoning level and still get your answers, that's amazing. Super fast, super cheap.

Ejaaz (17:32): Did you see that trending tweet earlier this week, Josh, which said that the majority of ChatGPT users have never used a model other than GPT-4o?

Josh (17:42): I haven't seen it, but that makes sense.

Ejaaz (17:43): Yeah, I feel like that's the bulk of people. I was chatting with my sister yesterday; she was using it for a research project at work, and the screenshot she sent me was 4o. I was like, hey, you know you could just run this on a model that's five times better than this, right? It would come up with a much more creative set of ideas. It just made me think that I don't know how many people care about these brand-new novel models; maybe this kind of basic model is good enough for everyone, I don't know. But moving on, Josh, there was a big question that popped into my head as soon as these models released, which was: are they as good as the Chinese open-source models? I wanted to get some opinions from people, and, to give the listeners some context on why this matters: China has been the number one nation putting out the best open-source models over the last 12 months. It started with DeepSeek, then Alibaba's Qwen models got involved, and then recently we had Kimi K2, and I think there was another AI lab out of China as well. Outside of America, they have the highest density of top AI researchers; they largely come out of one university, Tsinghua, I believe, and they partially work or train in the U.S. as well, so they have this kind of hybrid mentality of how to build these models, and they come up with a lot of these frontier breakthroughs. Kimi K2, for context, had one trillion parameters in its model. Comparing that to the 120 billion and 20 billion parameter models from OpenAI, I was curious: does this beat them to the punch?

Ejaaz: Don't think so okay this guy jason lee (19:20):
undefined

Ejaaz: he asks uh is the gpt oss stronger (19:23):
undefined

Ejaaz: than quen or kimmy or chinese open models and then (19:27):
undefined

Ejaaz: he later kind of quote tweets that tweet and says answer the model is complete (19:30):
undefined

Ejaaz: junk it's a hallucination machine overfit to reasoning benchmarks and has absolutely (19:35):
undefined

Ejaaz: zero recall ability so a few things he's mentioning here is one it hallucinates (19:40):
undefined

Ejaaz: a lot so it kind of makes up jargon terms, (19:45):
undefined

Ejaaz: ideas, or parameters that didn't really exist before. (19:48):
undefined

Ejaaz: Number two, he's saying that OpenAI designed this model purely so that it will (19:51):
undefined

Ejaaz: do well on the exams, which are the benchmarks that rate how these models compare to each other. (19:55):
undefined

Ejaaz: So they're saying that OpenAI optimized the model to kind of like do really (20:01):
undefined

Ejaaz: well at those tests, but actually fail at everything else, which is what people want to use it for. (20:06):
undefined

Ejaaz: And the final point that he makes is that it has zero recall ability, (20:10):
undefined

Ejaaz: which is something you mentioned earlier, Josh, which says it doesn't have memory (20:13):
undefined

Ejaaz: or context so you can have a conversation and then open up another conversation (20:16):
undefined

Ejaaz: and it's completely forgotten about the context that it has for you from that (20:20):
undefined

Ejaaz: initial conversation okay. (20:23):
undefined

Josh (20:25): So, not the best. Not to be unfair to OpenAI, but it feels like they delayed this model a good number of times, and they wanted it to look good. It intuitively makes sense to me that they would be optimizing for benchmarks with this one. But nonetheless, it's still impressive. I'm seeing this big wall of text now. What is this post here?

Ejaaz (20:44): Well, it's this post from one of the accounts I follow, and they have an interesting section here which says "comparison to other open weights."

Josh (20:52): Oh, sick. Yeah, what is this?

Ejaaz (20:53): So he goes: while the larger GPT-OSS 120 billion parameter model does not come in above DeepSeek R1 (so he's saying DeepSeek R1 just beats it out of the park), it is notable that it is significantly smaller in both total and active parameters than both of those models. DeepSeek R1 has 671 billion total parameters and 37 billion active parameters and was released natively, which makes it more than five times larger than GPT-OSS's 120 billion parameter model. So what he's saying is, even though GPT-OSS's model is smaller and doesn't perform as well as DeepSeek, it's still mightily impressive for its size.

Josh (21:32): Okay, that's cool, because that gets back to the point we made earlier in the show: this is probably the most densely condensed, however you want to say it, base of knowledge in the world. They've used a lot of efficiency gains to squeeze the most out of it. So maybe we can make up a metric here on the show, something like output per parameter: based on the total parameter count of this model, it gives you the best value per token. And that seems to be where this falls in line. It's not going to blow any other open-source model out of the water, but in terms of its size, the fact that we can take a phone and literally run one of these models on it, and you could go anywhere in the world with no service and have access to these models running on a laptop or whatever mobile device, that's super powerful. And that's not something that's easy to do with the other open-source models. So perhaps that's the advantage OpenAI has: the density of intelligence and the efficiency of the parameters they've given us, rather than being a home-run open-source model that's going for the frontier. It's just a little bit of a different approach.
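The made-up "output per parameter" metric can be sketched concretely. This is a minimal toy, not anything from the episode: the benchmark scores below are illustrative placeholders, and only the parameter counts echo figures discussed here (the ~5.1B active figure for gpt-oss-120b is OpenAI's reported number, not something stated on the show).

```python
# Toy "output per parameter" metric in the spirit of what Josh proposes:
# benchmark points per billion *active* parameters. Scores are illustrative
# placeholders, not real benchmark results.
models = {
    # name: (illustrative_score, active_params_in_billions)
    "gpt-oss-120b": (80.0, 5.1),   # ~5.1B active, per OpenAI's reported figure
    "deepseek-r1": (85.0, 37.0),   # 37B active of 671B total, per the quoted post
}

def score_per_active_param(score: float, active_b: float) -> float:
    """Benchmark points per billion active parameters."""
    return score / active_b

# Rank models by this efficiency ratio, best first.
for name, (score, active_b) in sorted(
    models.items(),
    key=lambda kv: score_per_active_param(*kv[1]),
    reverse=True,
):
    print(f"{name}: {score_per_active_param(score, active_b):.1f} pts per B active")
```

Even with a lower raw score, the smaller model wins on this ratio, which is exactly the "small but mighty" point being made above.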

Ejaaz (22:40): Yeah, we need a small-but-mighty ranking on this show, Josh, that we can run every week when these companies release a new model.

No, but it got me thinking. If we zoom out from that question, because we're talking about small models versus large models, and how effectively their parameters are used compared to bigger models: what really matters here, Josh? In my opinion, it's user experience and how useful these models are to my daily life. At the end of the day, I don't really care what size the model is unless it's useful for me. It could be small, it could be personal, it could be private; it depends on the use case at the time. And given how technology typically trends, I have a feeling you eventually want a really high-performing small model.

I try to think about us using computers for the first time, back in our dinosaur age, and then it all being condensed onto a tiny metal slab that we now use every day and can work from remotely, pretty much wherever. I feel like this is where models are going to go. They're going to become more private and more personal. Maybe it'll be a combination of running locally on your device versus cloud inference and trusting certain providers. I don't know how it's going to fall out, but it's not a zero-or-one, black-or-white situation. I don't think everyone's just going to go with large centralized models they inference from the cloud. I think it'll be a mixture of both. How that materializes, I don't know, but it's an interesting one to ponder.

Josh (24:12): Yeah, this is funny. It's going to sound very ironic, but Apple was the one that got this most right.

Ejaaz (24:19): Sorry, who's Apple again?

Josh (24:21): Yeah, right. I mean, it sounds ridiculous to say, and granted, they did not execute on it at all, but in theory I think they nailed the approach initially, which was: run local compute where all of your stuff is. My iPhone is the device I never leave without. It is everything about me: all of my messages, my contacts, all the context you could ever want from me. And the idea was they would give you a local model integrated and embedded into that operating system, and if anything requires more compute, they'll send the query off to the cloud. But most of it gets done on your local device, because most of it isn't that complicated. And as a user, when I ask myself what I want from AI, I just want it to be my ultimate assistant, there to make my life better, and so much of that is the context.

Apple going with that model would have been incredible. It would have been so great: a lightweight model that runs locally, has all the context of your life, and then offloads to the cloud. I still think this model is probably the correct one for optimizing the user experience. But unfortunately, Apple just has not done it, so it's up for grabs.

Again, Sam Altman has been posting a lot this week, and we do have to tease what's coming, because this is probably going to be a huge week. There's a high probability we get GPT-5. They've also been talking about their hardware device a little bit, saying it's genuinely going to change the world. And I believe the reason why is that they're taking this Apple approach: they're building the operating system, they're gathering the context, and they're able to serve it locally on device and go to the cloud when they need more compute. It's going to create this really cool duality of AI, where you have your super-private local model, and then you have the big-brain one, the big brother off in the cloud that does all the hard computing for you.
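The hybrid setup described here amounts to a simple router: answer most queries with a small on-device model and fall back to a cloud endpoint only when the request looks heavy. A minimal sketch follows; the heuristic, thresholds, and the `run_local`/`run_cloud` helpers are all hypothetical stand-ins, not any real vendor's API.

```python
# Hypothetical local-first, cloud-fallback router. The heuristic and the
# model back-ends are made up for illustration; a real system would route
# on learned signals, not keyword matching.

def looks_heavy(prompt: str, max_local_words: int = 200) -> bool:
    """Crude heuristic: long prompts or explicitly 'deep' requests go to the cloud."""
    heavy_markers = ("analyze", "prove", "summarize this document")
    long_prompt = len(prompt.split()) > max_local_words
    return long_prompt or any(m in prompt.lower() for m in heavy_markers)

def run_local(prompt: str) -> str:
    # Placeholder for a small on-device model (e.g. a 20B-class open-weights model).
    return f"[local model] {prompt[:40]}"

def run_cloud(prompt: str) -> str:
    # Placeholder for a large hosted model reached over the network.
    return f"[cloud model] {prompt[:40]}"

def route(prompt: str) -> str:
    """Local-first: only offload when the query looks too heavy for the device."""
    return run_cloud(prompt) if looks_heavy(prompt) else run_local(prompt)

print(route("What's on my calendar tomorrow?"))            # stays on device
print(route("Analyze the tradeoffs in this 50-page spec")) # offloaded to cloud
```

The design point is the default: private, personal queries never leave the device, and the cloud is the exception rather than the rule.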

Ejaaz (25:58): Well, one thing is clear: there are going to be hundreds of models, and that's going to benefit the user, you and I, in so many ways. It's the big companies' problem to figure out how these models work together and which ones get queried. I don't care. Just give me the good stuff and I'm going to be happy.

Folks, OpenAI has been cooking. These are the first open-source models they've released in six years, Josh. The last one was GPT-2 in 2019, which seems like the stone age, and it was only six years ago.

Thank you so much for listening. We are pumped to be talking about GPT-5, which we hope will be released in maybe 24 hours, hopefully this week, fingers crossed. I don't know, we might be back on this camera pretty soon. Stay tuned. Please like, subscribe, and watch out for all the updates. We're going to release a bunch of clips as well if you want to get straight to the juicy bits. Share this with your friends and give us feedback. If you want to hear about different things, things we haven't covered yet, things we've spoken about that you want more clarity on, or guests you'd like to join the show, let us know. We're going full force on this, and we'll see you on the next one.

Josh (27:02): Sounds good. See you guys soon. Peace.

[Music] (27:04)