OpenAI Just Gave Away Their Secret Formula... For Free?

Episode Transcript
Available transcripts are automatically generated. Complete accuracy is not guaranteed.
Josh: The unthinkable has just happened open ai (00:03):
undefined

Josh: has released an open source model open ai (00:06):
undefined

Josh: has been closed ai since the time that i knew them (00:09):
undefined

Josh: they have been named themselves open ai they were not (00:12):
undefined

Josh: open source they have finally released an open source model and surprise surprise (00:15):
undefined

Josh: it's actually really great and i think the downstream implications of an open (00:18):
undefined

Josh: source model from a company like this that is this good are really it's a really (00:22):
undefined

Josh: big deal i think this really matters a lot just yesterday they announced the release of GPT-OSS. (00:26):
undefined

Josh: There are two models. There's a 120 billion parameter model and there's a 20 (00:32):
undefined

Josh: billion parameter model. We're going to get into benchmarks. (00:36):
undefined

Josh: We're going to get into how good they are. (00:38):
undefined

Josh: But the idea is that OpenAI has actually released an open source model. (00:40):
undefined

Josh: And this can compare to the Chinese models because we recently had DeepSeek and we've had Kimi. (00:44):
undefined

Josh: And those would be very good. But this is the first really solid American-based open source model. (00:49):
undefined

Josh: So Ijaz, I know you've been kind of digging in the weeds about how this works. (00:54):
undefined

Josh: Can you explain us exactly why this is a big deal why this happened what's going on here (00:57):
undefined

Ejaaz: Yeah it's it's pretty huge so so here (01:00):
undefined

Ejaaz: are the hot highlights um as you mentioned there's two (01:03):
undefined

Ejaaz: models that came out the 20 billion parameter model which is actually small (01:06):
undefined

Ejaaz: enough to run on your mobile phone right now and they have a 120 billion parameter (01:09):
undefined

Ejaaz: model which is big but still small enough to run on a high performance laptop (01:14):
undefined

Ejaaz: so if you guys have a macbook out there jump in go for it um it's fully customizable. (01:19):
undefined

Ejaaz: So remember, open source means (01:25):
undefined

Ejaaz: that you can literally have access to the design of the entire model. (01:26):
undefined

Ejaaz: It's like OpenAI giving away their secret recipe to how their frontier models (01:30):
undefined

Ejaaz: work. And you can kind of like recreate it at home. (01:34):
undefined

Ejaaz: This means that you can customize it to any kind of use case that you want, (01:37):
undefined

Ejaaz: give it access to all your personal hard drives, tools, data, (01:41):
undefined

Ejaaz: and it can do wonderful stuff. (01:44):
undefined

Ejaaz: But Josh, here's the amazing part. (01:46):
undefined

Ejaaz: On paper, these models are as good as GPT-4 mini models, which is, (01:48):
undefined

Ejaaz: it's pretty impressive, right? (01:54):
undefined

Ejaaz: In practice and i've been playing around with it for the last few hours they're (01:56):
undefined

Ejaaz: as good in my opinion and actually quicker than (02:00):
undefined

Ejaaz: gpt-03 which is their frontier model and i (02:03):
undefined

Ejaaz: mean this across like everything so (02:06):
undefined

Ejaaz: reasoning um it spits out answers super quickly and i can see its reasoning (02:09):
undefined

Ejaaz: it happens in like a couple of seconds and i'm so used to waiting like 30 seconds (02:14):
undefined

Ejaaz: to a couple minutes on gpt-03 josh so it's pretty impressive and an insane unlock (02:18):
undefined

Ejaaz: on coding it's as good and on creativity as well. (02:23):
undefined

Ejaaz: So I'm my mind's pretty blown at all of this, right? Josh, what do you what do you think? (02:28):
undefined

Josh: Yeah, so here's why it's impressive to me is because a lot of the times I don't (02:32):
undefined

Josh: really care to use the outer bands of what a model is capable of. (02:35):
undefined

Josh: Like I am not doing deep PhD level research. I'm not solving these math Olympiad questions. (02:38):
undefined

Josh: I'm just trying to ask it a few normal questions and get some answers. (02:43):
undefined

Josh: And what these models do is an excellent job at serving that need. (02:46):
undefined

Josh: They're not going to go out and solve the world's hardest problems, (02:49):
undefined

Josh: but neither do I. I don't want to solve those problems. (02:51):
undefined

Josh: I just kind of want the information that I want, whether it be just a normal (02:54):
undefined

Josh: Google type search or whether it be asking it some miscellaneous question about (02:57):
undefined

Josh: some work that I'm doing. (03:01):
undefined

Josh: It's really good at answering that. So I think initial impressions, (03:03):
undefined

Josh: because they did allow you to test it publicly through their website, (03:07):
undefined

Josh: it's just really good at the things that I want. (03:10):
undefined

Josh: So the fact that I can run one of these models on a local device on my iPhone, (03:12):
undefined

Josh: well, it feels like we're reaching this place that AI is starting to become (03:16):
undefined

Josh: really interesting because for so long we've had compute handled fully on the (03:19):
undefined

Josh: cloud and now this is the first time where (03:23):
undefined

Josh: Compute can really happen on your computer. It could happen on your laptop. (03:26):
undefined

Josh: I could download the model and I could actually store the model, (03:28):
undefined

Josh: the 120 billion parameter model on a 56 gigabyte USB drive. (03:31):
undefined

Josh: So you can take the collective knowledge of the world and put it on a tiny little USB drive. (03:35):
undefined

Josh: And granted, it needs a bit of a bigger machine to actually run those parameters, (03:40):
undefined

Josh: but you can install all the weights. It's 56 gigabytes. (03:43):
undefined

Josh: It's this incredibly powerful package. And it probably, I don't know if this (03:46):
undefined

Josh: is true, but it's probably the most condensed knowledge base in the history of humanity. (03:51):
undefined

Josh: They've really managed to take a tremendous amount of tokens, (03:55):
undefined

Josh: smush them into this little parameter set, and then publish it for people to (03:59):
undefined

Josh: use. So for me, I'm really excited. (04:02):
undefined

Josh: I like having my own mini portable models. I am excited to download, (04:04):
undefined

Josh: try it out, run it on my MacBook. (04:08):
undefined

Josh: I'm not sure I could run the 120 billion parameter model, but at least the 20B (04:09):
undefined

Josh: and give it a shot and see how it works. (04:12):
undefined

Ejaaz: You need to get the latest MacBook, Josh. I know, I got to upgrade. We can test that out. (04:14):
undefined

Ejaaz: What I also love about it is it's fully private, right? So you can give it access (04:19):
undefined

Ejaaz: to your personal hard drive, your Apple Notes, whatever you store on your computer, basically. (04:24):
undefined

Ejaaz: And you can basically instruct the model to use those different tools. (04:31):
undefined

Ejaaz: So one review that I keep seeing from a number of people who have been testing (04:36):
undefined

Ejaaz: it so far is that it's incredibly great and intuitive at tool use. (04:39):
undefined

Ejaaz: And the reason why this is such a big deal is a lot of the Frontier models right (04:44):
undefined

Ejaaz: now, when they allow you to give access to different tools, they're kind of clunky. (04:48):
undefined

Ejaaz: The model doesn't actually know when to use a specific tool and when not to. (04:53):
undefined

Ejaaz: But these models are super intuitive, which is great. The privacy thing is also (04:56):
undefined

Ejaaz: a big thing because you kind of (05:00):
undefined

Ejaaz: don't want to be giving all your personal information away to Sam Altman. (05:02):
undefined

Ejaaz: But you want a highly personalized model. (05:05):
undefined

Ejaaz: And I think if I was to condense this entire model release in a single sentence, (05:08):
undefined

Ejaaz: Joss, I think I would say it is the epitome of privacy and personalization in an AI model so far. (05:13):
undefined

Ejaaz: It is that good. it is swift it is cheap and I'm going to replace it completely (05:20):
undefined

Ejaaz: with all my GPT-4.0 queries as you said earlier like, (05:26):
undefined

Ejaaz: Who needs to use the basic models anymore when you have access to this? (05:30):
undefined

Josh: Yeah. So it's funny you say that you're going to swap it because I don't think I'm going to swap it. (05:34):
undefined

Josh: I still am not sure I personally have a use case right now because I love the (05:38):
undefined

Josh: context. I want the memory. (05:41):
undefined

Josh: I like having it all server side where it kind of knows everything about me. (05:42):
undefined

Josh: I guess in the case that I wanted to really make it a more intimate model experience (05:46):
undefined

Josh: where you want to sync it up with like journal entries or your camera roll or (05:50):
undefined

Josh: whatever, whatever interesting like personal things, this would be a really cool use case. (05:55):
undefined

Josh: I think for the people who are curious why this matters to them, (05:59):
undefined

Josh: well, we could talk a little briefly about like the second order effects of (06:03):
undefined

Josh: having open source models as powerful, because what that allows you to do is (06:07):
undefined

Josh: to serve queries from a local machine. (06:10):
undefined

Josh: So if you are using an app or let's say you're an app developer and you're building (06:13):
undefined

Josh: an application and your app is serving millions of requests because it's a GPT wrapper. (06:17):
undefined

Josh: Well, what you could do now is instead of paying API calls to the OpenAI server, (06:22):
undefined

Josh: you can actually just run your own local server, use this model, (06:26):
undefined

Josh: and then serve all that data for the cost of the electricity. (06:29):
undefined

Josh: And that's a really big unlock for the amount of compute that's going to be (06:31):
undefined

Josh: available for not only developers, but for the cost of the users in a lot of these applications. (06:34):
undefined

Josh: So for the applications that aren't doing this crazy moon math and that are (06:38):
undefined

Josh: just kind of serving basic queries all day long, this like really significantly drops the cost. (06:42):
undefined

Josh: It increases the privacy, like you mentioned. And there's a ton of really important (06:47):
undefined

Josh: upsides to open source models that we just haven't seen up until now. (06:51):
undefined

Josh: And I'm very excited to see come forward. (06:54):
undefined

Ejaaz: Well, Josh, the thing with most of these open source models, (06:58):
undefined

Ejaaz: we spoke about actually two major Chinese open source models that were released last week. (07:01):
undefined

Ejaaz: It's not accessible to everyone. Like you and me aren't necessarily going to (07:06):
undefined

Ejaaz: go to Hugging Face, a completely separate website, download these models, (07:10):
undefined

Ejaaz: run the command line interface. (07:14):
undefined

Ejaaz: Most of the listeners on the show doesn't even know what that means. (07:16):
undefined

Ejaaz: I don't even know if I know what that means, right? (07:18):
undefined

Ejaaz: But here you have a lovely created website where you could just kind of log (07:20):
undefined

Ejaaz: on and play around with these open source models. And that's exactly what I've been doing. (07:25):
undefined

Ejaaz: I actually have a few kind of demo queries that I ran yesterday, Josh. (07:28):
undefined

Josh: Yeah, walk us through, let's see. (07:34):
undefined

Ejaaz: Okay, so there's an incredibly complex test, which a lot of these AI models, (07:35):
undefined

Ejaaz: which cost hundreds of billions of dollars to train, can't quite answer. (07:41):
undefined

Ejaaz: And that is how many R's, the letter R's are there in the word strawberry? Most say two. (07:45):
undefined

Josh: The bar's on the floor, (07:52):
undefined

Ejaaz: Huh? Yeah, if we were to go with most models, they say two. They're convinced that they are only two. (07:53):
undefined

Ejaaz: And I ran that test today, rather yesterday, with these open source models, (07:58):
undefined

Ejaaz: and it correctly guessed three, Josh. So we're one for one right now. (08:03):
undefined

Josh: We're on our way. (08:07):
undefined

Ejaaz: But then I was like, okay, we live in New York City. I love this place. (08:08):
undefined

Ejaaz: I'm feeling a little poetic today. Can you write me a sonnet? (08:12):
undefined

Ejaaz: And my goal with this wasn't to test whether it could just write a poem. (08:16):
undefined

Ejaaz: It was to test how quickly it could figure it out. (08:19):
undefined

Ejaaz: And as you see it thought for a couple of seconds on this so it literally spat (08:22):
undefined

Ejaaz: this out in two seconds um and it was structured really well you know it kind (08:26):
undefined

Ejaaz: of flowed would i be you know reciting this out loud to the public no but you (08:30):
undefined

Ejaaz: know i was pretty impressed. (08:34):
undefined

Ejaaz: And then, Josh, I was thinking, you know, what's so unique about open source models? (08:35):
undefined

Ejaaz: You just went through a really good list of why open source models work. (08:40):
undefined

Ejaaz: But I was curious as to why these specific open source models were better than (08:44):
undefined

Ejaaz: other open source models or maybe even other centralized models. (08:48):
undefined

Ejaaz: So I wrote a query. I decided to ask it. I was like, you know, (08:51):
undefined

Ejaaz: tell me some things that you could do that are the larger centralized models. (08:54):
undefined

Ejaaz: And I spat out a really good list. I'm not going to go through all of them, (08:57):
undefined

Ejaaz: but, you know, some of the things that we've highlighted so far, you can fine tune it. (09:00):
undefined

Ejaaz: It's privacy. See, I really like this point that it made, Josh, (09:03):
undefined

Ejaaz: that it just shows that AI is probably getting smarter than us, (09:06):
undefined

Ejaaz: which is you can custom inject your own data into these models. (09:08):
undefined

Ejaaz: Now, without kind of digging deeper into this, when you use a centralized model, (09:13):
undefined

Ejaaz: it's already pre-trained on a bunch of data that companies like Anthropic and (09:18):
undefined

Ejaaz: Google have already fed it. (09:23):
undefined

Ejaaz: And so it's kind of formed its own personality, right? (09:25):
undefined

Ejaaz: So you can't change the model's personality on a centralized model. (09:29):
undefined

Ejaaz: But with an open model you have full reign to do whatever you want and so if (09:32):
undefined

Ejaaz: you were feeling kind of uh adventurous you could use your own data and make (09:37):
undefined

Ejaaz: it super personal and customizable so i thought that was really cool and fun (09:41):
undefined

Ejaaz: demo josh have you been playing around with this. (09:44):
undefined

Josh: Yeah it's um it's it's smart it's fun it's smart i wouldn't say it's anything (09:46):
undefined

Josh: novel the like query results that i get are you know on par with everything (09:50):
undefined

Josh: else i don't notice the difference which is good because it means they're performing (09:54):
undefined

Josh: very well it's not like i feel like i'm getting degraded performance because (09:57):
undefined

Josh: I'm using a smaller model. (10:01):
undefined

Josh: But it's just like it's nothing too different, I would say. (10:02):
undefined

Josh: The differences, I mean, again, all this boils down to the differences of it (10:06):
undefined

Josh: being open source versus being (10:09):
undefined

Ejaaz: Run on the server. Well, let me challenge you that, right? OK, (10:10):
undefined

Ejaaz: so you're saying it's good but nothing novel. (10:13):
undefined

Ejaaz: Would you say it's as good as GPT-4.0, (10:16):
undefined

Ejaaz: minus the memory let's just put memory aside for a second would you use it if (10:20):
undefined

Ejaaz: it had memory capability. (10:24):
undefined

Josh: Actually no probably not um i still wouldn't (10:26):
undefined

Josh: because i love my desktop application too much i (10:29):
undefined

Josh: love my mobile app too much and i like that the conversations are (10:32):
undefined

Josh: shared in the cloud um so i can use them on my phone i could (10:35):
undefined

Josh: start on my laptop and go back and forth so even in (10:38):
undefined

Josh: that case i'm probably still not a user um because (10:41):
undefined

Josh: the convenience factor but there are there are a (10:44):
undefined

Josh: lot of people and a lot of industries that would be and this is actually something probably (10:47):
undefined

Josh: worth surfacing is the new industries that are now able to (10:50):
undefined

Josh: benefit from this because a lot of industries have (10:52):
undefined

Josh: a tough time using these AI models because (10:55):
undefined

Josh: of the data privacy concerns particularly I mean if you think about a (10:58):
undefined

Josh: healthcare industry people who are dealing with patients data it's (11:01):
undefined

Josh: very challenging for them to fork it over to open AI and just trust that they're (11:04):
undefined

Josh: going to keep it safe so what this does is it actually allows companies that (11:07):
undefined

Josh: are in like the healthcare industry the finance industry who's dealing with (11:11):
undefined

Josh: very like high touch personal finance the legal industry who's dealing with (11:13):
undefined

Josh: a lot of legality government and defense a lot of these industries that were (11:16):
undefined

Josh: not previously able to use these popular AI models, (11:20):
undefined

Josh: well, now they have a pretty good model that they could run locally on their machines. (11:24):
undefined

Josh: And that doesn't have any possibility of actually leaking out their customer (11:27):
undefined

Josh: data, leaking out financials or healthcare data or, or like any sort of legal documents. (11:31):
undefined

Josh: And, and that feels like a super powerful unlock. So for them, (11:35):
undefined

Josh: it feels like a no brainer, obviously get the 120 B model running on a local (11:38):
undefined

Josh: machine inside of your office, and you can load it up with all this context. (11:43):
undefined

Josh: And that seems to be who this would be most impacting, right? (11:46):
undefined

Ejaaz: But still to that point, I wonder how many of these companies can be bothered (11:51):
undefined

Ejaaz: to do that themselves and run their own internal kind of like infrastructure. (11:56):
undefined

Ejaaz: I'm thinking about OpenAI, who cracked, I think, $10 billion in annual recurring (12:01):
undefined

Ejaaz: revenue this week, which is like a major milestone. (12:06):
undefined

Ejaaz: And a good chunk of that, I think 33% of that is for enterprise customers. (12:09):
undefined

Ejaaz: And to your point, like these enterprise customers don't wanna be giving open (12:14):
undefined

Ejaaz: AI their entire data. You know, they can be used to train other AI models. (12:18):
undefined

Ejaaz: So their fix or solution right now is they use kind of like private cloud instances, (12:21):
undefined

Ejaaz: that I think are supplied by Microsoft by their Azure cloud service or something like that. (12:28):
undefined

Ejaaz: And I wonder if they chose that, (12:32):
undefined

Ejaaz: One, because there wasn't any open source models available or because they kind (12:35):
undefined

Ejaaz: of just want to offload that to Microsoft to deal with. (12:40):
undefined

Ejaaz: My gut tells me they're going to want to go with the latter, (12:42):
undefined

Ejaaz: which is like, you know, just give it to some kind of cloud provider to deal with themselves. (12:46):
undefined

Ejaaz: And they just trust Microsoft because it's a big brand name. (12:50):
undefined

Ejaaz: But yeah, I don't really know how they'll materialize. I still think, (12:52):
undefined

Ejaaz: and maybe this is because of my experience in crypto, Josh, that the open source (12:55):
undefined

Ejaaz: models are still for like people that are at the fringe that are really experimenting (13:00):
undefined

Ejaaz: with these things. but maybe don't have billions of dollars. (13:04):
undefined

Josh: Yeah, that could be right. It'll be interesting to see how it plays out on all (13:07):
undefined

Josh: scale of businesses because I mean, as a, like I think of a lot of indie devs (13:10):
undefined

Josh: that I follow on Twitter and I see them all the time (13:15):
undefined

Josh: just running local servers and they just, if they had this local model that (13:17):
undefined

Josh: they could run on their machine and it takes the cost per query down from like (13:20):
undefined

Josh: a penny to zero, that's like a big zero to one change. (13:24):
undefined

Josh: So he does this model special because there are also a number of breakthroughs (13:28):
undefined

Josh: that occurred in order to make this possible, (13:32):
undefined

Josh: in order to condense this knowledge to be so tight so here's this (13:34):
undefined

Josh: tweet from the professor talking about the cool tech tweaks in (13:37):
undefined

Josh: this new model and what open ai was able to achieve some of (13:40):
undefined

Josh: these i believe are novel some of these are seen before um if (13:43):
undefined

Josh: you look at point two mixture of experts we're familiar with mixture of experts (13:46):
undefined

Josh: we've seen other companies use that like kimmy and deep (13:49):
undefined

Josh: seek basically instead of one brain doing everything the ai (13:52):
undefined

Josh: has this team of experts that are kind of like mini brains (13:54):
undefined

Josh: and specialize in different tasks it picks the right expert for (13:57):
undefined

Josh: the job and it makes it faster so like instead of (14:00):
undefined

Josh: having the entire 120 million parameter model search for one question maybe (14:03):
undefined

Josh: you just take a couple million of those parameters that are really good at solving (14:08):
undefined

Josh: math problems and they use it and that that's what brings compute down the first (14:11):
undefined

Josh: point is this thing called the sliding window attention so if you imagine an (14:15):
undefined

Josh: ai is like reading a really long book (14:18):
undefined

Josh: It can only focus on a few pages at a time this trick (14:21):
undefined

Josh: kind of lets it slide its focus window along the text so (14:24):
undefined

Josh: when you think of a context window generally it's fixed right where you can see (14:27):
undefined

Josh: a fixed set of data this sliding window (14:30):
undefined

Josh: attention allows you to kind of move that context back and forth a (14:33):
undefined

Josh: little bit so it takes what would have normally been (14:36):
undefined

Josh: a narrow context window and extends it out a little bit to (14:39):
undefined

Josh: the side so you get a little bit more context which is great for a (14:42):
undefined

Josh: smaller model again you really want to consider that all of these are (14:44):
undefined

Josh: are optimized for this microscopic scale that (14:47):
undefined

Josh: can literally run on your phone and then the third point is this (14:50):
undefined

Josh: thing called rope with yarn which sounds like a cat toy but this (14:52):
undefined

Josh: is how the ai keeps track of the order of words so like the position (14:56):
undefined

Josh: of the words in a sentence um so rope (14:59):
undefined

Josh: you could imagine it like like the twisty math way to do (15:02):
undefined

Josh: it and yarn makes it stretch further for really long stuff (15:05):
undefined

Josh: so we have the context window that is (15:07):
undefined

Josh: sliding we have this rope with yarn that allows you (15:10):
undefined

Josh: to just kind of like stretch the words a little bit further and (15:13):
undefined

Josh: then we have attention sinks which is the last one which is (15:16):
undefined

Josh: there's a problem when ai is dealing with these endless chats that (15:19):
undefined

Josh: lets it it kind of sinks in or ignores the boring old (15:22):
undefined

Josh: info so it can pay attention to the new stuff so basically what it (15:25):
undefined

Josh: is is if you're having a long chat with it and it determines hey this stuff (15:28):
undefined

Josh: is kind of boring i don't need to remember it it'll actually just throw it away (15:32):
undefined

Josh: and it'll increase that context window a little bit so again hyper optimizing (15:35):
undefined

Josh: for for the small context window that it has and those are kind of the key four (15:39):
undefined

Josh: breakthroughs that made this special again i'm not sure any of them are particularly novel, (15:42):
undefined

Josh: But when combined together, that's what allows you to get these 04 mini results (15:47):
undefined

Josh: or even 03 results on the larger model on something that can run locally on your laptop. (15:52):
undefined

Josh: So it's a pretty interesting set of breakthroughs. I think a lot of times OpenAI, (15:56):
undefined

Josh: we talk about them because of their feature breakthroughs, not really their (16:01):
undefined

Josh: technical breakthroughs. (16:04):
undefined

Josh: I think a lot of times the technical breakthroughs are reserved for like the (16:05):
undefined

Josh: Kimi models or the DeepSeq models (16:08):
undefined

Josh: where they really kind of break open the barrier of what's possible. (16:09):
undefined

Josh: But I don't want to discredit OpenAI because these are pretty interesting things (16:12):
undefined

Josh: that they've managed to combine together into this like one cohesive, (16:16):
undefined

Josh: tiny little model, and then just gave it away. (16:19):
undefined

Ejaaz: Yeah. I mean, they actually have a history of front-running open source frontier breakthroughs. (16:22):
undefined

Ejaaz: If you remember when DeepSeek got deployed, Josh, one of their primary training (16:29):
undefined

Ejaaz: methods was reinforcement learning, which was pioneered by an open AI researcher, (16:34):
undefined

Ejaaz: which who probably like now works at Meta. (16:38):
undefined

Ejaaz: Yeah, and I was I was I was looking at the feature that you mentioned just not (16:40):
undefined

Ejaaz: the feature, but the breakthrough sliding window attention, and you mentioned (16:46):
undefined

Ejaaz: that it can basically toggle reasoning. (16:48):
undefined

Ejaaz: And I was pleasantly surprised to just notice that on the actual interface of (16:51):
undefined

Ejaaz: the models here, Josh, can you see over here? (16:54):
undefined

Ejaaz: You can toggle between reasoning levels of high, medium and low. (16:57):
undefined

Ejaaz: So depending on what your prompt or query is, if it is kind of like a low level (17:01):
undefined

Ejaaz: query where you're like hey just record this shopping or grocery list you know (17:04):
undefined

Ejaaz: that's probably like a medium or a low query so oh it's pretty cool to to see (17:09):
undefined

Ejaaz: that surface to the user like see it actively being used. (17:12):
undefined

Josh: Yeah, no, super cool. I think I like the fine tuning of it. (17:15):
undefined

Josh: And again, allowing you to kind of choose your intelligence levels, (17:19):
undefined

Josh: because I imagine a lot of average people just don't, a lot of average queries (17:22):
undefined

Josh: just don't need that much compute. (17:25):
undefined

Josh: So if you can toggle it for the low reasoning level and get your answers, (17:27):
undefined

Josh: that that's amazing. Super fast, super cheap. (17:30):
undefined

Ejaaz: Did you see that trending tweet earlier this week, Josh, which basically said (17:32):
undefined

Ejaaz: that the majority of ChatGPT users have never used a different model than ChatGPT 4.0? (17:37):
undefined

Josh: I haven't seen it, but that makes sense. (17:42):
undefined

Ejaaz: Yeah i i feel like the bulk of people i was chatting to (17:43):
undefined

Ejaaz: my sister yesterday and she was kind of (17:46):
undefined

Ejaaz: like using it for some research project at work and the (17:49):
undefined

Ejaaz: screenshot she sent me over was foro and i was like hey you know like (17:52):
undefined

Ejaaz: you could just run this on like a model that's like (17:55):
undefined

Ejaaz: five times better than this right uh we'll come (17:58):
undefined

Ejaaz: up with a much more creative set of ideas so just made me think that (18:01):
undefined

Ejaaz: like i don't know how many people like care that they are like (18:04):
undefined

Ejaaz: these brand new novel models and maybe um you know (18:06):
undefined

Ejaaz: this kind of like basic model is good enough for everyone i don't know (18:09):
undefined

Ejaaz: but um but moving on josh um there (18:12):
undefined

Ejaaz: was a big question that popped into my head as (18:15):
undefined

Ejaaz: soon as these models released which was are they as good (18:18):
undefined

Ejaaz: as the chinese open source models right i wanted (18:21):
undefined

Ejaaz: to get some opinions from people and and the reason (18:24):
undefined

Ejaaz: why this matters i'm just give the listeners some context (18:26):
undefined

Ejaaz: is china has been the number one (18:29):
undefined

Ejaaz: nation to put out the best open source (18:32):
undefined

Ejaaz: models over the last 12 months it started with deep seek (18:35):
undefined

Ejaaz: and then alibaba's quen models got involved (18:38):
undefined

Ejaaz: and then recently we had kimmy k2 and i think (18:42):
undefined

Ejaaz: there was another ai lab out of china which came out so they (18:44):
undefined

Ejaaz: have outside of america the highest density. (18:47):
undefined

Ejaaz: Of the top ai researchers they all come out of this one university (18:51):
undefined

Ejaaz: zinghua i believe they kind of like partially work (18:54):
undefined

Ejaaz: or train in the u.s as well so they've got this like kind of hybrid ai (18:57):
undefined

Ejaaz: mentality of how to build these models and they come up with a lot of these (19:00):
undefined

Ejaaz: frontier breakthroughs um kimmy k2 for context had uh one trillion parameters (19:03):
undefined

Ejaaz: in their model right comparing this to like 120 billion and 20 billion parameters (19:09):
undefined

Ejaaz: models from open air i was curious like does this beat them to the punch some people josh. (19:14):
undefined

Ejaaz: Don't think so okay this guy jason lee (19:20):
undefined

Ejaaz: he asks uh is the gpt oss stronger (19:23):
undefined

Ejaaz: than quen or kimmy or chinese open models and then (19:27):
undefined

Ejaaz: he later kind of quote tweets that tweet and says answer the model is complete (19:30):
undefined

Ejaaz: junk it's a hallucination machine overfit to reasoning benchmarks and has absolutely (19:35):
undefined

Ejaaz: zero recall ability so a few things he's mentioning here is one it hallucinates (19:40):
undefined

Ejaaz: a lot so it kind of makes up jargon terms, (19:45):
undefined

Ejaaz: ideas, or parameters that didn't really exist before. (19:48):
undefined

Ejaaz: Number two, he's saying that OpenAI designed this model purely so that it will (19:51):
undefined

Ejaaz: do well on the exams, which are the benchmarks that rate how these models compare to each other. (19:55):
undefined

Ejaaz: So they're saying that OpenAI optimized the model to kind of like do really (20:01):
undefined

Ejaaz: well at those tests, but actually fail at everything else, which is what people want to use it for. (20:06):
undefined

Ejaaz: And the final point that he makes is that it has zero recall ability, (20:10):
undefined

Ejaaz: which is something you mentioned earlier, Josh, which says it doesn't have memory (20:13):
undefined

Ejaaz: or context so you can have a conversation and then open up another conversation (20:16):
undefined

Ejaaz: and it's completely forgotten about the context that it has for you from that (20:20):
undefined

Ejaaz: initial conversation okay. (20:23):
undefined

Josh: So not not the best not to be unfair to open ai but it feels like they delayed (20:25):
undefined

Josh: this model a good bit of times oh yeah and they wanted it to look good and it (20:30):
undefined

Josh: intuitively makes sense to me that they would be kind of optimizing for benchmarks (20:34):
undefined

Josh: with this one um but nonetheless it's still impressive i'm seeing this big wall (20:38):
undefined

Josh: of text now what is what is this what is this post here (20:41):
undefined

Ejaaz: Well it's this post from uh one of these accounts i follow and they have an (20:44):
undefined

Ejaaz: interesting section here which says comparison to other open weights oh sick. (20:48):
undefined

Josh: Yeah what is this (20:52):
undefined

Ejaaz: So he goes while the larger gpt oss (20:53):
undefined

Ejaaz: 120 billion parameter model does not come (20:56):
undefined

Ejaaz: in above deep seek r1 so he's saying that deep seek r1 (20:59):
undefined

Ejaaz: just beats it out the park it is notable that (21:02):
undefined

Ejaaz: it is significantly smaller in both total and active (21:05):
undefined

Ejaaz: parameters than both of those models deep seek (21:08):
undefined

Ejaaz: r1 has 671 billion total parameters and (21:11):
undefined

Ejaaz: 37 billion active parameters and is released natively right which makes it 10x (21:14):
undefined

Ejaaz: larger than gpt's 120 billion parameter models but what he's saying is even (21:20):
undefined

Ejaaz: though gpt's model is smaller and doesn't perform as well as deep seek it's (21:24):
undefined

Ejaaz: still mightily impressive for its size. (21:29):
undefined

Josh: Okay that's cool because that gets back to the point we made earlier in the (21:32):
undefined

Josh: show that this is probably the most densely condensed (21:35):
undefined

Josh: however you want to say it like base of (21:41):
undefined

Josh: knowledge in the world they've used a lot of efficiency gains (21:43):
undefined

Josh: to squeeze the most out of it so in this small model (21:46):
undefined

Josh: it is i guess if we're optimizing maybe we (21:49):
undefined

Josh: can make up a metric here on the show which is like um output per (21:52):
undefined

Josh: per parameter or something like that like based on the total parameter (21:56):
undefined

Josh: count of this model it gives you the best value per (21:59):
undefined

Josh: token and that seems to be where this falls (22:03):
undefined

Josh: in line where it's not going to blow any other open source model out of the (22:06):
undefined

Josh: water but in terms of its size the fact that we can (22:08):
undefined

Josh: take a phone and literally run one of these models on a phone and (22:11):
undefined

Josh: you could go anywhere in the world with no service and have access to these models running (22:14):
undefined

Josh: on a laptop or whatever mobile device that that's super (22:17):
undefined

Josh: powerful and that's not something that is easy to do with the other open source (22:20):
undefined

Josh: models so perhaps that's the advantage that open ai has it's just the density (22:24):
undefined

Josh: of intelligence and the efficiency of these parameters that they've given to (22:28):
undefined

Josh: us versus just being this like home run open source model that is going for the frontier, (22:32):
undefined

Josh: it's just a little bit of a different approach. (22:37):
undefined

Ejaaz: Yeah, we need like a small but mighty ranking on this show, Josh, (22:40):
undefined

Ejaaz: that we can kind of like run every week when these companies release a new model. (22:44):
undefined

Ejaaz: No, but it got me thinking, if we zoomed out of that question, (22:48):
undefined

Ejaaz: right, because we're talking about small models versus large models, (22:52):
undefined

Ejaaz: parameters and how effectively they use versus other models that are bigger. (22:56):
undefined

Ejaaz: What really matters in this, Josh? In my opinion, it's user experience and how (23:01):
undefined

Ejaaz: useful these models are to my daily life, right? (23:06):
undefined

Ejaaz: At the end of the day, I kind of don't really care what size that model is unless (23:09):
undefined

Ejaaz: it's useful for me, right? It could be small, it could be personal, it could be private. (23:14):
undefined

Ejaaz: It depends on, I guess, the use case at the time. And I have a feeling that (23:17):
undefined

Ejaaz: the trend of how technology typically goes, you kind of want a really high-performant (23:22):
undefined

Ejaaz: small model, eventually. (23:29):
undefined

Ejaaz: Right? I try and think about like us using computers for the first time, (23:31):
undefined

Ejaaz: you know, back in our dinosaur age. (23:35):
undefined

Ejaaz: And then, you know, it all being condensed on a tiny metal slab that we now (23:37):
undefined

Ejaaz: use every day. And we can pretty much work from remotely from wherever. (23:42):
undefined

Ejaaz: And I feel like this is where models are going to go. They're going to become (23:45):
undefined

Ejaaz: more private. They're going to become more personal. (23:47):
undefined

Ejaaz: Maybe it'll be a combination of, you know, it running locally on your device (23:49):
undefined

Ejaaz: versus cloud inference and trusting certain providers. (23:53):
undefined

Ejaaz: I don't know how it's going to fall out, but I think Like it's not a zero to (23:57):
undefined

Ejaaz: one. It's not a black or white situation. (24:01):
undefined

Ejaaz: I don't think everyone's just going to go with large centralized models that (24:03):
undefined

Ejaaz: they can inference from the cloud. (24:06):
undefined

Ejaaz: I think it'll be a mixture of both. And how that materializes, (24:07):
undefined

Ejaaz: I don't know, but it's an interesting one to ponder. (24:09):
undefined

Josh: Yeah, I think this is funny. This is going to sound very ironic, (24:12):
undefined

Josh: but Apple was the person that got this most right. (24:15):
undefined

Ejaaz: Sorry, who's Apple again? (24:19):
undefined

Josh: Yeah, right. I mean, it sounds ridiculous to say this. And granted, (24:21):
undefined

Josh: they did not execute on this at all. (24:24):
undefined

Josh: But in theory, I think they nailed the approach initially, (24:25):
undefined

Josh: which was you run local compute where all of (24:28):
undefined

Josh: your stuff is so my iphone is the device i never (24:31):
undefined

Josh: leave without it is everything about me it is all of my messages my (24:34):
undefined

Josh: contacts all the contacts you could ever want from me and then the idea was (24:37):
undefined

Josh: they would give you a local model that is integrated and embedded into that (24:41):
undefined

Josh: operating system and then if there's anything that requires more compute well (24:44):
undefined

Josh: then they'll send the query off into the cloud but most of it will get done (24:47):
undefined

Josh: on your local device because most of it isn't that complicated and i think as (24:50):
undefined

Josh: a user when i ask myself what i want from AI. (24:54):
undefined

Josh: Well, I just want it to be my ultimate assistant. I just want it to be there (24:57):
undefined

Josh: to make my life better. And so much of that is the context. (25:00):
undefined

Josh: And Apple going with that model would have been incredible. (25:03):
undefined

Josh: It would have been so great. It would have had the lightweight model that runs (25:08):
undefined

Josh: locally, it has all the context of your life, and then it offloads to the cloud. (25:10):
undefined

Josh: I still think this model is probably the correct one for optimizing the user (25:13):
undefined

Josh: experience. But unfortunately, Apple just has not done that. (25:17):
undefined

Josh: So it's up for grabs. I mean, again, Sam Altman's been posting a lot this week, (25:20):
undefined

Josh: we do have to tease what's coming because this is probably going to be a huge (25:24):
undefined

Josh: week. There's a high probability we get GPT-5. (25:27):
undefined

Josh: And then they've also been talking about their hardware device a little bit. And they're saying how (25:30):
undefined

Josh: It's like it's genuinely going to change the world. And I believe the reason (25:34):
undefined

Josh: why is because they're taking this Apple approach where they're building the (25:37):
undefined

Josh: operating system, they're gathering the context, and then they're just they're (25:39):
undefined

Josh: able to serve it now locally on device. (25:42):
undefined

Josh: They're able to go to the cloud when they need more compute. (25:44):
undefined

Josh: And it's going to create this really cool, I think, duality of AI where you (25:47):
undefined

Josh: have your your super private local one, and then you have the big brain one, (25:51):
undefined

Josh: the big brother that's off in the cloud that does all the hard computing for you. (25:55):
undefined

Ejaaz: Well, one thing is clear. There are going to be hundreds of models and it's (25:58):
undefined

Ejaaz: going to benefit the user, you and I, for so many multiple... (26:01):
undefined

Ejaaz: It's the big company's problems to figure out how these models work together (26:05):
undefined

Ejaaz: and which ones get queried. I don't care. (26:09):
undefined

Ejaaz: Just give me the good stuff and I'm going to be happy. (26:12):
undefined

Ejaaz: Folks, OpenAI has been cooking. This was the first open source models they've (26:14):
undefined

Ejaaz: released in six years, Josh. (26:20):
undefined

Ejaaz: The last one was 2019 GPT-2, which seems like the stone age and it was only like four years ago. (26:22):
undefined

Ejaaz: Thank you so much for listening. We are pumped to be talking about GPT-5, (26:29):
undefined

Ejaaz: which we hope to be released in maybe 24 hours. (26:34):
undefined

Ejaaz: Hopefully this week, fingers crossed. I don't know, we might be back on this (26:38):
undefined

Ejaaz: camera pretty soon. Stay tuned. (26:40):
undefined

Ejaaz: Please like, subscribe, and watch out for all the updates. We're going to release (26:42):
undefined

Ejaaz: a bunch of clips as well if you want to kind of like get to the juicy bits as well. (26:45):
undefined

Ejaaz: Share this with your friends and give us feedback. If you want to hear about (26:49):
undefined

Ejaaz: different things, things that we haven't covered yet or things that we've spoken (26:52):
undefined

Ejaaz: about, but you want to get more clarity on or guests that you want to join the show, let us know. (26:55):
undefined

Ejaaz: We're going full force on this and we'll see you on the next one. (26:59):
undefined

Josh: Sounds good. See you guys soon. Peace. (27:02):
undefined

Music: Music (27:04):
undefined

All Episodes

Episode Transcript

Popular Podcasts

Law & Order: Criminal Justice System - Season 1 & Season 2

Dateline NBC

NFL Daily with Gregg Rosenthal

.css-15opob5{left:0;position:absolute;top:0.8rem;} All Episodes

.css-14f5ked{margin:0;word-break:break-word;display:-webkit-box;-webkit-box-orient:vertical;box-orient:vertical;-webkit-line-clamp:2;overflow:hidden;}OpenAI Just Gave Away Their Secret Formula... For Free?

Episode Transcript

Popular Podcasts

.css-r6mb8g{margin:0;word-break:break-word;display:-webkit-box;-webkit-box-orient:vertical;box-orient:vertical;-webkit-line-clamp:1;overflow:hidden;}Law & Order: Criminal Justice System - Season 1 & Season 2

Dateline NBC

NFL Daily with Gregg Rosenthal

All Episodes

OpenAI Just Gave Away Their Secret Formula... For Free?

Law & Order: Criminal Justice System - Season 1 & Season 2