
August 4, 2025 • 69 mins

In this episode, we chat with Alex Atallah, founder of OpenRouter AI, a platform that aggregates over 400 LLMs. He shares his transition from co-founding OpenSea to leading innovations in AI, addressing fragmentation in the AI model landscape. 

We discuss community engagement, model analytics, and the challenges of open-source vs. closed-source frameworks. 

Join us for insights on the future of AI and how user control can shape technological advancements at OpenRouter!

------
🌌 LIMITLESS HQ: LISTEN & FOLLOW HERE ⬇️
https://limitless.bankless.com/
https://x.com/LimitlessFT

------
TIMESTAMPS

0:00 Intro
2:06 Journey from OpenSea to OpenRouter
5:52 Exploring Frontiers of Technology
7:16 Patterns in New Opportunities
10:06 The Role of Enthusiast Communities
13:13 Early Innovations in AI
15:18 Insights on Model Development
19:17 Understanding OpenRouter’s Functionality
24:13 Choosing the Right Model
27:04 Benchmarking and Performance Metrics
29:27 The Importance of Token Metrics
34:24 Collaborations with Major AI Players
35:20 Open Source vs. Closed Source Models
39:19 Future Trends in Model Adoption
43:06 The Role of Innovation in AI
46:23 Comparing Global AI Talent
50:29 Data Utilization Strategies
57:18 Future of AI Agents
1:01:20 OpenRouter's Vision for the Future
1:04:04 Trends in AI and NFTs

------
RESOURCES

Alex Atallah: https://x.com/xanderatallah

OpenRouter: https://openrouter.ai/

Josh: https://x.com/Josh_Kale

Ejaaz: https://x.com/cryptopunk7213

------
Not financial or tax advice. See our investment disclosures here:
https://www.bankless.com/disclosures


Episode Transcript

Available transcripts are automatically generated. Complete accuracy is not guaranteed.
Ejaaz (00:03): What if I told you there was a single website you could go to where you can chat with any major AI model from one single interface? It's kind of like ChatGPT, but instead every prompt gets routed to the exact AI model that will do the best job for whatever your prompt might be. Well, on today's episode, we're joined by Alex Atallah, the founder and CEO of OpenRouter AI. It's the fastest-growing AI model marketplace, with access to over 400 LLMs, making it the only place that really knows how people use AI models and, more importantly, how they might use them in the future. It's at the intersection of every single prompt that anyone writes and every model that might ever exist. Alex Atallah, welcome to the show. How are you, man?

Alex (00:46): Thanks, guys. Great. Thanks so much for having me on.

Ejaaz (00:48): So it is a Monday. How does the founder of OpenRouter spend his weekend? Presumably, you know, out and about, chilling, relaxing, not at all focused on the company.

Alex (00:58): I love weekends with no meetings planned. I just go to a coffee shop and have tons of hours stacked in a row to do things that require a lot of momentum build-up. So I did that at coffee shops on Saturday and Sunday, and then I watched Blade Runner again.

Ejaaz (01:20): Again? Okay. Well, when we were preparing for this episode, Alex, I couldn't help but think that you've had a pretty insane decade of startup foundership, right? OpenRouter is kind of your second major thing, but prior to that you were a co-founder and the CTO of OpenSea, the biggest NFT marketplace out there. And now you're focused on one of the biggest AI companies out there. So it sounds like you're at the pivot point of two of the most important technology sectors of the last decade. Can you give us a bit of background as to how you ended up here? And, more importantly, where you started. Walk us through the journey of OpenSea and how you ended up at OpenRouter AI.

Alex (02:10): Yeah, so I co-founded OpenSea with Devin Finzer at the very beginning of 2018, the very end of 2017. It was the first NFT marketplace. And it was not dissimilar to OpenRouter, in that there was a really fragmented ecosystem of NFT metadata and media that gets attached to these tokens. It was the first example of something in crypto that could be non-fungible, meaning it's a single thing that can be traded from person to person. Most things in the world are non-fungible: a chair is non-fungible; a currency is fungible. Back in 2018, no one was really thinking about crypto in terms of non-fungible goods. And the problem with non-fungible goods is that there weren't any real standards set up. There were a lot of heterogeneous implementations for how to get a non-fungible item represented and tradable in a decentralized way.

Alex (03:18): So OpenSea organized this very heterogeneous inventory and put it together in one place. We came up with a metadata standard. We did a lot of work to make the experience super good for each collection. And you see a lot of similarities with how AI works today, where there's also a very heterogeneous ecosystem, with a lot of different APIs and different features supported by language model providers. And OpenRouter similarly does a lot of work to organize it all.

Alex (04:06): I was at OpenSea until 2022, when I was feeling the itch to do something new. I left in August, and then ChatGPT came out a few months later. And my biggest question around that time was whether it was going to be a winner-take-all market, because OpenAI was very far ahead of everybody else. You know, we had Cohere Command, we had a couple of open-source models, but OpenAI's was the only really usable one. I was doing little projects to experiment with the GPT-3 API.

Alex (04:48): And then Llama came out in January: really exciting, about a tenth the size, won on a couple of benchmarks, but it wasn't really chattable yet. And it wasn't until a few months later that a team at Stanford distilled it into a new model called Alpaca. Distillation means you take the model and customize it, or fine-tune it, on a set of synthetic data, which they made using ChatGPT as a research project. That was the first successful major distillation that I'm aware of, and it was an actually usable model. I was on the airplane talking to it, like, wow, if it only took six hundred dollars to make something like this, then you don't need ten million dollars to make a model. There might be tens of thousands, hundreds of thousands of models in the future. And suddenly this started to look like a new economic primitive, a new building block, that kind of deserved its own place on the internet. And there wasn't one. There wasn't a place where you could discover new language models and see who uses them and why. And that's how OpenRouter got started.

Josh (06:02): That's amazing. So one of the things that we're obsessed with on this channel in particular is exploring frontiers: how to properly see these frontiers, analyze them, and understand when they're going to happen. And when I was going through your history, you've had this talent consistently over time. Even as far back as early on, I read you were hacking Wi-Fi routers in a hackathon. You were very early to that. You were early to NFTs. You were early to understanding AI and the impact that it would have. And what I'd love for you to explain is the thought process and the indicators you look for when exploring these new frontiers, because clearly there's some sort of pattern matching going on. Clearly you have some sort of awareness of what will be important and why it will be important, and then you insert yourself into that narrative. So are there patterns? Are there certain things that you look for when searching for these new opportunities, that led you to make the decisions you have?

Alex (06:49): I think there's a lot to be said for finding enthusiast communities and seeing if you're going to join them. Like, can you be an enthusiast with them? Whenever something new comes out that has some kind of ecosystem potential, there are going to be enthusiast communities that pop up. And the Internet has made it simple: you can just join the communities.

Alex (07:17): Discord, I think, is an incredible and super underrated platform, because the communities feel kind of private. You don't feel like you're watching somebody advertise something for SEO juice; there's no SEO juice in Discord. It's just people talking about what they're passionate about, and it gets really niche. And when you find an interest group in Discord that has to do with some new piece of technology that's just being developed right now and doesn't really work very well at all, you get people who are just trying to figure out what to do with it and how to make it better. I think that's the first core piece of magic that jumps to mind.

Alex (08:11): There's got to be a willingness to be weird, because if you jump into any of these communities, at face value it's stupid. Like, "oh, this is just a game," or "it's a really weird game, and I'm not really into collectible games, so I'm going to leave right now." Not only do you have to be aware, but you have to be creative. Like, okay, these are just cats on the blockchain, and people are just trading cats back and forth. You can't look at the community as simply that. Think about what you could do with it. What is the unlock that wasn't achievable before?

Alex (08:59): And I think there are people who are good at this, who will join the communities and brainstorm live, and you can see everybody brainstorming in real time. Another incredible example of this was the Midjourney Discord. It became the biggest server on Discord by far. And why did that happen? It started as something weird, silly, maybe not super useful, but you could see all the enthusiasts remixing and brainstorming live how to turn it into something beautiful and how to make it useful. And then it just exploded. I think it's the most incredible niche community Discord has ever seen, because of how useless it started and how insanely exciting it became.

Alex (10:06): So, I mean, I was playing around with this model called Big Sleep in 2021 that let you generate images that looked kind of like DeviantArt, okay? They were all animated images, and none of them really made sense, but you could get some really cool stuff, maybe even something you'd want to make your desktop wallpaper. And if you're really deep in some DeviantArt communities, you know, you'd kind of appreciate it. And that was like, oh, there's a kernel of something here. It took another year or two before Midjourney started to pick up, but that was...

Ejaaz (10:51): Where were you seeing all of this, Alex? Where were you scouring? Just random forums, or just wherever your nose told you to go?

Alex (10:59): Basically, there's this Twitter account, I'm trying to remember what it's called, that posts AI research papers and kind of tries to show what you can do with them. I discovered this Twitter account in like 2021. It wasn't at all related to crypto, but, you know, Big Sleep was the first thing I saw that used AI to generate things that could potentially be NFTs. So I started experimenting with how much you could direct it to make an NFT collection that would make any sense. It was very, very difficult. But that was the first generative...

Ejaaz (11:54): This was before you were even thinking about starting OpenRouter, right?

Alex (12:01): Yeah, yeah, this was back when I was full-time at OpenSea. Oh, I've got it: it's AK, that Twitter account. I really recommend it. They basically post papers and explain and explore how each paper gets useful. They post animations. They make AI research kind of fun to engage with, and that was my first experience.

Ejaaz (12:33): Okay, so that's a massive win for X, or Twitter as it was known back then, as a platform, right? It gave birth to two of the biggest technology communities: crypto, also known as crypto Twitter, and now apparently all the AI research stuff, which kind of put you on the path that led you to OpenRouter.

Ejaaz (12:53): So if I've got this right, Alex: you were full-time at OpenSea, a multi-billion-dollar company with loads of important stuff to do, but you still found the time to scour this fringe technology, because that's what AI was at the time. Prior to GPT-2 or GPT-3, no one really knew about this. And you were playing around with these generative AI models that would create this magical little substance, and maybe it came in the form of a picture or a weird little cat. And you jumped into these niche forums of enthusiasts, as you say, and explored that further.

Ejaaz (13:37): And it sounds like you honed that even further after you left OpenSea. I remember actually meeting you in this kind of abyss between you leaving OpenSea and starting OpenRouter, where you were brainstorming a bunch of these ideas. And I remember a snippet from our conversation, in one of the WeWorks here, where you had whiteboarded a bunch of AI stuff. And one of those things was the whole topic of inference. And if I'm being honest with you, I had no idea what that word even meant back then. I was extremely focused on all the NFT stuff and all the crypto stuff; my background's in all of that. But I just found it fascinating that you always had your nose in some of the early communities, and I think there's a really important lesson there.

Ejaaz (14:21): I want to pick up on something you brought up when you said you discovered your path to OpenRouter, Alex. You said you were playing around with these early AI models, so not the GPTs, before Claude was even created. You were playing around with these random models that you would find on forums, on Twitter, or on Reddit, right? And you would experiment with them. And I find it fascinating that back then, even when GPT became a thing, you were convinced that there would be hundreds of thousands of AI models. Back then, that wasn't a normal view. Back then, everyone was like, you need hundreds of millions of dollars, or maybe it was tens of millions of dollars back then, and it was going to be a rich man's game.

Alex (15:11): Yeah, it was basically the Alpaca project that put me over the edge on there being many, many, many models instead of just a very small number.

Ejaaz (15:24): And can you explain what the Alpaca project is, for the audience?

Alex (15:29): So, the Alpaca project. After Llama came out, you really could not chat with it very well. It was a text-completion model. There were a couple of benchmarks where it beat GPT-3, and it was about a tenth the size of what most people thought GPT-3 was. So it was a pretty incredible achievement, but the user experience wasn't there. The Alpaca project took ChatGPT and generated a bunch of synthetic outputs, and then they fine-tuned Llama on those synthetic outputs.

Alex (16:14): And this did two things to Llama: it taught it style, and it taught it knowledge. The style is how to chat, which was the big user-experience gap. And it made it smarter. Fine-tuning transfers both style and knowledge, and the content of the synthetic data was reflected in the model's performance on benchmarks after that point.

Alex (16:44): So if you can do that without revealing all the data that goes in, now there's a way you could sell data via API, without just dumping all the data out to the world and then never being able to monetize it again. So there's a brand-new business model around data that emerges, and the ability to work towards open intelligence: to build new architectures, test them more quickly, and fine-tune them quickly. Basically, you can build on top of the work of giants; you don't have to start from zero every time. A lot of the biggest developer-experience innovations just involve giving developers a higher stair to start walking up, so they don't have to start at the bottom of the staircase every single time. And that was the big, generous gift that Llama gave the community.

Alex (17:52): And Meta wasn't the only company doing open-source models. Mistral came out with 7B Instruct a few months later; it was an incredible model. Then they came out with the first open-weight mixture of experts a few months after that. It felt like actual intelligence, but completely open. And all of these provide higher and higher stairs for other developers: basically, you crowdsource new ideas from the whole planet and let those new ideas build on top of really good foundations.

Alex (18:37): And when that whole picture started to fall into place, it felt like, okay, this is going to be a huge inventory situation, kind of like NFT collections were a huge inventory situation. Obviously completely different: really different market dynamics, a really different type of goal that buyers have. So a lot of my early experimentation, like a Chrome extension I made called Window AI, and a few other things, was just about learning how the ecosystem works, what makes it different, and what people really want, what developers really want.
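The Alpaca recipe Alex describes has two steps: generate synthetic (instruction, response) pairs from a strong teacher, then fine-tune a smaller student on them. A purely illustrative sketch of that loop, in miniature: the "teacher" below is a canned dictionary standing in for a ChatGPT API call, and the "student" is a toy bigram model standing in for Llama. None of these names or data come from the episode; they are invented for illustration.

```python
# Toy sketch of Alpaca-style distillation:
# 1) a "teacher" produces synthetic (instruction, response) pairs,
# 2) a "student" is fine-tuned on those pairs, absorbing style and content.
from collections import defaultdict
import random

def teacher(instruction: str) -> str:
    # Stand-in for a call to a large model's API (hypothetical data).
    canned = {
        "greet": "hello there friend",
        "farewell": "goodbye and good luck",
    }
    return canned[instruction]

def make_synthetic_dataset(seed_tasks):
    # Step 1: generate synthetic supervision from the teacher.
    return [(task, teacher(task)) for task in seed_tasks]

class BigramStudent:
    # Step 2: a deliberately tiny "student" that learns word transitions
    # from the teacher's responses (style + knowledge transfer, in miniature).
    def __init__(self):
        self.next_words = defaultdict(list)
        self.starts = {}

    def finetune(self, dataset):
        for instruction, response in dataset:
            words = response.split()
            self.starts[instruction] = words[0]
            for a, b in zip(words, words[1:]):
                self.next_words[a].append(b)

    def generate(self, instruction, max_words=10):
        word = self.starts[instruction]
        out = [word]
        while word in self.next_words and len(out) < max_words:
            word = random.choice(self.next_words[word])
            out.append(word)
        return " ".join(out)

dataset = make_synthetic_dataset(["greet", "farewell"])
student = BigramStudent()
student.finetune(dataset)
print(student.generate("greet"))  # -> "hello there friend"
```

The point of the sketch is the shape of the pipeline, not the model: the student never sees the teacher's weights or training data, only its outputs, which is exactly why Alex notes this enables selling data (or intelligence) via API without dumping it to the world.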

Josh (19:17): So that leads us to OpenRouter itself, right? I want you to help explain to the listeners who aren't familiar with OpenRouter what it does. Because I think for a lot of people, the way they interact with an AI is they send a prompt to their model of choice. They use ChatGPT, or they use the Grok app, or they're on Gemini, and they kind of live in these siloed worlds. And the next step up from those people are the ones who use it professionally, who are developers. They're interacting with APIs; maybe they're not interfacing with the actual UI, but they're calling a single model. And OpenRouter kind of exists on top of this, right? Can you walk us through how it works and why so many people love using OpenRouter?

Alex (19:50): OpenRouter is an aggregator and marketplace for large language models. You can kind of think of it as a Stripe-meets-Cloudflare for LLMs. It's a single pane of glass: you can orchestrate, discover, and optimize all of your intelligence needs in one place. One billing provider gets you all the models; there are 470-plus now. All the models sort of implement features, but they do it differently, and there are also a lot of "intelligence brownouts," as Andrej Karpathy calls them, where models just go down all the time, even the top models like Anthropic's and Gemini and OpenAI's. So: developers need a lot of choice. CTOs need a lot of reliability. CFOs need predictable costs. CISOs need complex policy controls. All of these are inputs to what we do, which is build a single pane of glass that makes models more reliable, lowers costs, gives you more choice, and helps you choose between all the options for where to source your intelligence.

Josh (21:17): How does it work? Because, as Ejaaz and I frequently discuss on the show, there are benchmarks, right, where a certain model is the best at coding, and that implies you should maybe go to that model for all of your coding needs, because it's the best at it. But it would appear that's not quite true if you're routing through a lot of different providers. So how do you decide which provider gets routed to, and when, and how to get the best result for what you're asking?

Alex (21:41): So we've taken a different approach so far. Instead of focusing on a production router that picks the model for you, we try to help you choose the model. We create lots of analytics, both on your account and on our rankings page, to help you browse and discover the models that power users are really using successfully on a certain type of workload. Because we think developers today primarily want to choose the model themselves; switching between model families can result in very unpredictable behavior.

Alex (22:25): But once you've chosen your model, we try to help developers not need to think about the provider. There are sometimes dozens of providers for a given model: all kinds of companies, including the hyperscalers, like AWS, Google Vertex, and Azure; scaling startups like Together, Fireworks, and DeepInfra; and a long tail of providers that offer very unique features or very exceptional performance. There are all kinds of differentiators for them. So what we do is collect them all in one place. If you want a feature, you just get the providers that support it. If you want performance, you get prioritized to the providers that have high performance. If you're really cost-sensitive, you get prioritized to the providers that are really low-cost today. We basically create all these lanes. There are innumerable ways you could get routed, but you're in full control of the overall user experience that you're aiming for. That's what we found was missing from the whole ecosystem: just a way of doing that.

Alex (23:47): And we get, on average, a five to ten percent uptime boost over going to providers directly, just by load balancing and sending you to the top provider that's up and able to handle your request. We really focus hard on efficiency and performance. We only add about 20 to 25 milliseconds of latency on top of your request, and it all gets deployed very close to your servers, at the edge. We stack providers, figure out what you can benefit from that everybody else is doing, and give you the power of big data as a developer accessing your model of choice.
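The routing lanes Alex describes (filter by required features, prioritize by performance or price, then fall back to the next provider that is up) can be sketched roughly as below. This is my own illustration of the policy as stated, not OpenRouter's actual code; the provider names, prices, and throughput numbers are invented.

```python
# Minimal sketch of feature-filtered, preference-ordered provider routing
# with uptime fallback, per the policy described in the conversation.
from dataclasses import dataclass, field

@dataclass
class Provider:
    name: str
    price_per_mtok: float   # USD per million tokens (illustrative numbers)
    avg_throughput: float   # tokens/sec observed across recent traffic
    features: set = field(default_factory=set)
    up: bool = True

def route(providers, require=frozenset(), prefer="throughput"):
    # Keep only providers supporting every required feature.
    eligible = [p for p in providers if require <= p.features]
    # Order the survivors by the caller's stated preference.
    key = (lambda p: -p.avg_throughput) if prefer == "throughput" \
          else (lambda p: p.price_per_mtok)
    # Fallback: the first provider in that order that is actually up wins.
    for p in sorted(eligible, key=key):
        if p.up:
            return p.name
    raise RuntimeError("no provider available")

providers = [
    Provider("alpha", 0.50, 120, {"tools", "json"}),
    Provider("beta",  0.20,  80, {"json"}),
    Provider("gamma", 0.90, 200, {"tools", "json"}, up=False),
]

print(route(providers, require={"tools"}))   # "alpha": gamma is faster but down
print(route(providers, prefer="price"))      # "beta": cheapest provider that's up
```

The uptime boost Alex cites comes from exactly this fallback step: a request never dies with the first provider when another one lower in the lane can still serve it.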

Josh (24:44): So it kind of allows you to harness the collective knowledge of everybody, right? You get all of the data, you have all of the queries, you know which yields the best result, and you're able to deliver the best product. Now, in terms of actual LLMs, Ejaaz has pulled up a leaderboard just before. And I'm interested in how you guys think about LLMs: which are the best, how to benchmark them, and how you route people through them. Do you believe that benchmarks are accurate, and do you reflect that in the way you route traffic through these models?

Alex: In general, we have taken the stance that we want to be the capitalist benchmark for models: what is actually happening? And part of this is that I really think both the law of large numbers and the enthusiasm of power users are really valuable for everybody else. (25:13):

Alex: Say you're routing to Claude 4 and you're based in Europe. All of a sudden there might be a huge variance in throughput from one of the providers, and you're only able to detect that if some other users have discovered it before you. So we route around the provider that's running slow in Europe and send you, if your data policies allow it, to a much faster provider somewhere else. That allows you to get faster performance. That's how large numbers help on the provider level. (25:42):

Alex: On the model selection level, which is what you see on this rankings page, power users discover new models instantly. We put up a new model today from a new model lab called ZAI, and the power users found it right away. We have this LLM enthusiast community that dives in and really figures out what a model is good for along a bunch of core use cases. The power users figure out which workloads are interesting, and then you can just see in the data what they're doing. Everybody can benefit from it. That's why we open up our data and share it for free on the rankings page here. (26:26):

Ejaaz: I'm seeing one consistent unit across all these rankings, Alex, which is tokens, right? Josh and I have spoken about this on the show before, but I'm wondering how you've chosen this specific unit to measure how good or effective these models are, or how heavily they're consumed and used. Can you tell us a bit more about why you picked this particular unit and what it tells you, as the OpenRouter platform, about how a user is using a particular model? (27:11):

Alex: Yeah, I think dollars is a good metric too. The reason we chose tokens is primarily because we were seeing prices come down really quickly. OpenRouter has been around since the beginning of 2023, and I didn't want a model to be penalized in the rankings just because its prices are going down really dramatically. (27:41):

Alex: There's a paradox called Jevons paradox: when the price of some component of infrastructure decreases, say 10x, users' consumption of it tends to increase by more than 10x. But that can come with a lot of lag, so maybe usage hasn't grown 10x yet, and in the meantime a dollar-based ranking would punish the model. (28:17):

Alex: I thought there were some other advantages to using tokens, too. Tokens don't have this penalty and don't rely on Jevons paradox catching up. They're also a bit of a proxy for time: a model that is generating a lot of tokens, and doing so for a while across a lot of users, means a lot of people are reading those tokens and actually doing something with them. (28:39):

Alex: The same goes for input. If I really want to send an enormous number of documents and the model has really, really tiny prompt pricing, I think that's still valuable and something we want to see. A model processing an enormous number of documents is a use case that should show up in the rankings. So we decided to go with tokens. We might add dollars in the future, but tokens don't have this Jevons paradox lag. (29:06):

Alex: And there wasn't anything else; nobody was doing any kind of overall analytics. We didn't see any other company even do it until Google did a few months ago, when they started publishing the total amount of tokens processed by Gemini. So we'll see which use cases really need dollars, but tokens have been holding up pretty well. (29:42):
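A toy example of the ranking effect he's describing, with entirely hypothetical numbers: when a model's price drops sharply but demand lags behind, a dollar-denominated ranking demotes it while a token-denominated ranking holds steady.

```python
# Toy illustration of token vs dollar rankings (all numbers hypothetical).
def rank(usage, metric):
    """Return model names sorted by the chosen metric, descending."""
    return sorted(usage, key=lambda m: -usage[m][metric])

# Month 1: model_a charges $1.00 per unit of tokens, model_b charges $2.00.
usage = {
    "model_a": {"tokens": 500, "dollars": 500 * 1.00},
    "model_b": {"tokens": 400, "dollars": 400 * 2.00},
}
print(rank(usage, "dollars"))  # → ['model_b', 'model_a'] (spend-based)
print(rank(usage, "tokens"))   # → ['model_a', 'model_b'] (usage-based)

# Month 2: model_b's price drops 10x; demand hasn't caught up yet (Jevons lag).
usage["model_b"]["dollars"] = 400 * 0.20
print(rank(usage, "dollars"))  # → ['model_a', 'model_b'] (price cut penalized)
print(rank(usage, "tokens"))   # → ['model_a', 'model_b'] (unchanged by price)
```

The dollar ranking swings purely because of the price cut, even though real usage is identical, which is the penalty the token metric avoids.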

Ejaaz: Yeah, I mean, this dashboard is awesome, and I recommend anyone listening who can't see our screen to get on OpenRouter's website and check it out. I've been following it pretty rigorously for the last two weeks, Alex, and what I love is you can literally see the trends. (30:09):

Ejaaz: Two weeks ago Grok 4 got released, and Josh and I were making a ton of videos on it; we were using it for pretty much everything we could. Then this other model came out of China a few days after, called Kimi K2, and I thought, oh, whatever, this is just some random Chinese model, I'm not going to focus on it. But I kept seeing it in my feed, so I thought, okay, maybe I'll give this a go, and I went straight to OpenRouter to gauge the interest from a wider set of AI users. And I saw that it was skyrocketing, right? (30:28):

Ejaaz: And then Qwen dropped their models last week, and again I came to OpenRouter and it preceded the trend: people had already started using them. So I love how you describe OpenRouter as this kind of prophetic orb, where the enthusiasts and the community itself can front-run very popular trends. I think that's a very powerful moat. (31:03):

Ejaaz: And on this path, Alex, I noticed that a lot of these major model providers see the value in this, right? If I'm not mistaken, OpenAI used your platform to secretly launch their frontier model before they officially launched it. Can you walk us through how that comes about and, more importantly, why they wanted to do that and why they chose OpenRouter to do it? (31:26):

Alex: OpenAI will sometimes give early access to their models to some of their customers for testing, and we asked them if they wanted to try a stealth model with us, which we had never done before. It involved launching it under another name and seeing how users respond to it, without any bias or inclination for or against the model at the onset. It would be a new way of testing it; it was an experiment for both us and them. And they generously decided to take the leap of faith and try it. (31:54):

Alex: We launched GPT-4.1 with them, and we called it Quasar Alpha. It was a million-token context length model, OpenAI's first very long context model, and it was also optimized for coding. (32:48):

Alex: A couple of incredible things happened. First, we have this community of benchmarkers that run open-source benchmarks, and we give a lot of them grants of OpenRouter tokens to help fund the benchmarks. They'll just run their suite of tests against all the models, and some of them are very creative. There's one that tests the ability to generate fiction. There's one called MCBench that tests whether a model can build a 3D project in Minecraft. There are a few that test different types of coding proficiency. There's one that just focuses on how good a model is at Ruby, because it turns out a lot of the models are not great at Ruby; there are a lot of languages that all the models are pretty bad at. So we have this long tail of very niche benchmarks. (33:11):

Alex: And all the benchmarkers ran their benchmarks on Quasar Alpha, for free, and found pretty incredible results on most of them. (34:04):
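A community-style benchmark run of the kind he describes, a set of niche graders applied across several models with scores tabulated, might look like the sketch below. Everything here is a stand-in: the model outputs are canned strings and the graders are toys, not the real fiction, MCBench, or Ruby benchmarks.

```python
# Minimal benchmark-harness sketch: run a suite of task graders over several
# "models" and tabulate scores. Models and graders are stand-ins for the
# real community benchmarks mentioned in the episode.
def fake_model(name):
    # Stand-in for an API call; returns canned text per (model, task).
    canned = {
        ("quasar-alpha", "ruby"): "def add(a, b) = a + b",
        ("quasar-alpha", "fiction"): "The lighthouse keeper counted storms.",
        ("baseline", "ruby"): "I cannot write Ruby.",
        ("baseline", "fiction"): "Once upon a time.",
    }
    return lambda task: canned[(name, task)]

def score_ruby(output):      # toy grader: does the output look like Ruby code?
    return 1.0 if "def " in output else 0.0

def score_fiction(output):   # toy grader: rewards longer prose, capped at 1.0
    return min(len(output.split()) / 5, 1.0)

SUITE = {"ruby": score_ruby, "fiction": score_fiction}

def run_suite(models):
    """Score every model on every task in the suite."""
    results = {}
    for name in models:
        model = fake_model(name)
        results[name] = {task: grader(model(task))
                         for task, grader in SUITE.items()}
    return results

print(run_suite(["quasar-alpha", "baseline"]))
```

The long tail of niche benchmarks is just more entries in `SUITE`; each grader only needs to map a model's raw output to a score.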

Alex: So OpenAI got this feedback in real time; we kind of helped them find it. They made another snapshot, which we launched as Optimus Alpha, and they could compare the feedback they got from the two snapshots. Then, two weeks later, they launched GPT-4.1 live for everybody. So it was an experiment for us, and we've done it again since, with another model provider that's still working on theirs. It's a cool way of crowdsourcing benchmarks you wouldn't have expected, and also of getting unbiased community sentiment. (34:16):

Josh: That's great. So now when we see a new model pop up and we want to test GPT-5, we know where to come to try it early. We'll see, because rumor is it's coming soon, so OpenRouter is on our watch list. (35:09):

Josh: But I do want to ask you about open source versus closed source, because this has been an important topic for us; we talk about it a lot, and you have a ton of data on it. Looking at the leaderboards, there are open-source models doing very well, and closed-source too. What are your takes in general? How do you feel about open-source versus closed-source models, particularly around how you serve them to users? (35:20):

Alex: Both types of models have supply problems, but the supply problems are very different. Typically, what we see with closed-source models is that there are very few suppliers, usually just one or two. With Grok, for example, there's Grok direct and there's Azure. With Anthropic, there's Anthropic direct, Google Vertex, and AWS Bedrock. We also deploy in different regions; we have an EU deployment for customers who only want their data to stay in the EU, and we do custom deployments for the closed-source models too, to guarantee good throughput and high rate limits for people. (35:40):

Alex: The tricky part is the demand: the closed-source models are doing most of the tokens on OpenRouter. It's dominant; it's probably 70 to 80 percent closed-source tokens today. But the open-source models have a much more fragmented sell-side order book, and the rate limits for each provider are less stable on average. It usually takes a while for the hyperscalers to serve a new open-source model. So the load-balancing work we do on open-source models tends to be a lot more valuable. (36:29):

Alex: The load-balancing work we do for closed-source models tends to be very focused on caching and feature awareness: making sure you're getting clean cache hits and only transitioning over to new providers when your cache has expired. For open-source models there's way less caching; very few open-source providers implement it, so switching between providers becomes more common. (37:24):

Alex: We also track a lot of quality differences between the open-source providers. Some of them will deploy at lower quantization levels, which is kind of a way of compressing the model. Generally that doesn't have an impact on the quality of the output, and yet we still see some odd things from some of the open-source providers. So we run tests internally to detect those outputs, and we're building up a lot more muscle here soon, so that they get pulled out of the routing lane and don't affect anyone. (37:52):
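The cache-aware behavior he describes for closed-source models, staying pinned to the provider that holds your prompt cache until it expires or the provider goes down, and only then failing over, can be sketched like this. This is assumed behavior for illustration, not OpenRouter's internals; provider names and the TTL are placeholders.

```python
# Sketch (assumed behavior, not OpenRouter internals) of cache-aware failover:
# keep routing to the provider holding the caller's prompt cache until the
# cache window expires or the provider goes down.
import time

class CacheAwareRouter:
    def __init__(self, providers, ttl_seconds=300):
        self.providers = providers   # ordered by preference
        self.ttl = ttl_seconds       # assumed prompt-cache lifetime
        self.pinned = None           # (provider, last_hit_time)

    def pick(self, up, now=None):
        """Choose a provider from the set that is currently up."""
        now = time.time() if now is None else now
        if self.pinned:
            provider, last_hit = self.pinned
            if provider in up and now - last_hit < self.ttl:
                self.pinned = (provider, now)  # refresh the cache window
                return provider                # clean cache hit
        for provider in self.providers:        # cache expired or provider down
            if provider in up:
                self.pinned = (provider, now)
                return provider
        raise RuntimeError("no provider up")

r = CacheAwareRouter(["anthropic", "bedrock", "vertex"], ttl_seconds=300)
print(r.pick(up={"anthropic", "bedrock"}, now=0))    # → anthropic (pinned)
print(r.pick(up={"anthropic", "bedrock"}, now=100))  # → anthropic (cache hit)
print(r.pick(up={"bedrock"}, now=150))               # → bedrock (failed over)
```

For open-source models with little caching, the pinning step matters less, so the router can switch providers freely, which matches the more aggressive load balancing he describes on that side.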

Josh: So closed source accounts for 80 percent or something like that, a very large amount. Do you see that changing? Because in that post we just had, nine out of the ten fastest-growing LLMs last week were open source. And every time China comes out with another model, it was Kimi K2 a week or two ago, it really pushes the frontier of open source forward. (38:33):

Josh: The rate of acceleration of open source seems to be as fast, if not faster, than closed source; it's making these improvements very quickly, and it has the benefit of being able to compound in speed because it's open source and everyone can contribute. Do you think that starts to change the percentage of tokens you're serving from open-source models versus closed-source? Or do you continue to see a trend where it's Google and OpenAI serving the majority of these tokens to users? (38:54):

Alex: In the short term, we're likely to see open-source models continue to dominate the fastest-growing model category on OpenRouter. The reason is that a lot of users come for a closed-source model but then decide they want to optimize later: either they want to save on costs, or they want to try a new model that's supposed to be a little better in some direction their app or use case cares about. Then they leave the closed-source model and go to an open-source model. (39:20):

Alex: So open source tends to be a last-mile optimization thing (I'm making a big generalization, because the reverse can happen too). And because it's a last-mile optimization thing, the jump from "this model is not being used at all" to "this model is really being used by a couple of people who have left Claude 4 and want to try some new coding use case" will be bigger than for the closed-source models, which start at a really high base and don't have growth quite as dramatic. (39:58):

Alex: The other part of your question, though, was whether there's going to be a flippening of... (40:36):

Josh: Close or some sort of like chipping it away at that monopoly of close source tokens. (40:43):
undefined

Alex: It's hard to predict these things, because I think the biggest problem today with open-source models is that the incentives are not as strong for the model lab and the model provider. There are established incentives for how to grow as a company and attract high-quality AI talent, and giving the model weights away impairs those incentives. (40:49):

Alex: This is where we might see decentralized providers helping in the future: a really good incentive scheme that allows high-quality talent to work on a model that remains open-weights, at least, could fix this. I try to stay close to the decentralized providers and learn a lot from them. On the provider side, on running inference, I think there are some really cool incentive schemes being worked on. But on actually developing the models themselves, I haven't seen too much, unfortunately. So if we see one, a flippening is on the radar; until we do, I personally doubt it. (41:27):

Josh: TBD. Do you have personal takes on open source versus closed source? This has been a huge topic we've been debating too, the ethical concerns around alignment in closed-source models versus open source. When you look at the competitors, China is generally associated with open source, whereas the United States is generally associated with closed source. We saw Meta release the open-source Llama models, but now they're raising a ton of money and paying a lot of employees a lot of money, probably to develop a closed-source model. So the trends seem split between the US and China, and I'm curious if you have any personal takes, even outside of OpenRouter, on which approach serves better for the long-term outlook, whether for the position of the United States or just the general safety and alignment conversation around AI. (42:20):

Alex: A very simple, fundamental difference between the two is that an innovation in open-source models can be copied more quickly than an innovation in closed-source models. In terms of velocity, and how far ahead one is over the other, that is a massive structural difference. It means that closed-source models should theoretically always be ahead, until a really interesting incentive scheme develops, like I mentioned before. And I don't see evidence that that's going to change. (43:06):

Alex: In terms of China versus the US, I think it's very interesting that China has not had a major closed-source model, and I'm not aware of any reason that won't change in the future. My prediction is that there's going to be a closed-source model from China. It's possible that DeepSeek and Moonshot and Qwen have built up really sticky talent pools, but generally with talent pools, after enough years have passed, people quit and go create new companies and build new talent pools. So we should see some of that. The AI space doesn't have the NDAs or non-competes that the hedge fund space has. That might change in the future too, but assuming the current non-compete culture continues, there should be more companies popping up in China over time. I'm betting that some of them will be closed source, and my guess is that the two nations will start to look more similar. (43:52):

Ejaaz: Yeah, I guess that's why you have Zuck dishing out $300 million to billion-dollar salary offers to a bunch of these guys, right? One more question on China versus the US. I kind of agree with you; I didn't really expect China to be the one to lead open source in anything, let alone the most important technology of our time. What do you think is their secret sauce for building these models, Alex? (45:18):

Ejaaz: I know this might be outside the forte of OpenRouter specifically, but as someone who has studied this technology for a while now, I'm struggling to figure out what advantage they had. They're discovering all these new techniques, and maybe the simple answer is constraints, right? They don't have access to all of NVIDIA's chips, they don't have access to infinite compute, so maybe they're forced to figure out other ways around the same kinds of problems that Western companies are focused on. But it's pretty clear that America, with all its funding, hasn't been able to make these frontier breakthroughs. So I'm curious whether you're aware of some kind of technical moat that Chinese AI researchers, or these AI teams featuring on OpenRouter day in and day out, have over the US. (45:44):

Alex: Well, I don't know. There are certainly some techniques they've come up with; DeepSeek had a lot of very cool inference innovations that they published in their paper. But a lot of what they published in the original R1 paper were things that OpenAI had done independently themselves many months before. (46:41):

Alex: On the inference side, and on some of the model side, we had talked to the DeepSeek team for years before R1 came out. They had many models before that, and they were always a pretty sharp, optimized team for doing inference. They came up with the best user experience for caching prompts long before DeepSeek R1 came out, and they had very good pricing. They were by far the strongest Chinese team we were aware of, well before that happened. So I'm guessing there was some talent accumulation going on in China, among people who wanted to stay in China, and that's a huge advantage; American companies are obviously not doing that. (47:07):

Alex: Ejaaz is very on point that a lot of this is just based on talent. A lot of AI is open and out there, and very composable, like a big tree of knowledge. A paper comes out and it cites twenty other papers, and you can go read all of the cited papers, and then you kind of have a basis for understanding the paper. But you really have to go one level deeper and read all the cited papers two levels down to really understand what's going on, and very few people can do that. It takes a lot of years of experience to actually apply that knowledge and learn all the things that have not been written in any paper at all. There's just such a small number of people who can really lead research on all the different dimensions that go into making a model. (48:02):

Alex: And the border between China and the US is pretty defined: you have to leave China, move to the US, and really establish yourself here. So I do think there's country arbitrage, there's the hedge-fund-background arbitrage, and there's hardware arbitrage; there's a ton of hardware that's only available in China but not here, and vice versa, and that creates an opportunity. This will just continue to happen. (49:03):

Ejaaz: Yeah, I think this arbitrage is fascinating. I read somewhere that there are probably fewer than 200 or 250 researchers in the world worthy of working at some of these frontier AI model labs. And I looked into the backgrounds of the team behind Kimi K2, this recent open-source model out of China that broke all these crazy rankings; I think it was a trillion-parameter model or something crazy like that. A lot of them worked at some of the top American tech companies, and they all graduated from this one university in China, Tsinghua, which apparently is the Harvard of AI in China, right? So pretty crazy. (49:33):

Ejaaz: But Alex, I wanted to shift the focus of the conversation to a point you brought up earlier in this episode, which is data. Here's the context: Josh and I have spoken about this at length. We are obsessed with a feature on OpenAI's ChatGPT, which is memory, and I know a lot of the other AI models have memory as well. The reason we love it so much is that I feel like the model knows me, Alex. It knows everything about me, it can personally curate any of my prompts, it just gets me. It knows what I want and serves it up to me on a platter, and off I go, doing my thing. (50:19):

Ejaaz: Now, OpenRouter sits on top of the query layer, right? You have all these people writing all these weird and wonderful prompts and routing them through to different AI models. You hold all of that data, or maybe you have access to all of that data, and I know you have something called private chat as well, where you don't. Talk to me about what OpenRouter is thinking about doing with this data, because presumably, in my opinion, you guys actually have the best moat, arguably better than ChatGPT, because you have all these different types of prompts coming from all these different types of users for all these different types of models. So theoretically, you could spin up some of the most personal AI models for each individual user if you wanted to. Do I have that correct, or am I speaking crazy? (51:02):

Alex: No, that's true, and it's something we're thinking about. By default, your prompts are not logged at all: we don't store prompts or completions for new users unless you toggle logging on in settings. But a lot of people do toggle it on, and as a result I think we have by far the largest multi-model prompt dataset. So far, though, we've barely done anything with it. We classify a tiny, tiny subset of it, and that's what you see on the rankings page. (51:54):

Alex: What could be done at a per-account level comes down to three main things. One is memory, right out of the box. You can get this today by combining OpenRouter with a memory-as-a-service product; a couple of companies do this, like Mem0 and SuperMemory. We could partner with one of those companies, or build something similar, and provide a lot of distribution. That basically gets you ChatGPT-as-a-service, where it feels like the model really knows you and the right context gets added to your prompt. (52:37):
Alex: The second thing we can do is help you select the right model more intelligently. For a lot of models there's a super clear migration decision that needs to be made, and we can see it very clearly in the data. Right now, if we have some kind of communication channel open with a customer, we can just tell them: hey, we know you're using this model a ton, it's been deprecated, and this other model is significantly better; you should move this kind of workload over to it, or you'll get way better pricing if you do. That's basically the only guidance and opinionated routing we've done so far, and it could be a lot more intelligent, a lot more out of the box, a lot more built into the product. (53:16):

Alex: The last thing we can do, and there are probably tons of things we're not even thinking about, is get really smart about how models and providers are responding to prompts, and show you the coolest data: what kinds of prompts are going to which models, how those models are replying, and characterizations of the replies in all kinds of interesting ways. Did the model refuse to answer; what's the refusal rate? Did the model successfully make a tool call, or did it decide to ignore all the tools you passed in? That's a huge one. Did the model pay attention to its context? Did some kind of truncation happen before you sent it to the model? There are all kinds of edge cases like these that cause developers' apps to just get dumber, and they're all detectable. (54:13):
Ejaaz: I'm so happy you said that, because I have this hot take, or maybe not-so-hot take: I actually think all the frontier models right now are good enough to do the craziest stuff ever for each user, but we haven't been able to unlock it because they just don't have the context. Sure, you can attach a model to a bunch of tools, but if it doesn't know when to use a tool or how to process a certain prompt, or if users themselves don't know how to read the model's output, then without the kind of analytics you just described we're all walking around like headless chickens. So I'm really happy you said that. (55:25):

Ejaaz: One other thing I wanted your take on, on the data side: this whole notion of AI agents is becoming such a big trend, Alex. I've noticed a lot of frontier labs releasing models that spin up several instances of themselves, each tasked with a specific role: you do the research, you do the orchestrating, you look online via a browser, and so on. Then they coalesce at the end of that little search, refine their answer, and present it. Grok 4 does this, Claude does this, and a few other models too. I feel like with the data you're describing, OpenRouter could offer that as a feature: super intuitive, context-rich agents that can do a lot more than just talk to you or answer your prompts; they could do a bunch of other actions for you. Is that a fair take, or is that out of OpenRouter's realm? (56:07):

Alex: Our strategy is to be the best inference layer for agents. What I think developers want is control over how their agents work. Our developers want to use us as a single pane of glass for doing inference, but they want to see and control what an agent looks like. An agent is basically just something that does inference in a loop and controls the direction it goes. (57:18):

Alex: So what we want to do is build incredible docs and really good primitives that make that easy. A lot of our developers are people building agents, and what they want is for the primitives to be solved so they can keep creating new versions and new ideas without worrying about re-implementing tool calling over and over again. It's a tough problem given how many models there are; there's a new model or provider every day, and people actually want them and use them. Standardizing this and making these tools really dependable, so that agent developers don't have to worry about it, is where we want to focus. (57:57):
Josh: As we level up closer and closer to AGI and beyond, I'm curious what OpenRouter's endgame is, if you have one. What's the master plan; where do you hope to end up? The assumption is that as these systems get more intelligent, as they're able to make their own decisions and choose their own tool sets, what role does OpenRouter play in continuing to route that data through? Do you have a grand vision of where you see all this heading? (58:54):

Alex: You're saying: as agents get better at choosing the tools they use, what becomes our role once agents are really good at that? (59:19):

Josh: Yes, and where do you see OpenRouter fitting into the picture? What would be the best-case scenario for this future of OpenRouter? (59:31):

Alex: Right now, OpenRouter is a bring-your-own-tool platform; we don't have a marketplace of MCPs yet. And I do think most of the most-used tools will be ones that developers configure themselves, with agents simply given access to them. (59:38):

Alex: My prediction for how the ecosystem is going to evolve is that all the models are going to add state and other kinds of stickiness that make you want to stick with them. They're going to add server-side tool calls, stateful web search, memory, all kinds of things that try to prevent developers from leaving and increase lock-in. (01:00:07):

Alex: OpenRouter is doing the opposite. We want developers to not feel vendor lock-in. We want them to feel like they have choice and can use the best intelligence, even if they didn't before: it's never too late to switch to a more intelligent model. That would be a good always-on outcome for us. So what I think we'll end up doing is partnering with other companies, or building the tools ourselves if we have to, so that developers don't feel stuck. There are a lot of ways the ecosystem could evolve, but that's how I'd put it in a nutshell. (01:00:42):
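The no-lock-in idea is visible in the API surface: because a router normalizes providers behind one OpenAI-style schema, switching models is a one-string change, and a request can carry a fallback list. OpenRouter does support a `models` fallback array, though the exact slugs and field shape below should be treated as illustrative:

```python
# Sketch: the same OpenAI-style request body, retargeted across
# providers by changing one string, with optional fallback routing.

def make_request(prompt: str, model: str, fallbacks=None) -> dict:
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    if fallbacks:
        # OpenRouter-style fallback routing: try these slugs in order
        # if the primary model fails (shape is illustrative).
        body["models"] = [model, *fallbacks]
    return body

# Same workload, two different primary models: a one-string migration.
req_a = make_request("Summarize this contract.", "anthropic/claude-sonnet-4")
req_b = make_request("Summarize this contract.", "deepseek/deepseek-chat",
                     fallbacks=["openai/gpt-4o-mini"])
```

This is the mechanical counterpart of "it's never too late to switch": the messages, tools, and app code stay fixed while the model slug moves.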

Josh: Okay, now there's another personal question I was really curious about, because I was right there with you in the crypto cycle when NFTs got absolutely huge; I was a big user of OpenSea. It was this trend that went up and then went down: NFTs kind of fizzled out, they weren't as hot anymore, and AI took the wind from their sails. It's a completely separate audience, but a similar thing, where now it's the hottest thing in the world. I'm curious how you see the trend continuing. Is this a cyclical thing with ups and downs, or a one-way trajectory of more tokens every day, more AI every day, up and to the right? (01:01:25):

Alex: NFTs kind of follow crypto in an indirect way: when crypto has ups and downs, NFTs generally lag a bit, but they have similar ups and downs. And crypto is an extremely long-term play on building a new financial system; there are so many very entrenched reasons that's not going to happen overnight. Whereas in AI, there are some overnight business transformations going on. (01:02:06):

Alex: One of the reasons AI moves a lot faster is that it's just about making computers behave more like humans. So if a company already works with a bunch of humans, there's some engineering that needs to be done, some thinking about how to scale it, but in general, after seeing what's possible, I think inference will be the fastest-growing operating expense for all companies. It'll be: oh, we can just hire high-performing employees at the click of a button, and they work 24/7 and scale elastically. It's not a huge mental-model shift; in most cases it's just a huge upgrade to the way companies work today. So it's completely different from crypto and NFTs: other than both being new, they're fundamentally very different changes. (01:02:49):

Ejaaz: You're probably one of very few people in the world right now with crazy insight into every single AI model, definitely more than the average user. I have three or four subscriptions right now and think I'm a hotshot; you get access to, what is it, 457 models on OpenRouter. So an obvious question for you: I'm not going to say over the next couple of years, because everything moves way too quickly in this sector, but over the next six months, is there anything really obvious to you that should be focused on within AI? Maybe it's the way certain models should be designed, or perhaps it's something at the application layer that no one's talking about right now. Going back to the earlier part of our conversation, you pick these trends out really early, and I'm wondering if you see anything. It doesn't have to be OpenRouter-related; it could just be AI-related. (01:04:04):
Alex: I've seen the models trending toward caring more about how resourceful they are than about what knowledge they have in the bank. I don't know how many of the model labs really deeply believe that, but a couple of them talk about it, and I don't think it's really hit the application space yet, because people will ask ChatGPT things and, if the knowledge is wrong, they think the model is stupid. That's a bad way of evaluating a model: whatever knowledge a person happens to have is not a proxy for how smart they are. (01:05:03):

Alex: The intelligence and usefulness of a model is going to trend toward how good it is at using tools, and how good it is at paying attention to a long, long context: its total memory capacity and accuracy. Those two things need to be emphasized more. It may be that models end up pulling all of their knowledge from online databases, from real-time scraped indices of the web, along with a ton of real-time-updating data sources, always relying on some sort of database for knowledge but on their reasoning process for tool calling. We spend probably the plurality of our time every week on tool calling and figuring out how to make it work really well. (01:05:55):

Alex: The big difference between humans and animals is that we're tool users and tool builders; that's where human acceleration and innovation has happened. So how do we get models creating tools and using tools very, very effectively? There are very few benchmarks and very little priority there. There's Tau-bench for measuring how good a model is at tool calling, and maybe a few others. There's SWE-bench for measuring how good a model is at multi-turn programming tasks, but it's very, very hard to run; for Sonnet it could cost something like $1,000. The user experience for evaluating the real intelligence of these models is not good. As much as we don't have benchmarks listed on OpenRouter today, I love benchmarks, and I think the app and developer ecosystems should spend a lot more time making very cool and interesting ones. We'll give credit grants for the best ones, so I highly encourage it. (01:06:59):
Ejaaz: Well, Alex, thank you for your time today; I think we're coming up on a close now. That was a fascinating conversation, man. Your entire journey, from non-AI work at OpenSea all the way to OpenRouter, has been a great indicator of where these technologies are progressing and, more importantly, where we're going to end up. I'm incredibly excited to see where OpenRouter goes beyond just prompt routing; some of the things you talked about on the data side are going to be fascinating and arguably one of your bigger features, so I'm excited for future releases. And as Josh said earlier, if GPT-5 is releasing through your platform first, please give us some credits; we would love to use it. (01:08:17):

Ejaaz: For the listeners of this show: as you know, we're trying to bring on the most interesting people to chat about AI and frontier tech, and we hope you enjoyed this episode. As always, please like, subscribe, and share it with any of your friends who would find it interesting. We'll see you on the next one. Thanks, folks. (01:09:03):