
January 30, 2025 · 47 mins


In this jam-packed episode of Sidecar Sync, Amith and Mallory break down two of the biggest AI stories of the moment: DeepSeek’s R1 model and OpenAI’s new Operator agent. R1 has been making waves with its impressive reasoning abilities, low training costs, and major impact on AI stocks—leaving industry giants scrambling. Meanwhile, OpenAI’s Operator brings a new level of automation, allowing AI to browse the web and complete tasks like a human. What does this mean for associations and AI adoption? Tune in as we explore the implications, risks, and opportunities these developments bring to the table.

🔎 Check out Sidecar's AI Learning Hub and get your Association AI Professional (AAiP) certification:
https://learn.sidecar.ai

📕 Download ‘Ascend 2nd Edition: Unlocking the Power of AI for Associations’ for FREE
https://sidecar.ai/ai

📅 Find out more about digitalNow 2025 and register now:
https://digitalnow.sidecar.ai/ 

🛠 AI Tools and Resources Mentioned in This Episode:

DeepSeek R1 ➡ https://www.deepseek.com/
OpenAI Operator ➡ https://openai.com
Claude’s Computer Use (Anthropic) ➡ https://claude.ai
Llama 3.3 (Meta AI) ➡ https://ai.meta.com/llama

Chapters:
00:00 - Introduction
03:44 - The DeepSeek R1 model: why it's a big deal
13:35 - Can we trust R1’s $6 million training claim?
17:22 - The future of AI: nearly free and everywhere
24:32 - Will R1 be used in Sidecar’s AI tools?
31:11 - OpenAI Operator: AI that can use the web for you
37:38 - Should you be cautious about AI agents?
42:17 - What Operator means for the future of work
44:46 - Closing thoughts

🚀 Sidecar on LinkedIn
https://www.linkedin.com/company/sidecar-global/

👍 Like & Subscribe!
https://x.com/sidecarglobal
https://www.youtube.com/@SidecarSync
https://sidecar.ai/

Amith Nagarajan is the Chairman of Blue Cypress https://BlueCypress.io, a family of purpose-driven companies and proud practitioners of Conscious Capitalism. The Blue Cypress companies focus on helping associations, non-profits, and other purpose-driven organizations achieve long-term success. Amith is also an active early-stage investor in B2B SaaS companies. He’s had the good fortune of nearly three decades of success as an entrepreneur and enjoys helping others in their journey.

📣 Follow Amith:
https://linkedin.com/amithnagarajan

Mallory Mejias is the Manager at Sidecar, and she's passionate about creating opportunities for association professionals to learn, grow, and better serve their members using artificial intelligence. She enjoys blending creativity and innovation to produce fresh, meaningful content for the association space.

📣 Follow Mallory:
https://linkedin.com/mallorymejias


Episode Transcript

Available transcripts are automatically generated. Complete accuracy is not guaranteed.
Amith (00:00):
AI will be in every single experience in life, and if you don't provide it, you're going to seem like you don't. It'll be as remarkable as if someone didn't have a website today. Right? Welcome to Sidecar Sync, your weekly dose of innovation. If you're looking for the latest news, insights and developments in the association world, especially those driven

(00:23):
by artificial intelligence, you're in the right place. We cut through the noise to bring you the most relevant updates, with a keen focus on how AI and other emerging technologies are shaping the future. No fluff, just facts and informed discussions. I'm Amith Nagarajan, Chairman of Blue Cypress, and I'm your host. Greetings and welcome to the Sidecar Sync, your home for

(00:47):
content all about associations and artificial intelligence. My name is Amith Nagarajan.

Mallory (00:52):
And my name is Mallory Mejias.

Amith (00:55):
And we are your hosts. Now, before we get into our exciting episode, let's take a moment to hear a quick word from our sponsor.

Mallory (01:03):
If you're listening to this podcast right now, you're already thinking differently about AI than many of your peers. Don't you wish there was a way to showcase your commitment to innovation and learning? The Association AI Professional, or AAiP, certification is exactly that. The AAiP certification is awarded to those who have achieved outstanding theoretical and practical AI knowledge

(01:26):
as it pertains to associations. Earning your AAiP certification proves that you're at the forefront of AI in your organization and in the greater association space, giving you a competitive edge in an increasingly AI-driven job market. Join the growing group of professionals who've earned

(01:47):
their AAiP certification and secure your professional future by heading to learn.sidecar.ai. Amith, it has been a crazy week in the world of AI. How are you doing on this fine Wednesday?

Amith (01:59):
I'm doing great, other than the fact that I can't see a whole lot, because I just got done going to the ophthalmologist and they did this medieval stuff to me and put stuff in my eyes. And so I walked out into the street in New Orleans, it was a bright, sunny day, I had to walk a couple blocks and forgot my sunglasses. So, other than all that, I am doing fantastic. The world of AI was not top of mind when I couldn't see.

(02:19):
Let's put it that way. How about you?

Mallory (02:22):
I thought the world of AI was always top of mind for you, Amith. I'm doing pretty well. It's a nice, what, 60 degrees here in Atlanta. I'm happy to just go outside without a puffer coat on and walk my dog, so I feel like things are good. And, honestly, I've been excited for this episode, as I am every week, because I realized I don't fully process

(02:44):
all this AI news until you and I do this podcast. It's almost like this is our time to really digest everything, break it down. So I started telling my husband about R1 and then realized, wait, wait, wait, I need to do the podcast first. We'll come back to this after that and I'll have more.

Amith (03:00):
Yeah, that's awesome. Well, yeah, and when you have to present content or a topic or whatever you're doing, it definitely forces your mind to think a little bit differently. There's probably a different neural pathway that you use when you're teaching, training, communicating, anything that's even somewhat formal. In this podcast, we keep it pretty light, but still, we're presenting to a whole bunch of people, and so it makes you kind

(03:20):
of distill down the content a little bit differently. You have to think about how you're presenting it, of course, but also what's the right way to summarize things, because you and I both spend countless hours every week reading content on AI, thinking about how it applies to associations, talking with association leaders, building educational content. So, yeah, it's a great touchpoint each week.

(03:43):
I really enjoy it.

Mallory (03:44):
And if you have been living under a rock, everyone, you probably wouldn't know what we're talking about today, but I already kind of teased it. Our first topic for this episode is DeepSeek's R1 model, which has been everywhere: all over my LinkedIn, every news post I see is about this model. So we're going to spend a good bit of our time today talking about that, and then we will also be talking about OpenAI's

(04:06):
Operator agent, which, you all know, we love talking about agents on this pod. So that's going to be a fun discussion as well. But first let's kick it off with the infamous DeepSeek R1. So, to give you some context, DeepSeek is a Chinese artificial intelligence company and it released its R1 model on January 20th of this year.

(04:26):
DeepSeek is a relatively new player in the AI field, having been founded in May of 2023 as a spinoff of the Chinese quantitative hedge fund High-Flyer. The company has quickly gained attention in the AI community for its high-quality language models and innovative approaches to model training. So as I went through this topic for today, I kind of broke out

(04:47):
what I thought were the reasons why there's so much news around R1. Amith, I'll be interested to see if you're in agreement. One of those is performance, and then cost, and then the fact that it's open source, and then finally, of course, its impact on the US stock market. But starting with performance, R1 outperforms OpenAI's O1 model

(05:07):
, Anthropic's Claude 3.5 Sonnet model and other models in the majority of reasoning benchmarks. During training, R1 developed the ability to spend more time on complex problems by re-evaluating its initial approach, a behavior that actually emerged naturally. Something interesting about R1 is it explains its thought

(05:27):
process as it works through problems and questions, which is interesting to see, and it has been shown to complete some interesting puzzles or questions that have traditionally stumped AI models in the past. One of those is how many R's are in the word strawberry, which I think we've talked about on the pod before. That puzzle question is actually the inspiration behind

(05:49):
the name Project Strawberry, which was formerly Q-Star, which is now known as OpenAI's O1. So it can answer that question. And then also on LinkedIn, I saw a post by Mark Heaps, Chief Tech Evangelist at Groq (Groq with a Q), and he said he likes to run this puzzle with AI models to see if they can get it correct. It's really simple to the human brain: there are three

(06:11):
people in a room. One is reading, one is playing chess. What is the third person likely doing? The answer is playing chess, because it takes two people to play chess. And he said, in most scenarios, AI models will create elaborate stories to answer that question. And R1 was able to get it right. It went through its thought process and said, well, that person must be playing

(06:32):
chess. That's the performance summary.
Now, talking about cost: DeepSeek claimed to have developed R1 for only $6 million, which we know is significantly less than the up to billions invested by US tech giants. Interestingly enough, the cost to train the R1 model is actually less than some leaders make at some of these big AI

(06:55):
companies in the US. That reduced cost trickles down, so R1 is significantly cheaper to use than O1, for example, costing about 27 times less for input and output tokens. As I mentioned, R1 is also open source, which means it's poised to accelerate innovation in the AI sector and potentially disrupt the current landscape dominated by closed-source

(07:16):
models. And then we saw quite the stock market impact. So the release of the R1 model caused significant disruption, particularly affecting AI-related stocks. On January 27th, the tech-heavy NASDAQ index fell by approximately 3%, resulting in a $1 trillion loss in market capitalization. NVIDIA, a leading AI chip manufacturer, saw its stock

(07:39):
plummet nearly 17% on January 27th. The single-day drop wiped out almost $600 billion from NVIDIA's market value, marking the largest daily loss in Wall Street history, and we know this model's ability to achieve high performance with less advanced hardware challenged the perceived value of premium AI chips. We also saw impacts on other tech companies like Microsoft

(08:02):
and Alphabet, which is Google's parent company. Now all of you can access DeepSeek's R1 model on desktop and on mobile. You can go download the app if you would like to (there's probably questions there on whether you should or not), and it's also available on the Hugging Face platform. So that's a lot of information. Amith, people are losing their

(08:22):
minds over R1. I tried to break it down into why I think that is, but I'm interested to hear your take.
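The roughly 27x pricing gap Mallory cites can be sanity-checked with simple per-token arithmetic. The per-million-token prices below are illustrative assumptions chosen to make the math concrete, not quoted rates from either provider:

```python
# Back-of-the-envelope cost comparison between two hosted models.
# The per-million-token prices are assumed for illustration only.

def job_cost(input_tokens: int, output_tokens: int,
             price_in_per_m: float, price_out_per_m: float) -> float:
    """Dollar cost of one job, given per-million-token prices."""
    return (input_tokens / 1_000_000) * price_in_per_m \
         + (output_tokens / 1_000_000) * price_out_per_m

# Hypothetical job: 10M input tokens summarized into 1M output tokens.
pricier = job_cost(10_000_000, 1_000_000, 15.00, 60.00)  # assumed "O1-class" rates
cheaper = job_cost(10_000_000, 1_000_000, 0.55, 2.19)    # assumed "R1-class" rates

print(f"${pricier:.2f} vs ${cheaper:.2f} "
      f"(about {pricier / cheaper:.0f}x difference)")  # about 27x
```

Swap in real price sheets and the same three-line calculation shows why per-token pricing, not model quality alone, decides where a high-volume workload lands.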

Amith (08:30):
Well, that was a great summary, Mallory, and I think that ultimately, if you can get something that's perceived to be worth X for a fraction of X, that's interesting. And if people spend a lot of time and money building that first product that now you can deliver at a fraction of the cost, it undermines the perceived value of the thing

(08:51):
that people are selling to begin with, on the one hand, but it also questions, like, what are the production methods? Right, in this case, really advanced chips and lots of them. And so NVIDIA got hammered, because the perception was that you can train R1, which is, in concept at least, as good as O1. There's independent verification happening, and some has already happened. It seems to be legit.

(09:12):
But if that's true, then what does that mean for the future of the most advanced frontier models, right? So what does that mean for the future of frontier labs, investments and the most advanced models? So, a couple of things.
The trend line is something we've been covering on this pod and our other forms of content on Sidecar for some time, which is, I've been one to say that I'm more excited about small to midsize models than I am about large models. And it isn't that I don't get excited about these new O1, O3 and other things coming out. I do get excited about those things, because it's always interesting to see what the absolute edge of capabilities

(09:55):
are. But models today, they're already sufficiently advanced to do so much more than we do with them, and so the application development opportunities that exist on top of even current models like GPT-4o and Llama 3.3 and others that are out there, that are becoming cheaper and cheaper, are mind-boggling. You can remake the entire way a business operates with current AI, even if it didn't get better at all. And so, as those

(10:16):
capabilities that were once frontier now become available, not only in open source but in smaller and faster and cheaper models, that's really good for everyone, because that makes the incremental cost per unit of AI, if you will, dramatically more accessible for everyone on Earth. And there's this thing called Jevons paradox, which is this

(10:37):
idea that as a technology becomes more efficient, and therefore it becomes less expensive, demand radically increases. And we've seen that to be true across pretty much every technology that our species has invented over time. As we've been able to find efficiency in production, we've found decrease in cost and we've found massive increases in demand.

(10:57):
Think about the automotive sector, think about energy utilization, think about traditional computing and now AI. So we're going to see that, and, as a result of that, that means there's going to be a massive increase in demand. What we've seen so far is a tiny sliver of the demand that anyone's expecting, but this is a nonlinear curve, meaning that

(11:17):
it's far greater in its impact than what people tend to perceive. That's why these exponentials are so hard to visualize. And so, ultimately, I don't think there's anything to worry about if you're a leading producer of hardware or applications. I do think, and I've said this for a while, that if you're a producer of the fundamental models, you've got a lot to worry about, because that's a hyper-competitive space. There's almost no differentiation.

(11:39):
I already view it as a commodity. I view the models as totally being commoditized. I mean, when we're helping organizations think through software architecture, picking the model is by far not the highest concern, as it was originally. Because it used to be, you'd have to go after pretty much the most performant model,

(11:59):
the fastest model, but also really the smartest model, more than anything. And that's no longer the thing you think about, because these models are comparable across the board. And, like we've talked about in recent episodes, the Llama 3.3 70 billion parameter version that came out in December is comparable to GPT-4o, and it's unbelievable what you can do even with that small

(12:21):
model. So I guess the point here is, this class of models called reasoning models, they do a little bit more than the regular LLMs, but the same exact thing is happening. They just essentially have new techniques that are being incorporated in them. So, whether it's out of China or out of Africa, or out of South America or out of some other place, we're going to have

(12:42):
more and more innovation coming from places that have far fewer resources in the traditional sense, but are innovative, creative and are coming up with new ways to do things. Because with the DeepSeek folks, there's a variety of things that led to their breakthrough, but they really have some really smart approaches to the engineering under the hood that

(13:02):
were novel, that people outside of their organization hadn't thought of. So I think there's a lot of good that comes from that. Yann LeCun, who's the head of AI for Meta, was saying that this isn't a victory of China over the US, but it's a victory of open source over closed source. Because what he's really pointing to is that when you have this massive sea of people

(13:23):
who are all collaborating and sharing, it's very hard for any proprietary company, or all proprietary companies combined, to compete with that, and I think he's right.

Mallory (13:33):
There's so much to unpack there, Amith. Let me start with: you mentioned right at the top the independent verification that's going on around this whole process. Some AI leaders right now are speculating that this whole $6 million to train the model thing is not true or accurate. As someone who has more understanding of how that process works for training a model, do you feel like what

(13:54):
we're seeing is true, or do you think they could just slap a number on there and say, sure, it was $6 million?

Amith (13:58):
Yeah, I mean, first of all, no one knows exactly what they trained it on. We don't even know exactly what the cluster size was, what the chips were, we don't know how long it trained for, so we don't know how much money they spent. So you have to take it as, are they telling the truth or not? And even if you believe them, the question is, what were the inputs? Because as models get better, there are predecessor models for

(14:19):
every model, and that lineage directly impacts how you're able to train newer, faster, smarter models. Llama 3.3 is dramatically smarter than Llama 3.1, and that's the same family of models, but they're able to use Llama 3.1, the bigger model, the 405 billion parameter model, to generate data that was used to help train the smaller

(14:42):
3.3 version, and that's a process called distillation. And some people are claiming that that looks like something that, at least in part, was responsible for DeepSeek's performance. And one of the things that was shown as potential evidence of that, but it's, of course, just a screenshot, is the DeepSeek R1 model actually quoting OpenAI's

(15:02):
policy framework for why it can or cannot answer certain questions, which is obviously indicative of having consumed content that came out of OpenAI's models. Now, obviously, I don't know if that's true or not, but it wouldn't surprise me if there were techniques like that being used with OpenAI's models, or probably with Claude, probably with other models as well. Because, essentially, if the

(15:23):
mindset is, take whatever you can possibly get your hands on and use that to improve models, there are people who are going to do that. Whether you like it or don't like it, whether you agree with it or don't agree with it, it's going to happen. Did it happen in this case? I don't know. But it is happening in general, and it does violate the terms of service for all these commercial model providers. But the distillation concept is very powerful, right?

(15:45):
Like, if you try to put legal structures around something to prevent something that has economic reasoning behind why it's happening, it's going to happen. The legal structures are going to fall over immediately, because the power of the economic incentive is so great, and that's what's happening here. Distillation is a really powerful technique. It's not the only technique, but it's something that

(16:06):
absolutely helped the R1 model be as good as it is. Ultimately, for our association listeners in particular, a lot of that, aside from maybe being interesting trivia, is not necessarily useful, other than to come back to the through line and the trend, which is that associations, and everyone else for that matter, can count on very powerful models being

(16:26):
available for nearly free. That is the thing you can count on, based on the competition, based on the advancements, based on the hardware going forward, based on economies of scale. It's just very obvious that these models are going to be essentially nearly zero cost. And so why is that important for you? If you're an association CEO, or anyone who's thinking about the

(16:49):
business side of the organization, not the technology, you might be saying, well, wouldn't it be great if we could do these things if it was very inexpensive? If GPT-4o didn't cost so much that I couldn't process all of my data with it, because that would cost me millions of dollars or it would take too long? Or the question of, like, well, I wish I could do that with AI,

(17:10):
but I don't really want to send my data over to one of these commercial providers. I sure wish I could run these AI models myself on hardware or environments I control. Well, you can do that, and you can do it for fractions of what it cost even six months ago. So the reason that's so important to get your head wrapped around is, you need to plan your strategy based on what will be available in the coming 6 to 12 months, of course, but

(17:33):
even the next few years. And this stuff is going to get more powerful and cheaper, which means there's more business applications that you can count on having availability for you. And a lot of people have this fallacy where they say, oh, I wish I could do AI, but I'm too small, I don't have the budget, I don't have this and that. All of those assumptions are false now and they're going to be more false in the future.
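The distillation pipeline Amith walks through, a large teacher model generating sample outputs that are then used to train a smaller student, can be sketched in miniature. Everything below is a toy illustration under stated assumptions: the teacher is a canned lookup standing in for a big model's inference call, and "training" just stores the pairs, where a real pipeline would fine-tune a neural network on them:

```python
# Toy sketch of distillation: harvest a "teacher" model's answers and use
# them as supervised targets for a smaller "student". The point is the
# data flow (teacher outputs become student training data), not the model.

def teacher(prompt: str) -> str:
    """Stand-in for a large model's inference call (hypothetical)."""
    canned = {"capital of France?": "Paris", "2 + 2?": "4"}
    return canned.get(prompt, "I don't know")

def build_distillation_set(prompts: list[str]) -> list[tuple[str, str]]:
    """Run prompts through the teacher to collect (input, target) pairs."""
    return [(p, teacher(p)) for p in prompts]

def train_student(pairs: list[tuple[str, str]]) -> dict[str, str]:
    """'Train' the student; a real pipeline would fine-tune weights here."""
    return dict(pairs)

student = train_student(build_distillation_set(["capital of France?", "2 + 2?"]))
print(student["capital of France?"])  # the student now mimics the teacher
```

The same shape applies at scale: when a bigger Llama generates data for a smaller one, only the teacher call and the training step change; the harvest-then-train loop is identical.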

Mallory (17:55):
So it sounds like the list of excuses is dwindling.

Amith (17:58):
Certainly from a cost perspective. And what that also means is, everyone else is going to be adopting these technologies, these techniques and these strategies, and that means the consumers of the world are going to expect AI everywhere. Every CRM system is going to have AI baked into it. So if you're a CRM or an AMS vendor, and if you don't have a really deep and well-thought-out AI strategy that you are

(18:21):
scurrying about implementing as fast as possible, you better watch out, because that's going to be the expectation in those types of tools. It's going to be the expectation in every piece of software you use and every vendor you deal with, whether it's a software experience, an e-commerce site or an app. The interaction between brands and individuals is going to be defined by the quality of the experience, but also the absence

(18:42):
of friction, and friction is something that's caused by constraints or choke points. Sometimes it's caused by design. For example, you can't go buy a product from Hermès, because they don't want you to be able to, in order to create artificial scarcity, which is a really cool business model if you can have that, but most people can't do that. So in reality, friction is generally a really bad thing in

(19:04):
the consumer experience. So what I would point to is, here we have a classical example of the association saying, poor, poor me, I'm too small and I'm too under-resourced to handle doing anything meaningful with AI. But nobody cares outside of your organization. The reality is, your consumers, your members, your customers,

(19:24):
your users, whatever you want to call them, are more and more accustomed to a low-friction, high-quality environment which is increasingly driven by AI, if not entirely driven by AI, in everything else in their lives, not just their professional lives but their personal lives. So what I'm trying to say is really simple: this trend line affects the world and it affects the expectations of

(19:44):
the consumer, because AI will be in every single experience in life, and if you don't provide it, you're going to seem like you don't. It'll be as remarkable as if someone didn't have a website today. You want to talk to a vendor and you want to buy something, and there's no website. There's not even... maybe they don't even have a phone. That's how far back in time you will seem to be to your members

(20:06):
and to your customers if you don't have, like, a deep AI strategy.

Mallory (20:10):
I find it a bit ironic that this model was released right after the TikTok ban, because we just saw one of the most popular apps in the world get banned in the US, for only 12 hours, but still a surprise nonetheless. Are you comfortable using a Chinese AI model? I mean, is that something that you're thinking about, Amith? Do you think it's safe to use?

Amith (20:31):
Sure. So I'm totally comfortable using the model so long as it's not running in China. So if you go to DeepSeek, their website, or if you download the app from the App Store or the Android store, you are connecting to servers that are controlled, and ultimately the data is going back to China. Now, I don't have any theory or knowledge that says DeepSeek is

(20:52):
controlled by the CCP or is not, but generally speaking, you can assume that if your data is going to China, it could be accessed by the government, it could be used for purposes outside of what the company even intends. So I don't know how much that's actually happening versus not, but it's absolutely a possibility. So you have to be thoughtful about that, even as a consumer. As a business, most certainly, you have to be thoughtful about

(21:14):
where is your AI running? So it may be compelling: they actually have a free pricing plan for nonprofits as well, as announced, and I don't know if that's totally free forever at all levels of inference, or if it's free only for a certain level. But that is a major, major trap, potentially. And again, I'm not suggesting that the people behind this have nefarious ideas

(21:35):
or plans. They might be wonderful people who have the best intentions in the world, but they still live in an environment where, ultimately, they don't have control. So it's a situation you have to be thoughtful about.
Now, as far as running the model, remember, it's open source and you can run the R1 model in a lot of other places. In fact, the folks over at Groq created a distillation of Llama

(21:56):
3.3 that distills parts of essentially the R1 model into Llama 3.3, that they took live on their website already, and that's something worth checking out. That's not the R1 model; you can inference the R1 model itself, and in the United States there's a number of providers that already spun it up. But on the Groq platform you can inference on Llama 3.3

(22:17):
distilled with R1, which essentially means they took Llama 3.3, then they got a bunch of sample content out of the R1 model by running it over and over again, and then they essentially did additional training on Llama 3.3 to make it smarter, using some of the brainpower from R1. And they essentially leveled up Llama 3.3. They gave it more reasoning skills, they gave it

(22:38):
capabilities. So this is the power of distillation, right? In this case it's two open source models, so that's another interesting thing to play with. And on the Groq platform, everything goes 10x as fast as it does anywhere else, so that's worth checking out. And that inference is, well, they have a data center in Saudi Arabia now as well, but you can choose to inference only in the United States if you want with them.

(23:03):
Yes, you have to be very thoughtful about where the AI is running and where you're sending your data. It doesn't mean you can't use models that have a Chinese provenance at all; I don't think that's the right way to think about it. There's another great model called Qwen that's available, and the Qwen 2.5 model is fantastic. It outperforms a lot of other models. It's kind of neck and neck with GPT-4o. They just released the new Qwen 2.5 Max, which is a bigger

(23:24):
version that's supposed to be in the O1 category. So there's a lot of competition and stuff coming from China. If it's open source, it's great. Some people might have theories: oh, it's going to phone home somehow and transfer data. That's easy to detect and it's easy to firewall this stuff off, so I wouldn't worry about anything like that. And I would also have a provider that runs this for you that knows what

(23:45):
they're doing. I wouldn't try to download it yourself and run it in your own data center. At least most of our listeners who are in the association realm, who don't have that in-house expertise, shouldn't do that. But there's lots of options. That's the beauty of this: open source models separate the decision of inference provider from model provider. Whereas prior to them, prior to the point where you had these

(24:07):
different models that can run anywhere, you picked OpenAI or Claude or someone like that, and you said, okay, I'm going to use, or Google, right, and you use one of those providers. They are the model provider and they also run the model on your behalf. But with open source models, you can run them anywhere, and so that separation of concerns creates more competition and

(24:27):
creates more choice, which is ultimately a wonderful thing.

Mallory (24:32):
So we're in agreement, right, that the reasoning-based performance of the model is pretty impressive, and the lower cost. So my question for you, Amith, that I always like to ask is: are you considering running R1 in an environment for any of the products that we have?

Amith (24:48):
Yeah, we've actually been thinking about it already, and we thought about O1 and potentially O3 down the road for certain tasks. So we've talked a lot on this pod about agents, and I want to talk for a moment about reasoning models, just to refresh our strawberry episode and our more recent episode, where I think we talked a little bit about O3, maybe a couple of weeks ago (I could be hallucinating that as well), and

(25:10):
then R1. But these models, what they do, these so-called reasoning models, really what they're doing, it goes back to the conversation we had on strawberry, with what we called essentially the AI's equivalent of system one versus system two thinking. And going back to that: essentially in our brain, system one thinking is that instinctive, immediate reaction we have

(25:30):
that doesn't really require any higher-order processing. It's like, someone says A, I say B, right? It's like this instantaneous reaction: you say one thing, someone reacts a different way. It's instinctive and nearly instant. Whereas our system two thinking is that which requires some reflection, it requires some reasoning. Sometimes we take minutes, hours, days, weeks, months, years to think through problems when they're more complex.

(25:50):
Machines haven't had that capability. Machines haven't had the ability themselves to say, oh, that's an interesting problem and it's kind of complicated, let me think through that for a minute, let me check my work. Now, you can kind of approximate that even with older LLMs, where you say "think step by step", that's so-called chain-of-thought prompting, and it actually helps. It's more of a trick than it is anything else. It doesn't actually cause the model to work differently.

(26:11):
But with reasoning models, they're actually trained to kind of iterate. So what they do is they say, hmm, Mallory has asked me to count the number of R's in the word strawberry. How should I do that? Well, let me look at all the letters, let me see if these letters have this, let me go count them, let me see if that's right. And then they go and they check their work. They say, did I actually do what Mallory asked me to do? So there's this kind of iterative process where the

(26:32):
model is checking its own work and it's coming up with a plan. Well, if you recall, when we talk about agents and multi-agentic software architectures, it's basically the same idea, where we're saying: take a problem, break it down into component parts, execute those parts, potentially loop back and redo some of them if there were failures in any of them, and then ultimately come to an answer and give you a

(26:53):
more complex output. What's happened essentially is the model is doing some of this reasoning, which is essentially this iterative looping kind of capability, in the model itself, right? And that's extremely powerful, because it makes it easier for people to use. But if you have a multi-agent architecture, it may not be the

(27:13):
right thing to use.
So a good example of this is our Skip software, which I'm very excited about right now, because we're literally a handful of business days away from a soft launch of what we're calling a private preview, then a public preview, of the new version of Skip, which is crazy new and improved and incredibly powerful. But it's also going to be a completely turnkey SaaS option.

(27:34):
So people will be able to go to the website, click Get Started, and click a couple of buttons to connect Skip to their SaaS solution of choice, like Salesforce, HubSpot, etc. A few minutes or longer later, depending on the size of the database, Skip will be synchronized and you can start talking to Skip. So Skip has been rebuilt. Skip has only existed for around 18 months since we

(27:55):
started building Skip, a little bit longer than that. We've rebuilt Skip and thrown everything away now four times, and the reason is that not only does the architecture need to change, but the underlying models have changed. It's actually a little bit of an overstatement to say we threw everything out, but we threw away lots of it, and that's really important when things are changing rapidly. But in the case of what we have in the latest Skip architecture,

(28:17):
the way the architecture is set up, it's possible for Skip to do extraordinarily complex things for you. Sometimes they take minutes to do, because Skip is able to look at what you're talking about, take the whole context of not only your conversation but literally all the data in your enterprise, and look at all the types of data

(28:37):
you have in your Salesforce system or your HubSpot or whatever, and then figure out a plan, execute the plan step by step, which might require 3 steps, 5 steps, 50 steps, and then, after Skip executes all those steps, come back through, check Skip's work, and then ultimately assemble the final work product, which is almost always some kind of analytic or report.

(28:57):
It's like a report that says: hey, here's my member churn for the last three years, here's my predicted member churn, something like that. So Skip is able to do those things because of these techniques, and because of that, each of the individual steps within that agentic framework doesn't require an R1-level capability, because we've broken down the problem through the agent layer so that the LLMs underneath it can be much simpler.
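The plan-execute-check pattern described above can be sketched abstractly. This is a hypothetical illustration, not Skip's actual code; all function names are stand-ins for whatever LLM calls a real system would make:

```python
# Hypothetical sketch of an agentic plan-execute-check loop: a planner
# breaks the request into small steps so each step can run on a smaller,
# cheaper model instead of a single heavyweight reasoning model.

from typing import Callable

def run_agent(request: str,
              plan: Callable[[str], list[str]],
              execute: Callable[[str], str],
              check: Callable[[str, str], bool],
              max_retries: int = 2) -> list[str]:
    """Plan steps, execute each one, and retry any step that fails its check."""
    results = []
    for step in plan(request):
        for _ in range(1 + max_retries):
            output = execute(step)
            if check(step, output):          # did this step succeed?
                results.append(output)
                break
        else:
            raise RuntimeError(f"step failed after retries: {step}")
    return results

# Toy stand-ins for the LLM calls:
steps = run_agent(
    "report member churn",
    plan=lambda req: ["fetch data", "compute churn", "write report"],
    execute=lambda step: f"done: {step}",
    check=lambda step, out: out.startswith("done"),
)
print(steps)
```

The point of the structure is that `plan` is the only place that needs much horsepower; each individual `execute` call is a narrow task a smaller model can handle.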
Almost all of Skip now runs on Llama 3.3, which is the 70-billion-parameter model we've talked about, compared to previously requiring OpenAI. OpenAI was the only game in town last year for Skip. We needed the level of horsepower they had, and nobody else had a GPT-4o-caliber model. Now you have four or five options that are all really good,

(29:41):
and for Llama 3.3, again mentioning Groq with a Q, we are using them exclusively for inference. We do go up from that a little bit for a couple of key tasks Skip does, but because of our architecture, we can get a lot of horsepower out of these smaller models.
Now, that being said, there are definitely situations where it'd be helpful to have an R1 or an O1 or an O3-caliber model, and

(30:05):
our teams are always looking at this stuff, always playing with it and figuring out where you can solve novel problems with it. So it's a lot of fun. But yeah, I mean, I think that people tend to do this; they think: oh, now that R1's out, I've got to use it. Well, actually, R1 and O1 are really bad at certain things. If you want a near-instant response to a fairly simple question, it doesn't make any sense, you don't need it to go

(30:28):
through that whole thinking process. A GPT-4o-class LLM is perfectly good at answering that, and it doesn't do any of those steps. It's nearly instant, comparatively speaking. So there are different tools for different problems, essentially, is what I'm trying to say.
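The "different tools for different problems" idea can be made concrete with a toy router. The model names and the keyword heuristic here are invented for illustration; a real system would route on cost, latency, and measured task difficulty:

```python
# Toy model router: send quick questions to a fast chat model and only
# escalate to a slower, pricier reasoning model when the task needs it.
# Model names and the keyword heuristic are illustrative, not a real API.

REASONING_HINTS = ("prove", "plan", "analyze", "multi-step", "debug")

def pick_model(task: str) -> str:
    """Return which class of model to call for a given task."""
    needs_reasoning = any(hint in task.lower() for hint in REASONING_HINTS)
    return "reasoning-model" if needs_reasoning else "fast-chat-model"

print(pick_model("What time zone is New Orleans in?"))
print(pick_model("Analyze three years of churn and plan a retention test"))
```

A simple question skips the expensive thinking loop entirely, which is exactly the trade-off being described.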

Mallory (30:40):
Yeah, that makes sense. You mentioned a private preview for Skip. Is there any way for any of our listeners to be a part of that group, if they're interested?

Amith (30:50):
They can ping me on LinkedIn. We're taking just a very, very small number of people. We already have a number of people signed up for the private preview. That's only going to last a handful of weeks as we get through that, and then we're going to go to a public preview, and then we'll go to a full release. All those steps will happen in Q1, so it's happening pretty quickly. But if anyone's interested in that, they can ping me on LinkedIn.

Mallory (31:10):
Awesome. Moving to our second topic of today: we're talking OpenAI Operator, a groundbreaking AI agent unveiled on January 23rd of this year, designed to automate various tasks by navigating the web like a human. It can perform activities like planning vacations, filling out forms, making restaurant reservations, and ordering

(31:32):
groceries. Really fun activities. Operator is powered by a new model called Computer-Using Agent, or CUA, which combines GPT-4o's vision capabilities with advanced reasoning through reinforcement learning. This allows Operator to see web pages through screenshots and interact with them using virtual mouse and keyboard

(31:52):
inputs, just as a human would. What are some key features of Operator? Well, it can autonomously browse websites to complete tasks without requiring custom API integrations. It can leverage reasoning capabilities to overcome challenges and correct mistakes when it encounters difficulties. It can hand control back to the user for assistance, and users

(32:14):
can also add custom instructions for specific sites or save prompts for repeated tasks. Currently, Operator is available exclusively to ChatGPT Pro users, which right now is $200 per month, through operator.chatgpt.com, but OpenAI does plan to expand access to other tiers and integrate it directly into ChatGPT in the

(32:37):
future. So: booking a reservation for two at a seafood restaurant, finding a few tickets for a concert, sending mom flowers on her birthday. These are a few examples that the OpenAI team shared in their demo. Amith, I'm curious on your end: if you test this out and you realize it works really well, what are some everyday tasks, personal or professional, that

(32:59):
you could see yourself using this for?

Amith (33:02):
First of all, I've got to get on my soapbox for a minute. I think these guys really need some help with picking examples. Those are so dumb. Every single one of those tasks is actually something I'd really like to spend time on. It's fun to pick a restaurant. It's fun to find, you know, music that you're going to go see. And I don't want anybody that I send flowers to to know that an AI did it, that's for sure.

(33:22):
Certainly my wife, you know. So I don't know, I think they need a little bit of help picking some examples that are actually good ideas to automate. But aside from that, the reason they pick those things is because they're good examples of consumer experiences that happen in most people's lives at different times. But the point is this, actually: the world around us we have

(33:43):
shaped for hundreds of thousands of years in our image, to work well, to have a user interface that's intuitive for people. So the way things work in the physical world works well for the human form, which is why humanoid robots are such an interesting area of product development: if you can plug a robot into the human world, the robot has way more

(34:06):
utility than if it's only able to interact in very narrow ranges of environments. So, similarly, if an AI in the digital world can interact through human interfaces, or interfaces designed for humans, which would include, of course, websites, but also desktop software, the crusty old AMS that you don't like very much,

(34:27):
all those products that are out there, right? If an AI could learn those desktop software tools of various flavors, or mobile tools, or whatever, that's really interesting, because then the AI is able to literally do all the digital labor that a person would do through a keyboard and mouse. And we know that AIs can see, we know they can speak, we know

(34:47):
they can hear, now they can type, and now they can move the mouse, and so this is a very interesting thing. I want to point to another rival lab of OpenAI's called Anthropic, which is the maker of Claude. Last year they revealed something very similar to this, called computer use, which is available in Claude, I think also for their premium tier. I personally haven't used either of these, but the idea is

(35:11):
the same. Now, the way they've implemented it is different. In the case of Claude's computer use, it controls your local desktop, so you're actually handing over the reins of your computer to the AI, which I personally find very scary: the idea that the AI would get into files or send an email on my behalf, or just something gone awry, right, like even malware-type scenarios.

(35:31):
But what OpenAI did, I think, is really smart, in that they have a different approach. The way Operator works is it does not control your computer; it controls a remote browser, and that remote browser runs in an environment where you can see what it's doing and you can interrupt it at any time. It can ask you for help at any time. So it's kind of like, you know, you're the co-pilot, right,

(36:03):
instead of you using a co-pilot. The AI is doing the work and you're the co-pilot. Effectively, it flips the roles. But that's kind of cool, because in that context there's less risk: you're not logged in to any website. So, like with OpenTable, they didn't really talk about this in their demo when they released the Operator agent, but how was that developer who was demoing it logged in? Well, someone has to log in. Of course, they omitted that from the demo, which I understand, but the reality is there's downsides to all this

(36:24):
stuff. I think this is brand spanking new, so I personally haven't used it, because I don't have a lot of utility for it immediately. Plus, I'm a little bit nervous about it, to be perfectly honest, just in terms of cyber risk. But, that being said, I do see use cases, which we can talk about. But that's kind of my opening salvo, and I think it's a really important innovation because of the human interface

(36:46):
thing I was starting with, right? Like robots in the physical world being able to meld into the whole world that's designed for us, and the same thing for computers. Computers are designed for us to use. It's incredibly inefficient for one computer to talk to another through a user interface designed for people. It's kind of silly, actually, because APIs are a thousand

(37:07):
times faster and more efficient, but the world of websites and applications is probably tens of thousands of times bigger than the number of things you can do through an API. So it just opens up a lot of possibilities for automation and legacy systems. I was kind of kidding about the crusty old AMS comment, but kind of not, because if you could have the bot actually do

(37:27):
some of the stuff with your old system, which maybe people don't like using, maybe it frees them up, and they don't really care about replacing the old AMS as much anymore because you've automated it.

Mallory (37:38):
Well, I will say, crusty old AMS is probably going to be one of my top-ranking phrases that's ever come out of this podcast. I'm going to chuckle about that later tonight. I think some of our listeners might be surprised to hear that this is something you're a bit more wary about. I feel like generally you're an optimist, right, you're a realist as well, but generally you don't seem to be overly cautious with

(37:59):
any of these new releases. Not saying that you are now, but I'm curious: if you are cautious about this, are you suggesting that our listeners be as well, that this is not something they should go out and try?

Amith (38:12):
Well, so I'm also a cheap entrepreneur, so paying OpenAI $200 a month is not something I'm willing to do right now. I just don't see enough value. I don't really care about using Sora personally, and this tool isn't meaningful enough for me to go try it out. Plus, I'm kind of busy with other things. So if I had a little more free time I might go play with it and pay them for a month, but I really don't think it's that interesting, because I understand how it works and I think

(38:34):
everyone's going to have this. There are actually already open-source versions of Operator that you can download and inference locally with Llama, which are worth checking out as well. It's a little bit more involved to set up, but you can run Operator-like things, and that's actually existed for months. It's not something that happened right after OpenAI came out with Operator, so there's ways to do this. I also think it's super, super early infancy days for this
(38:55):
technology. It's really crude, it's really slow. I don't think the utility is super high. But like I was saying earlier about R1, and how its predecessors and what's coming later are on this crazy timescale, the same thing is going to happen with this. By the end of this year, these kinds of agents will be very common, they will work quite well, and they'll be very fast.

(39:17):
So, rather than moving at this, like, pokey pace that you saw in the demo, where it's like: oh, that's what Operator does? That Operator really doesn't know how to use the computer too well, right? Maybe we should send it to, like, a Mavis Beacon Teaches Typing class, because that Operator kind of sucks at using the keyboard. But what I'm serious about, though, is that that's because the AI is so slow right now, and it's so resource-intensive to do

(39:38):
what it's doing: essentially taking a screenshot multiple times per second, running the screenshot through the model, and then getting back JSON instructions that essentially tell it what to do, which pixel to move the mouse to, what button to click, and all that kind of stuff. It's incredibly inefficient, but it's a great concept. And so over the course of the next six months we'll get another AI doubling, another one in the second half of the year, and probably

(40:00):
by the end of the year these things will be quite good.
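The screenshot-to-action loop described above can be sketched as a toy loop. All helper names here are placeholders, not OpenAI's actual implementation; the point is the capture, model, act cycle:

```python
# Hypothetical sketch of a computer-using agent's inner loop: capture the
# screen, ask a vision model for the next action as JSON, then apply that
# action with a virtual mouse/keyboard. Every helper is a placeholder.

import json

def agent_loop(goal: str, capture, ask_model, apply_action, max_steps=100):
    """Drive a UI toward `goal` one screenshot-sized step at a time."""
    for _ in range(max_steps):
        screenshot = capture()                  # see the screen
        reply = ask_model(goal, screenshot)     # vision model returns JSON
        action = json.loads(reply)              # e.g. {"type": "click", ...}
        if action["type"] == "done":
            return True
        apply_action(action)                    # move mouse / type keys
    return False

# Toy stand-ins that "finish" after one click:
script = iter(['{"type": "click", "x": 40, "y": 200}', '{"type": "done"}'])
ok = agent_loop(
    "open the reservations page",
    capture=lambda: b"fake-screenshot-bytes",
    ask_model=lambda goal, shot: next(script),
    apply_action=lambda a: None,
)
print(ok)
```

Running the full model once per screenshot is exactly why the current demos feel slow; the loop itself is simple.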
So I'll jump back to my comment about cybersecurity: the risk is actually mainly with computer use, Anthropic's model, running on my local computer. That risk doesn't exist in the OpenAI example, because you're running in a remote browser. At least it's not the same kind of risk.

Mallory (40:16):
Yeah, it just seems like this is one of the first steps we've seen recently for agentic solutions, or agent-like actions, in terms of actually being able to do things on your computer in consumer-grade technology. Would you agree with that?

Amith (40:31):
Yeah, totally. I mean, the examples people use, they're trying to make it relatable as well, which is why they use the "how to automate sending your mom a birthday gift" one, which I think is just the funniest example. But it's crazy. But that concept is relatable, I guess, and we're talking about it, so maybe they knew all along that it would be something people would laugh at.

(40:51):
But the point I would make is there's a lot of business use cases for this. So think about this: you get an email, and the email says, I'd like to renew my membership, and you, in the member services assistant or member services specialist role, have to do that. So what do you do? Well, you have the email open in one screen, and then you open up your AMS, crusty or not, in the other screen, and then you do

(41:16):
the stuff, right? You search for the member, you swear at the computer because it couldn't find the member based on the email or the phone number, and then you finally find the member, and then you go into the record, and then you figure out how to do the renewal, whatever that process is in that system. So there's anywhere from three to 15 steps that you go through in the AMS. Well, what if you could just say: hey, Operator, watch me do this.

(41:38):
This is what happens a lot: I get these emails, and this is what I need to do in the AMS. This is the AMS system I use, so this is what I want you to do. And Operator says: yep, got it, I have now been trained just like a new employee. And the next time an email comes in like that, you can just click on it, send it to Operator, and Operator takes care of it. It sounds a little bit sci-fi, and it wouldn't work well at all

(42:00):
right now if you tried to do that with Operator. It's way too simplistic an agent today, but that's not true in 6 or 12 months, certainly in a couple of years. So, yes, you should totally be thinking about this stuff as a way to radically change your workflow, because those types of things require so much manual effort and take up such a percentage of your staff's time. That kind of task I just

(42:21):
mentioned, somewhat jokingly but seriously, probably chews up half of people's time in 50% of the positions in your organization. Like 25% of your staffing budget goes towards those kinds of tasks that could probably be automated. Think about what you could do with getting that time back.

Mallory (42:38):
And I would say, too, if you're a leader listening to this podcast, you've got to take a stance on this, you've got to educate your team on this. Because what you don't want to happen is for this technology to get a little bit better, and maybe your staff goes off and starts using Operator, and maybe they're not disclosing that, and maybe Operator does exactly what Amith says and makes a mistake. And then your staff comes to you and says: well, I used

(42:59):
Operator, so am I in trouble? Is Operator in trouble? What's our staff usage policy around this? I would say this is something you want to get ahead of, for sure.

Amith (43:08):
Yeah, I would add to that. I think policy and guidelines are key, because you have really three categories of people that I've seen, and this isn't just for this technology, but it's especially true here. You have people that are doing nothing, and probably actually very happy to do nothing. And then you have people who really want to do something but aren't doing anything, because there's no policy to help them know whether it's the right or the wrong thing to do.

(43:29):
And then you have people who are just going to do whatever the hell they want, regardless of what your policy says, and your policy is probably not going to affect them, and they exist in every organization. Actually, sometimes those people are really helpful if you can get them to say: hey, what's working, what's not. But in any event, without a policy, without guidelines, without training as well, you're leaving people in the dark, and so I agree wholeheartedly with what you're

(43:52):
saying, Mallory. I talk to people about a lot of different technologies, a lot of different AI models and software architectures and business strategies and blah, blah, blah, but every single time the conversation ends with: all right, where do we get started? And it's always the same thing. Get some policies and guidelines in place. Don't spend a million years on it. Don't go and convene a special panel of 300 of your favorite members to talk about what AI should look like and spend the

(44:15):
next six months, and amend your bylaws. Don't do that. Just create something simple really quickly and tell people it's going to change constantly, because AI is changing fast. Roll it out, and then invest in some training for yourself and for your team. It can be Sidecar's, it can be anyone else's. Just go get trained and get a guideline in place, and you'll be dramatically better off.

Mallory (44:37):
Yeah, and guidelines aren't always meant to stifle people's innovation, by any means. They can actually give people a lot of freedom by creating parameters that they can act within.

Amith (44:46):
Yep.

Mallory (44:47):
Well, Amith, it was a good, jam-packed episode. You were on your comedy game. We talked about R1, crusty old AMSs, Operator. Lots of good stuff. Thanks to our audience for tuning in, and we will see you all next week.

Amith (45:03):
Thanks for tuning in to Sidecar Sync this week. Looking to dive deeper? Download your free copy of our new book, Ascend: Unlocking the Power of AI for Associations, at ascendbook.org. It's packed with insights to power your association's journey with AI. And remember, Sidecar is here with more resources, from webinars to bootcamps, to help you stay ahead in the

(45:25):
association world. We'll catch you in the next episode. Until then, keep learning, keep growing, and keep disrupting.