Episode Transcript
Available transcripts are automatically generated. Complete accuracy is not guaranteed.
Andreas Welsch (00:00):
Today we'll talk about the three steps to make AI agents enterprise ready, and who better to talk about it than someone who's actively working on that: Tomas Gogar.
Hey Tomas.
Thank you so much for joining.
Tomas Gogar (00:10):
Hello, Andreas.
Thanks for having me.
Andreas Welsch (00:13):
Fantastic.
Hey, why don't you tell us a little bit about yourself, who you are and what you do.
Tomas Gogar (00:18):
Okay. I'm Tomas Gogar. I'm currently CEO and one of the founders of Rossum. Computer scientist, AI scientist. Originally, we started the company when I met my co-founders in the same AI PhD lab. So really, from the AI space before it was cool.
(00:41):
And we focus on automating the paperwork for large enterprises.
Andreas Welsch (00:46):
I've done quite a bit of work in that space before, and I'm always surprised how much paper there is in a business. I'm excited to see that you're solving a real problem and a real need, and I'm also excited to have you on the show today. And by the way, thanks to the team at Rossum for sponsoring today's episode. Why don't we play a little game to kick things off? What do you say?
Tomas Gogar (01:06):
Let's do it.
Andreas Welsch (01:07):
Okay, perfect. So this one is called In Your Own Words. When I press the buzzer, the wheels will start spinning, and when they stop, I would love to hear your answer. You have about 60 seconds for it. And let me know what's the first thing that comes to mind.
So, are you ready for What's the Buzz?
(01:28):
That's it.
Okay, perfect.
Here we go.
If AI were a vehicle, what would it be?
60 seconds on the clock go.
Tomas Gogar (01:42):
I think it would definitely be an electric car. Exciting, a huge opportunity, but still immature in certain ways. Yeah. So yeah, basically I think it's very accurate.
(02:03):
All manufacturers are focusing on that, knowing that this is the direction. And it's a big race: how to make it real and impactful.
Andreas Welsch (02:17):
I love that.
And indeed, it seems like it's this big race. If you follow the news and the headlines, it's crazy. I don't know how people are holding on, or if it even still makes sense to hold on, or if you just accept that there's so much stuff flying by. But anyways, great answer.
(02:38):
But that brings us to the topic of our show today: making AI agents enterprise ready, and building them in ways that are enterprise grade. This term of enterprise readiness can mean many different things to different people. If you follow the headlines in the news, it seems like AI agents obviously are the big thing of 2025. Everybody's doing it, everybody's been doing it for a
(02:59):
long time, whether it's vendors or it's companies and practitioners. I think when you peel it back, it's a lot more nuanced. And just because you can run an agent on your own machine, or deploy it somewhere in the cloud and tinker around with it, doesn't necessarily mean it's enterprise grade or ready for production. But I know you guys at Rossum have been building AI agents and have
(03:24):
just come out with the first one. So I'm curious, as a software leader and as a founder, what are you seeing there? What makes agents enterprise grade? What do you need to get right if you're building them?
Tomas Gogar (03:36):
It's a great question. Look, what we are seeing around is a tremendous hype, and I believe that the hype comes from the fact that you can build demos extremely quickly, right? Like really good-looking demos. You can take an agent
(03:59):
framework and build an agent with an LLM deciding and planning and doing a lot of things, and it looks extremely cool. However, the challenge is that it looks cool in a demo, but what we are seeing is that usually, when you try to deploy it in
(04:21):
real-life production scenarios, there are a lot of situations where it doesn't work well. And the challenge is that usually, with the current state of LLMs, if you let the LLM drive the workflow of that
(04:42):
agent, it's unpredictable, right? Because of the nature of how LLMs are designed, how they work, they are non-deterministic and they can hallucinate. So the challenge is that it looks cool, usually works well in 70% of cases, and it can have disastrous results in the other 30%.
(05:03):
And the challenge is that usually you don't know which 30 percent will go wrong. So what we are seeing around on LinkedIn and on social media are those pre-recorded demos, because somebody just recorded the scenario where it went well, or cherry-picked situations where
(05:24):
it actually works. But for enterprise readiness, it needs to be reliable in the job in which you want to use the agent. And that reliability comes from a couple of properties.
(05:46):
For example, the agents need to understand the limits of their knowledge. If you are an employee at work and you don't know, you are not certain, you have a way to escalate to your boss, to ask your colleagues. But in order to escalate, you need to understand the limits of
(06:07):
your knowledge. And that's where LLMs are tremendously bad, because they are like those students that always try to answer. I think understanding the limits of the knowledge is one of the things. Then, obviously, you need to have certain types of
(06:29):
guardrails for that agent to execute. And we know those guardrails from the standard corporate world. Like, I'm the CEO of a company and I cannot go and increase my salary tomorrow on my own. I don't have access to that system, and I
(06:50):
probably will need some approvals from, let's say, my board members, right? So there is governance. I'm a CEO and co-founder of the company, and I cannot access the data of our customers, for very good reasons. So there are limits to us as humans, and our agents need them as well. And we need to build a system where AI agents have those
(07:15):
limits and guardrails around them as well. So that could be the second one. And the third one is really just having the expertise for the task that you are doing. And the expertise needs to be factual as well as procedural: how it is done in my company. Employees know this, so the AI
(07:38):
agents need to copy the human employees, I believe, with all their skills, before they are really enterprise grade.
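To make those three properties concrete, here is a minimal Python sketch, with every name, threshold, and procedure invented for illustration (this is not Rossum's implementation): the agent escalates when a calibrated confidence score is low, can only perform whitelisted actions, and acts only when a documented procedure exists.

```python
from dataclasses import dataclass

CONFIDENCE_THRESHOLD = 0.9  # assumed value; in practice tuned per task

# Guardrail: the only actions this agent is permitted to execute.
ALLOWED_ACTIONS = {"extract_invoice_fields", "match_purchase_order"}
# Expertise: standard operating procedures, factual and procedural.
SOP = {"extract_invoice_fields": "follow procedure INV-7"}

@dataclass
class Decision:
    action: str
    confidence: float  # assumes the model exposes a calibrated score

def act(decision: Decision) -> str:
    # Property 1: know the limits of your knowledge; escalate when unsure.
    if decision.confidence < CONFIDENCE_THRESHOLD:
        return f"escalate_to_human({decision.action!r})"
    # Property 2: guardrails; unlisted actions simply cannot be performed.
    if decision.action not in ALLOWED_ACTIONS:
        raise PermissionError(f"no rights to perform {decision.action!r}")
    # Property 3: expertise; act only with the documented procedure in hand.
    procedure = SOP.get(decision.action)
    if procedure is None:
        return f"escalate_to_human({decision.action!r})"  # no SOP on file
    return f"execute({decision.action!r}) per {procedure!r}"

print(act(Decision("extract_invoice_fields", 0.97)))  # executes per SOP
print(act(Decision("extract_invoice_fields", 0.42)))  # escalates to a human
```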
Andreas Welsch (07:48):
I have a question there, because I'm really curious. For a long time we said, let's not compare AI and humans, let's not anthropomorphize. And now, with agents, I also go into this mode a lot of times and say people in a business are expected to do X, Y, Z. And like I said, they're
(08:08):
They're expected to know what they know and figure out what are the things they don't know and where they need help. They need to read the code of conduct. They need to know things like IFRS and other accounting standards. Agents will eventually need to have some kind of knowledge, or similar knowledge, too. So, comparing humans and AI, is
(08:31):
that helpful or is that harmful in some ways?
Tomas Gogar (08:36):
I think that building this one-to-one relationship, mapping them like that: I think that to really deploy them at scale in corporate structures, it's actually helpful if we model an AI agent as a
(08:59):
replacement for a human. Okay? But there might be different setups. For example, imagine if I have an AI agent that needs to be supervised because of, let's say, a lack of confidence skills. I can still use an AI agent as a personal assistant
(09:21):
to every employee. That's fine. So we can suddenly say, hey, every employee has an AI assistant. Good. But whenever we try to have those AI agents act autonomously, because that's what I believe is the holy grail, the best way to plug them into a corporate structure is actually to have them as a virtual employee.
(09:46):
That allows you, you will not change your corporate structure just because of AI agents. You need to plug them in somewhere, let's say in certain processes. And the best way to do that is in this way: as a replacement, or as a widening of your team.
Andreas Welsch (10:00):
Yes, that makes a lot of sense. That resonates. So we have these familiar structures of how teams and organizations and companies are set up, and it helps us use similar concepts now with these pieces of software that can do more than just follow
(10:20):
basic rules. Now, there was something else that you mentioned: we need to teach them, or tell them, what the limits of their knowledge are, and put guardrails in. How do you do that? Many of us have probably seen these basic prompts: you're a helpful customer service agent, and you do X, Y,
Z and, if you don't know, thenyou escalate.
Is that how you do it?
Or how, do you put theseguardrails in, place?
What does it mean?
So
Tomas Gogar (10:50):
I hear you. Some guardrails are simply, I don't want to say physical, because they are actually virtual, but: I simply cannot do certain things. I don't have the access rights to this file. I don't have user rights to do something in my SAP instance,
(11:11):
right? So, very similar guardrails to human employees, but that needs to be enforced on the, let's say, upper layer, the API layer that the agent is using, right? However, what I was talking about before is that the limit of my knowledge comes from the ability to estimate my own confidence in
(11:33):
my decision, and that is something that cannot be solved by the prompt. It can be solved, and we've proved that many times here at Rossum. But for that, we need to change how LLMs are actually working, in
(11:57):
the next generation of gen AI. Because the design of LLMs simply doesn't allow you to come up with something like calibrated confidence scores, if I try to explain it in technical terms. It just tries to generate the answer, but doesn't assess its confidence in the answer. So if you try to prompt it, tell me how confident you are, it will not work. It's not what I'm talking about.
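A minimal sketch of the first kind of guardrail Tomas describes, the access-rights kind, enforced at the API layer rather than in the prompt. All identifiers here are hypothetical, and the dispatcher is a stand-in for whatever system the agent actually calls:

```python
# Hypothetical rights table: which tools each agent identity may call.
AGENT_RIGHTS = {"ap-clerk-agent": {"read_invoice", "post_draft_entry"}}

def dispatch(tool: str, payload: dict) -> str:
    # Stand-in for the real API call behind the tool layer.
    return f"called {tool} with {payload}"

def call_tool(agent_id: str, tool: str, payload: dict) -> str:
    granted = AGENT_RIGHTS.get(agent_id, set())
    if tool not in granted:
        # Enforced in code, like a user-rights check in SAP: no prompt
        # wording can route the agent around this branch.
        raise PermissionError(f"{agent_id} has no right to call {tool}")
    return dispatch(tool, payload)

print(call_tool("ap-clerk-agent", "read_invoice", {"id": 42}))
```

The second kind, a calibrated confidence score, cannot be added this way; as Tomas notes, it requires changing the model itself.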
Andreas Welsch (12:27):
Awesome. Thank you for sharing that. I think that makes it a lot more tangible, seeing that it does come back to basic principles of security, of architecture and design: you simply don't have access to this data. Duh. Sounds obvious, but when you talk about needing to have guardrails in place, or needing to tell it what to do and what
(12:51):
not to do, I think then it's more nebulous. So, great to see that it can be as simple and as tangible as that. And by the way, I'm looking at the chat, where folks are joining from Luxembourg, from Dubai, from many other places. So excited to have you with us. Also, if you have a question for Tomas, please put it in the chat and we'll pick it up in a minute or two.
(13:11):
That's one of the benefits of having this as a live stream: engaging with both the guest and the audience. Now, we've talked about agents being similar to employees, at a minimum on a conceptual level. As we think about these new capabilities, what does that
(13:32):
actually mean? And what does it look like on a day-to-day basis, from maybe where we are today to where you are seeing things go in the next couple of quarters, or the next year or two years? How will people work with these technologies?
Tomas Gogar (13:49):
Every company defines it differently. We at Rossum define an AI agent as a virtual colleague you can delegate work to. Okay, very simple definition. And you typically delegate the work to a colleague using natural language, or,
(14:09):
in more structured processes, by sharing standard operating procedures and telling them: hey, this is how this particular job is being done here. So you delegate the work, and you expect them to do their job end to end.
(14:29):
And when they are not confident enough, you expect them to escalate to their manager. I do believe, and we are seeing it in our deployments, that you will have corporate structures where you have a manager, and that manager has, let's say, some direct reports, which can
be either human or AI agents.
And and the way it can work isthat the AI agents will do their
part of the job.
It can be anything in our caseat Frost, for example processing
some, paperwork tasks.
And when they are not confidentenough, they, let's say in the
(15:17):
first instance, escalate totheir human colleague, to their
human peer, right?
So you can imagine a clerk.
That an AI clerk that is doingsome, paperwork job and if not
confident enough, it escalatesto human clerk.
And if the human clerk is notenough, then they escalate to,
(15:41):
to, to their manager.
And if the manager is notconfident enough, they can
escalate even higher.
So I do see them working side byside as, and as an outcome of
that, you will have.
You'll have smaller teams thatcan actually handle more work,
as a team.
(16:01):
But, I do see that in thefuture, in the close future
couple of quarter or a year,it'll be more like a mixed team.
That's what we are seeing in ourdeployments, let's say that the
AI is able to handle 60, 70% ofthe cases.
(16:23):
AI is able to handle 60, 70% of the cases on its own. If it can't, it escalates to human colleagues. It learns from them as well. And there are cases when even the human colleague doesn't know, and then they escalate to the manager. So I see them working side by side. Yeah.
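The escalation chain Tomas describes maps naturally onto a chain-of-responsibility pattern. A sketch under assumed names, with a placeholder confidence score standing in for the calibrated estimate he mentions:

```python
from typing import Callable, Optional

def score(task: str) -> float:
    # Placeholder for a calibrated confidence estimate; plain LLMs do not
    # provide one out of the box, which is exactly the hard part.
    return 0.8

def ai_clerk(task: str) -> Optional[str]:
    # Handles the task only when confident enough, otherwise passes it on.
    return f"AI clerk handled {task!r}" if score(task) >= 0.9 else None

def human_clerk(task: str) -> Optional[str]:
    # Placeholder: in practice this queues the task for a person, who may
    # resolve it or escalate it further.
    return None

def manager(task: str) -> Optional[str]:
    return f"manager decided {task!r}"  # final authority in this sketch

CHAIN: list[Callable[[str], Optional[str]]] = [ai_clerk, human_clerk, manager]

def handle(task: str) -> str:
    for tier in CHAIN:
        result = tier(task)
        if result is not None:
            return result
    raise RuntimeError("no tier resolved the task")

print(handle("classify supplier invoice #1017"))  # falls through to manager
```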
Andreas Welsch (16:38):
Now, how do you see that working across different departments? So maybe you have a sales order that needs to be scanned, or you have a production order, or you have a bill of lading, in different departments. How does that work? Are agents going to collaborate with each other, maybe even
(16:59):
on the same platform, if they know: I have an agent here for sales, and I have one for manufacturing and planning? Do you envision them collaborating and figuring things out?
Tomas Gogar (17:11):
Yeah, sure. I think it's needed, because I think the premise is the expertise, right? Like, I'm an agent that does X, and part of my job description might be: if something happens, I need to involve some other team members. And I think that the basic premise should be that I
(17:33):
don't know who it'll be. Will it be a human or another agent? I don't know. And that's the way to build it, because that's the way you can gradually scale it. I think the analogy to self-driving cars works pretty well here. We could build self-driving cars by rebuilding the whole
(17:56):
infrastructure, changing our roads, changing our crossroads, but that's a huge infrastructural change that would need to happen overnight to really have an impact. Impossible. So instead of that, we use intelligent agents, intelligent cars, that can use the existing infrastructure and can live next to the cars driven by
(18:18):
humans. And you don't care, because we might be driving behind each other on the highway and one of us might be a self-driving car, but the other one doesn't know. So I think that this is the way you can gradually deploy. And the self-driving car analogy works here as well,
(18:39):
because first it works on the highway or during parking, and later on it works in a small Italian town where driving is extremely difficult. So I think it's needed for gradual deployment to kind of copy the corporate structure, and not assume: if the job description involves communicating with somebody, internally or externally, you cannot assume whether it'll be a human or another AI agent pretending and communicating like you.
Andreas Welsch (19:15):
I really like that analogy to self-driving cars and the similar conditions that exist, right? As long as they're as safe as, or maybe even safer than, a human driver, and meet the standards that authorities and departments of transportation and so on set. So similarly, right, if we know that agents will behave according to certain
(19:36):
rules, that they don't go crazy or break out of their little box, right, that analogy works really well. One other thing that I'm wondering there is: do we need, like, a second address book for agents? Today, if I'm in a business and I'm stuck,
(19:58):
I know I need somebody in procurement. Maybe I'm new to the organization. It's a larger company. I don't know everybody personally. I can ask around. I can say: who should I talk to for this procurement question? Maybe they point me to the right person, or they point me to a generic inbox, or to submit a ticket or something like that. But eventually I get to a person, and next time I have a question, I call this person directly.
(20:18):
I send them an email and say: hey, you were able to help me last time, can you help me again? Do we need something like that for agents too? Kind of like a registry, a register, or like an address book, to know this agent can work with that one, but don't use this, that sort of thing. There are some protocols and standards being proposed by some of the vendors.
(20:39):
How do you see that working, so that agents even know that there are other agents surrounding them?
Tomas Gogar (20:49):
I think the only difference to humans is that agents will be better at using APIs. It's unnatural for us. But at the same time, I do believe that they will also need to be able to use the channels that humans are using, exactly because of this coexistence.
(21:11):
There will be a phase where we will coexist next to each other. We see it in the paperwork jobs. So, obviously, it's good at calling APIs, and for the APIs it probably needs an address book.
But for non API ishcommunication and collaboration,
it needs to have, userinterfaces and address books
like humans do.
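What such an address book could look like in its simplest form: a registry that maps a capability to an endpoint, deliberately leaving open whether a human or an agent answers. Everything here, names and endpoints included, is an invented example rather than any proposed standard:

```python
# Hypothetical registry: capability -> endpoint. The caller does not need
# to know whether a human inbox or an agent API sits behind the entry.
REGISTRY = {
    "procurement-questions": "mailto:procurement@example.com",
    "invoice-processing": "https://api.example.com/agents/ap-clerk",
}

def lookup(capability: str) -> str:
    endpoint = REGISTRY.get(capability)
    if endpoint is None:
        raise KeyError(f"nobody is registered for {capability!r}")
    return endpoint

print(lookup("procurement-questions"))  # route the question, human or not
```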
Andreas Welsch (21:47):
Yeah, that makes sense. Thanks for sharing. I'm looking at the chat and I see Karina put two messages there. One saying: hey, agents don't think or understand their tools for specific tasks, and when we know the limitations of these tools, we can develop safeguards and eliminate them. So, validation against trusted databases, methods for
(22:09):
detecting when an LLM is operating outside its training domain, and so on. How do you approach that at Rossum? I think security was one point that you mentioned, but are there maybe others?
Tomas Gogar (22:20):
Yeah, look, guardrails are easy, right? The physical guardrails: you don't have access to this, you have access to that. That's easy. But we at Rossum, because we are narrowly focused on a specific type of supply chain paperwork, we've built a different type of LLM.
(22:41):
We call it TLLM, a transactional large language model that is trained specifically to process transactional paperwork. Because it's such a narrow domain, we were able to build it in a way that it can actually assess its own confidence reliably.
(23:05):
confidence and then it's notconfident enough it has the
humans.
So that's a big safeguard, likea big one.
We are now, we are now releasingfunctionality that is
specifically designed like areasoning functionality, which
is able to intelligentlyinterpret data on the documents,
(23:30):
but again, with the ability to assess its confidence. But this is not easy, and we are able to do that just because our domain is fairly narrow. It's not currently known in the research community how to do that in a generic gen AI setup.
(23:52):
Our paperwork task is simpler, and that's why it allows us to build a type of AI that is able to do that. But in traditional LLMs, currently, nobody has that. And I think that's the big next holy grail to chase,
(24:14):
if we really want to live in that dream where we have truly autonomous AI agents that are also very generic, generalistic.
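Rossum doesn't disclose in this episode how the TLLM computes its confidence, so the sketch below shows one generic, commonly used approximation instead: aggregating token log-probabilities for an extracted field and routing to a human below a threshold. Whether such scores are actually well calibrated is exactly the open problem Tomas points to:

```python
import math

def field_confidence(token_logprobs: list[float]) -> float:
    # Geometric mean of the token probabilities behind one extracted value.
    return math.exp(sum(token_logprobs) / len(token_logprobs))

def route(field: str, token_logprobs: list[float], threshold: float = 0.95) -> str:
    # Below the (assumed) threshold, the document goes to a human reviewer.
    if field_confidence(token_logprobs) < threshold:
        return f"review_queue({field!r})"
    return f"auto_post({field!r})"

print(route("invoice_total", [-0.01, -0.02, -0.005]))  # confident: auto_post
print(route("invoice_total", [-0.4, -0.9, -0.2]))      # uncertain: review_queue
```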
Andreas Welsch (24:24):
No, that's exciting. I love how you described that, right? Because you're staying focused, you're able to do that on a very specific task and do it really well. Now, we're getting close to the end of the show, and I was wondering if you could summarize the three key takeaways for our audience from today's episode.
Tomas Gogar (24:43):
The three main takeaways. Number one: just be a bit skeptical of the overhyped AI demos that you see out there. A demo, especially in the current age of gen AI, can be very far from the real production usage.
(25:03):
Number two: if you are deploying these AI agents, choose wisely where you are deploying them, either in a setup where you will have human supervision, or in a setup where the AI provider can guarantee the sense of confidence before you really let it run autonomously.
(25:29):
And I think the third one, and that goes to the AI vendors: let's keep innovating the fundamentals of AI. Let's not just rely on the current LLMs, because there is so much they can't do and will not be able to do.
(25:50):
True.
Andreas Welsch (25:51):
Awesome. I love that. Maybe before we really wrap up, if we're good, I see there's one more question in the chat, around whether in this agentic AI era we will still need to have a graphical user interface, or how that will work. Or are we going back to MS-DOS and putting our commands in a command prompt? Where do you see that going?
Tomas Gogar (26:12):
I think the answer is both. The reality is that there will be a lot of vendors that will be fighting this change and will close their APIs and so on. So we will probably have both the API way as well as the user interface way.
Andreas Welsch (26:27):
Alright, Tomas, thank you so much for your insights. It was a pleasure having you on and hearing and learning about how you are approaching this, where you're seeing the opportunities with agentic AI, and also how you make sure that it stays relevant and responsible as well. So, thank you so much.
Tomas Gogar (26:46):
Thank you, Andreas.