
April 28, 2025 44 mins

In this episode of CISO Tradecraft, host G Mark Hardy and guest Sounil Yu delve into the double-edged sword of implementing Microsoft 365 Copilot in enterprises. While this productivity tool has transformative potential, it introduces significant oversharing risks that can be mitigated with the right strategies. Discover how Sounil and his team at Knostic have been tackling these challenges for over a year, presenting innovative solutions to ensure both productivity and security. They discuss the importance of 'need to know' principles and knowledge segmentation, providing insight into how organizations can harness the power of Microsoft 365 Copilot safely and effectively. Tune in to learn how to avoid becoming the 'department of no' and start being the 'department of know.'

Transcripts https://docs.google.com/document/d/1CT9HXdDmKojuXzWTbNYUE4Kgp_D64GyB

Knostic's Website - https://www.knostic.ai/solution-brief-request 

Chapters

  • 00:00 Introduction to Microsoft Copilot Risks
  • 00:32 Meet the Guest: Sounil Yu
  • 02:51 Understanding Microsoft 365 Copilot
  • 06:09 The DIKW Pyramid and Knowledge Management
  • 08:34 Challenges of Data Permissions and Oversharing
  • 19:01 Need to Know: A New Approach to Access Control
  • 35:10 Measuring and Mitigating Risks with Copilot
  • 39:46 Conclusion and Next Steps

Episode Transcript

Available transcripts are automatically generated. Complete accuracy is not guaranteed.
(00:00):
Hey, if you're implementing Microsoft Copilot, it's a really great productivity tool, but you might be introducing some risks that are unnecessary. Make sure you listen to this episode and we're gonna tell you how to address them.

(00:23):
Hello and welcome to another episode of CISO Tradecraft, the podcast that provides you with the information, knowledge, and wisdom to be a more effective cybersecurity leader. My name is G Mark Hardy. I'm your host for today, and I have Sounil Yu, who has been on our show at least twice before, and I'm welcoming back to the show, Sounil. We're gonna be talking about something other than the Cyber Defense Matrix. We're gonna be talking about Microsoft Copilot and security

(00:45):
and some of the issues. So welcome to the show.
Thanks for having me again, G Mark.
It's great to have you here. Now, for those who aren't familiar with Sounil Yu, he's been around for quite a while. He was one of the mentors at Mach37 when I was doing my startup in 2014, and it's been a while now. So we've known each other for at least those 11 years.

(01:06):
And during that time he's made a lot of substantive contributions to the industry, including your startup here, Knostic. And I remember when I ran into you at Black Hat last year, they had a competition for startups and things like that. And as I recall, they kinda liked you.
Yeah.

(01:26):
And we have the privilege of also being a part of the RSA Innovation Sandbox this year as well. We're recording this before the actual sandbox. Hopefully we'll have great news when this actually airs, but either way, just getting in is an accomplishment in and of itself.
Yeah, I thought you had a number one startup for Black Hat or something like that.

(01:47):
Wasn't that it?
Yeah. Yeah, it was, we won basically their version of Innovation Sandbox.
Got it. Okay. So obviously there's more to this than just Sounil with another great idea. A lot of people looked at it and said, yeah, he's onto something, or at least you and your team are onto something. So essentially what we're gonna be talking about today is the issue, as I said in the teaser, about Microsoft Copilot.

(02:07):
It's a great tool. It does a lot of amazing things, but it does potentially introduce a number of risks to the enterprise that can be mitigated if you know what you're doing. And Sounil and his team over at Knostic have been looking at this for well over a year now and have come up with what I think are some pretty amazing ideas and some really wonderful opportunities for us as security professionals

(02:30):
to address both the issue of productivity and capability, to say, hey, this is okay, but here's how we can go ahead and put in some security on the downside. Again, as we always say in security, you don't wanna be the department of no, you wanna be the department of how, and this is gonna be one of those episodes that shows you how to do it right.

(02:51):
So for those of us who are not totally familiar with Microsoft Copilot, or we think we know about it but we're not quite sure what it really is, tell me a little bit about Copilot. What does it do, and does it integrate into Microsoft 365? And why would somebody implement it in the first place?
So the first problem is that Microsoft is using the word Copilot for just about everything that is out there.

(03:13):
So there's some confusion around which Copilot we're referring to. I definitely can appreciate that confusion, because for a while I was confused as to what to even call it. But the particular Copilot that we're talking about is called Microsoft 365 Copilot. That's different than Microsoft 365 Copilot Chat.

(03:36):
Thank you, Microsoft. And the main difference is one is grounded upon your internal enterprise data that's stored on things like SharePoint and OneDrive. The other one, the one that's called Chat, is, I think, the ChatGPT Plus subscription service that you can get at ChatGPT, but hosted within Microsoft,

(04:00):
and it's basically part of your Microsoft 365 subscription. So anyway, the one that we're talking about as a potential concern is the Microsoft 365 Copilot that is grounded upon your own enterprise data. So instead of just going to Copilot and saying, hey, what's

(04:20):
the weather gonna be like, or do my high school research project for me based upon an online encyclopedia, we're talking about Copilot now integrating with our organizational knowledge base, to be able to ask questions about past client information, our product development, internal documents, pretty much anything that we've got.

(04:42):
Where does it go look? What are the sources of this type of Microsoft 365 Copilot, if I've got it named correctly?
Right, so yeah, you hit upon one of the key reasons why people are rolling out these tools. It's really tapping into the knowledge base that we have that is manifested through all the files, all the SharePoint sites, all the things that

(05:04):
we've accumulated within the various storage mechanisms that we have that Microsoft is now hosting for us. And Microsoft 365 Copilot taps into that knowledge base and does retrieval of that knowledge. Think of it basically like this: we all were enamored with ChatGPT when it first came out, but think

(05:27):
about it again, anchored on yourown data, your own enterprise
content, not just the, web content.
and so the, opportunity for usto quickly find institutional
knowledge is what is acceleratedand what, we are empowered with.
And to put it a put it in a slightlydifferent spin, consider that a lot of
us are knowledge workers well, what helpsknowledge workers become more productive?

(05:52):
You give them knowledge, or you give them access to knowledge, or accelerate their access to knowledge. And fundamentally, that's what Microsoft 365 Copilot does. It accelerates our ability to quickly tap into the knowledge base that exists within our enterprise already.
Now, as I recall, a diagram that you had, which I really liked, and I've used a static one, but you've made it an active one, and for a lot of

(06:16):
the people listening online, I'm just gonna describe a pyramid with four different layers. At the bottom layer, the base, is data, then information, then knowledge, and then wisdom. And what we're talking about here is at the top of that pyramid, the wisdom, the ability to make quality decisions with a sufficient amount of input, where we take into account the relevant data

(06:40):
that works its way up through there. So as we find out, you could have a huge amount of data, but a relatively small number of wisdom choices that you make. But those are really important. Now, one of the thoughts that some people have said is that, hey, what we wanna do is restrict access to certain pieces of data or information at the lower level, for the people who don't need it.

(07:01):
So Bobby the intern, for example, should not be able to go ahead and say, give me the Q2 and the Q3 financial projections, and tell me all about the potential mergers and acquisitions we have coming along. But the CEO ought to be able to do that. And so one possibility suggests that, hey, let's just go ahead and narrow the amount of information that's available to Bobby the intern.

(07:22):
That means that the knowledge and wisdom that's gonna obtain there is gonna be minuscule as compared to having the entire dataset and then applying, if you will, a shim or a filter to say, the CEO is allowed to ask this question; Bobby the intern, not so much. Now, am I describing what is the potential issue here, or is there

(07:42):
a better way to look at it? 'Cause of course it was your diagram when I first saw it a couple months ago at CruiseCon.
Yeah, actually, just to be clear, it's not my diagram as much as it's my application of an existing diagram, and the existing diagram is called the DIKW pyramid. And, G Mark, you went through and explained what each of the layers are.

(08:04):
And the perspective here is, the DIKW pyramid gives us a way to understand some of the challenge spaces that we'll face with AI itself, but it will also give us a perspective on how to tackle it as well. The first thing to really recognize with the DIKW pyramid is that we have a new abstraction layer.

(08:26):
And that abstraction layer, at the knowledge layer, is where we have, again, knowledge-centric tools for knowledge workers. But if we see an oversharing problem at that knowledge level, our general tendency is to revert to using tools at the data and information layer. What are those tools? They include, for example, fixing permissions or

(08:49):
trying to classify our data. Think about all the things that we currently try to do around data governance. Those are the activities that are trying to fix the problem at these lower layers of the pyramid. Unfortunately, it has the negative side effect that, G Mark, you mentioned, which is, if I'm squeezing the bottom part of this pyramid, the top part also gets squeezed, which basically means that you'll end up with a dumb Copilot.

(09:13):
Okay, and we've seen this actually play out already several times. Microsoft is trying to help address some of these problems. And by the way, I want to calibrate something real quickly. The way that Copilot works is it adheres to the permission boundaries of each user. So the perspective here from Microsoft's view is, look, if you

(09:37):
have an oversharing problem, it's not because Copilot's oversharing, it's because you already have a data permission problem to start with. You've given people too much permission to too many things, or you haven't really locked down your environment.
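To illustrate the point Sounil is making, here is a minimal Python sketch of permission-trimmed retrieval: the assistant only surfaces documents the requesting user could already open themselves, so it inherits whatever over-permissioning already exists. The document store, group names, and function are invented for illustration and are not Microsoft's actual API.

```python
# Hypothetical document store: each document carries an access-control
# list of principals. These records are illustrative only.
DOCUMENTS = {
    "q3-financials.xlsx": {"acl": {"ceo", "finance"}, "text": "Q3 projections..."},
    "team-picnic.docx":   {"acl": {"everyone"},       "text": "Picnic is Friday..."},
}

def retrieve_for_user(query: str, user_groups: set[str]) -> list[str]:
    """Return only documents the user could already open themselves.

    This mirrors the episode's point: the assistant doesn't grant new
    access, it inherits existing permissions (and over-permissions).
    """
    hits = []
    for name, doc in DOCUMENTS.items():
        visible = bool(doc["acl"] & user_groups) or "everyone" in doc["acl"]
        if visible and query.lower() in doc["text"].lower():
            hits.append(name)
    return hits
```

A user in the `finance` group can retrieve the financials; an intern cannot, but an intern still sees anything shared with `everyone`, including files that landed there by mistake.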
So let's consider locking it down, right?
Let's see what happens when we lock it down. When you turn on tools like SharePoint Restricted Search, for

(09:58):
example, SharePoint Restricted Search turns SharePoint into unSharePoint. SharePoint, by default, is meant to share. So by default it's an allow-by-default, deny-by-exception sort of view. SharePoint Restricted Search flips that on its head. It's deny-by-default, allow-by-exception.
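That flip can be sketched in a few lines of Python. The function names and policy lists here are hypothetical; they just show the two postures side by side:

```python
# Normal SharePoint search posture: everything is findable
# unless a site has been explicitly excluded.
def searchable_default_allow(site: str, deny_list: set[str]) -> bool:
    return site not in deny_list

# Restricted search posture: nothing is findable unless a site
# has been explicitly curated onto an allow list.
def searchable_default_deny(site: str, allow_list: set[str]) -> bool:
    return site in allow_list
```

With an empty allow list, the second function returns False for every site, which is the "unSharePoint" effect described above.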

(10:21):
But imagine trying to find anything in SharePoint if you can't search for anything. Okay, at that point, why call it SharePoint anymore? It's not really doing its job. Let's now put Copilot on top of that. And unfortunately, when you put, again, a shrunken data and information base into

(10:41):
something like Copilot, you just end up, again, with not a very useful Copilot. And remember the purpose that people had in trying to roll out tools like Copilot: to arm knowledge workers with more knowledge. The biggest problem that we have in most enterprises, it's not a security problem. It's not an oversharing problem. It's actually an undersharing problem.
We don't share enough within our environment to help workers

(11:06):
be as productive as they can be. But it's also because we try to address the oversharing problem at the same time. And the perspective I have is that what we discovered is that trying to fix the security problem actually exacerbates the undersharing problem that we see as well.

(11:26):
So what we have then is really a Three Bears model: too much, too little, and just right, in terms of the amount of sharing. Which is a good idea, because the pitch here for doing this 365 Copilot is what? More efficiency, better productivity. Now, one of the things that I know it can do is it can go back and look through your old emails. For example, if I said, hey, I haven't talked to Sounil

(11:50):
for a year and I want to pick up where I left off, it could go through, find everything that we talked about all the way back to however far back I go, and give me a little summary saying, hey, his dog's name is this and ..., great.
Now I can go ahead and I've got the old Mackay 66, or whatever the number was, with regard to all the things you're

(12:10):
supposed to know about somebody. But this is now retrieved within a matter of seconds, not digging through and trying to go ahead and say, yes, there are more emails on your server, click here and wait for them. Also, as you mentioned, SharePoint. I went through a SharePoint deployment a couple years ago, and it required a lot of meticulous work: getting every file over there, putting it into

(12:31):
the right bucket for the right client, getting the permissions right, so that we knew that people who were working on project A were not gonna be looking at files in project B, and vice versa. But sometimes people are on multiple project files, and we spent a ton of time doing that. That said, of course, once you have everything nice and shiny and laid out, ready to go, as soon as you click go,

(12:52):
you start to migrate away from that perfection model, and hence the potential issue that we see of the oversharing rather than the undersharing. A traditional DLP, or data loss prevention, is designed to do what? At the simplest level, it's what we call a dirty word list. It was trying to say, you have a project Green Frog, and you

(13:15):
say, hey, you don't often talk about that, or make it Purple Frog, because you don't really see those. And so the phrase Purple Frog should never show up in ordinary email or conversation. If it does, it's probably a document marking that says, hey, this is sensitive. You get the word sensitive or confidential, you get false positives all over the place. Hence the importance of marking. But we can't depend on these tools necessarily.
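A minimal sketch of the "dirty word list" DLP just described: flag any message containing a code word or a document marking. The patterns are illustrative; real DLP engines are far more elaborate, but the false-positive behavior shown in the usage note is exactly the problem mentioned above.

```python
import re

# Illustrative dirty-word list: a project code name plus common
# document markings. Real DLP policies are much richer than this.
DIRTY_WORDS = [r"\bpurple\s+frog\b", r"\bconfidential\b", r"\bsensitive\b"]
PATTERN = re.compile("|".join(DIRTY_WORDS), re.IGNORECASE)

def flag(message: str) -> bool:
    """Return True if the message trips the dirty-word list."""
    return PATTERN.search(message) is not None
```

`flag("Purple Frog status update")` trips the rule as intended, but so does an innocuous sentence like "I'm sensitive to criticism" — the false positives the hosts are talking about.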

(13:38):
Why? Because humans are involved, and, well, humans make mistakes. They get careless, they take shortcuts. Hey, I will get back to that. And then the world gets busy and you never ever do that. So if we look at those potential challenges with regard to something like that for productivity. And you mentioned the above and below. What about things such as data

(13:59):
governance questions, integrating with other tool sets other than just the Microsoft database? If someone has a Box or a Dropbox or some other third-party external store, does that all work together, or is Copilot just simply saying, we're gonna look at your email and I'm gonna look at your SharePoint, and that's about it?
There are tools like Glean, for example, that touch upon more than just the

(14:22):
Microsoft 365 universe that you might be in, and we should expect to see that with other LLM-based search applications as well, whether that's ChatGPT Enterprise or Perplexity Enterprise and so on and so forth. And the whole agent problem's gonna be pretty gnarly as well, because it's gonna cross many different boundaries.

(14:45):
So at the end of the day, it's a problem that is rapidly growing, and the old ways that we've done data governance simply have proven to not work. You mentioned that earlier, in terms of trying to organize everything into the right servers or right folders. Consider your own computer for a moment. How well organized is all your content, in the right folders, in the right

(15:08):
places? Unless you're, like, OCD, which I'm sure some of us are, more than likely I would guess that you just have a lot of content stored in various places, and you use search to help find what you really need to look for.
And that's just gonna be the same mode of operation

(15:29):
for us, even now, but going forward in the future as well. That is why these tools are so powerful. The large language model based search tools are so powerful because I no longer have to be precise in saying, look for Purple Frog. I can say, oh, what was that colorful organism that I was trying to find? And so I search for colorful organism and it

(15:52):
says, oh, did you mean Purple Frog? The ability for the large language models to make that association, without having to have precise regular expression matches, is really what makes this tool and capability very powerful.
But it also is what makes it potentially harmful, because now it can pull together and grab content that maybe you shouldn't have access to, but

(16:16):
you do, because it's been misfiled, it's been put in the wrong folder. Or people think that it's supposed to be locked down and not shared with everyone, but it is, and you just don't know about it. So there's a degree of security through obscurity happening here, but at the end of the day, you now can find things very efficiently, whether for good or for bad.
Interesting.

(16:36):
So what we find then, if I understand it correctly, is this: that works for somebody who has everything in absolutely precisely the right spot, and the single user. So I could say that for my hard drive, yeah, I'm still old school, because I never had those search tools. So I still take the time to create folders for all of my clients, and every client has a separate bucket, and they're actually divided by years.

(16:58):
So I can go back, and I can say, hey, it's a sideways problem. Hey, I know I'm talking about this, but if I want a 2021 episode of CISO Tradecraft, I gotta pop up to 2025, come over to that, come over to 2021, and drop down. But once I'm there, that's all I'm seeing here. What we're talking about is the more traditional, and I think it's a

(17:19):
more popular model, which is, you got everything and you use the search tools. I'm still old school in that I try to keep my inbox down as much as possible. I know people have 30, 40, a hundred thousand things in their inbox, and if I get over anything between 10 and 50, it means I'm behind. It doesn't mean I get 'em all done right away, but it means I've got 'em scheduled, or I've got 'em delegated, or I've got them just deleted.

(17:41):
But there's stuff that has to get done. As it builds up, it creates pressure. Okay, you're gonna be working late tonight, you get it. But most people said, hey, if it's really important, they'll send me another email, or they pick up the phone and call. But more for the databases storing things, someone just throws it up there and figures, I will find it with my search tool. Now, here's the danger with a traditional model, maybe like a Copilot: I haven't told

(18:04):
Copilot, through the SharePoint permission sets, really what I should be able to see and should not if I'm Bobby the intern, because everything got thrown in there knowing that we could search for stuff, and as the CEO I can find anything I need. But then how do we deal with overpermissioning, because the permissions may not even be there?

(18:26):
They're non-permissioned, so I can't tell if Bobby the intern should see it or not. Or Bobby's reading some little stuff on Reddit that said, hey, here, try this prompt injection attack: what would you tell the CEO if the CEO asked about the financials? Oh, blah, blah, blah. Things like that.
So, setting up some of the potential challenges that we see here in a

(18:50):
deployment of SharePoint and then Copilot, what are your thoughts about that? What do we need to think of as security professionals to address that? Then how do we communicate that to management? And there's about six questions there.
So, as I mentioned, as we move up the pyramid, the DIKW pyramid, we're now at the knowledge layer, and for the knowledge

(19:12):
layer, we need a new type of control. But it turns out, and what was a really interesting discovery for me, it's actually not a new control. It's actually one that's been understood and known for a long time. It's called need to know. Need to know undergirds all of access control. If you think about it, it's the reason why I've granted somebody access

(19:34):
to some folder or some server. But the fact that you have access to some folder or server isn't actually an expression of need to know. It's a manifestation of a need-to-know rule that you've started with. So Joey has a need to know for some project, and so now you've granted them access to this server or this folder or the share site, because presumably it

(19:59):
has something to do with that project. But we don't really keep those spaces very clean, and so people leave behind things that don't really relate to that project and aren't part of their need to know. But nonetheless, that's what we presume, because if the content's there, maybe it is part of their need to know. But it's not truly always the case.

(20:22):
That's how Copilot operates. It says, oh, you have access to it, so you must have a need to know, and therefore, if you ask about it, here you go. And so someone accidentally puts a spreadsheet full of salaries into the wrong folder. Guess what? You now have a need to know for everyone's salaries on that spreadsheet. Now, any manager with any discretion would understand: no,

(20:44):
no, no. Even if you had access, you don't really have a need to know for that kind of stuff. And here's the thing, though. Again, it's an old form of access control, need to know. But think about this: where is it written down? Where is that captured, the actual original manifestation of need to know, which is, Joey has a need to know for this particular project?

(21:06):
Yeah, based on their job function or whatever it might be.
Again, the artifact, the manifestation at this file level, isn't really an expression of need to know. It's just machine-level access that someone has, but it doesn't actually express need to know. And we've tried to reverse engineer this.
But what if you actually started with need to know to begin with? What if we started and said, what are the need-to-know boundaries of a particular

(21:28):
person, based on their job function, based on their role, whatever it might be? And to systematically articulate that, validate that, maintain that, and keep it up to date, so that we could then provide that to large language models and say, here's the boundary for each person based on their actual need to know. And that allows us to then say, okay,

(21:48):
regardless of what Copilot can gain access to, we can keep Copilot within the appropriate knowledge boundaries that a person has.
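As a toy illustration of that need-to-know shim between the user and the model: check the asked-about topic against a knowledge boundary derived from the person's role before the assistant answers. The roles, topics, and exact-match lookup are simplified assumptions for the sketch, not Knostic's actual implementation.

```python
# Hypothetical need-to-know boundaries per role. In practice these
# would be systematically derived, validated, and kept up to date.
NEED_TO_KNOW = {
    "ceo":    {"financial projections", "mergers", "salaries"},
    "intern": {"onboarding", "style guide"},
}

def within_boundary(role: str, topic: str) -> bool:
    return topic in NEED_TO_KNOW.get(role, set())

def answer(role: str, topic: str) -> str:
    if not within_boundary(role, topic):
        # Deny at the knowledge layer, regardless of whether some
        # misfiled document gives this role file-level access.
        return "That's outside your need-to-know boundary."
    return f"[assistant answer about {topic}]"
```

The key design point is that the check operates on the topic being asked about, not on any individual file's ACL, which is what lets it hold even when files are misplaced.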
Wow. So it's really looking at it from a different perspective: instead of trying to assign permissions down here and get everything right, we're looking at it from the top, from the perspective of a particular user, whether it's a CEO or Bobby the intern.

(22:12):
Using that as an example, if we define what that role is, then, using some sort of algorithmic, intuitive approach, whatever we wanna do, we can then do what for humans is pretty natural. Look, it's pretty natural to understand this need to know. Someone comes in and says, can I see that? No. But if the boss comes in, may I see that?

(22:32):
Yes, yes. Here you go, ma'am. Take it with you. And so what we're trying to do is replicate what for humans is pretty straightforward. We've lost that degree of control by entrusting everything to a set of mechanical rules and access controls that will, unfortunately, do exactly what we tell them to do, whether it's the right thing or the wrong thing.

(22:54):
And they'll do that. And then that brings up issues for things that are perhaps even dynamic. So what about, for example, a pharmaceutical company where they did research, and that could shift around based upon, hey, we've got a whole bunch of products baking, but something takes off? Thoughts on how you would explain that to somebody, about where the challenges are and then how a fix might apply in that type of an industry?

(23:17):
So there are a couple things I wanna unpack there. Okay. So first is, we in security tend to think of need to know from a very restrictive standpoint: look, you don't have a need to know, therefore I can't tell you. But think about the phrase need to know for a moment. Most people, actually, besides security people, can look

(23:37):
at it as a permissive thing. I have a need to know. I have a need to know for this because of my job function. Think of that as a permissive thing, not a restrictive thing. And we can look at it from a permissive standpoint versus a restrictive standpoint. So think of the restrictive standpoint as our traditional security

(23:58):
department of no: no, I can't tell you. But from a permissive standpoint, once I understand somebody's knowledge boundaries, I can become the department of know. How do I actually help individuals perform their job functions better and faster, more efficiently? Because I actually give them things that they don't currently

(24:19):
already have access to. Okay, but they should, because it's within their need-to-know boundary. Think about this for a moment. How often have you been told, hey, go pull this file, or go get this? You get a link from Google Workspace or from Microsoft Office, you click on it, and you're like, I don't have access. So now you gotta get access.

(24:40):
Okay, but what if the system already understood your need-to-know boundaries and said, okay, you don't have access, but let me actually still get it to you, or at least give you enough of a hint that it exists that you can then ask for access if you need to? It depends on one's risk tolerance, but at the end of the day, the ability for the system to understand your need-to-know boundaries and deliver content within your boundaries, that's actually quite

(25:03):
transformational for the business. That is what we actually truly want, because that addresses what I mentioned earlier as a significant problem in most enterprises, which is the undersharing problem.
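The permissive "department of know" flow Sounil describes, hint instead of hide, can be sketched like this. The decision table and wording are hypothetical illustrations, not Knostic's product behavior:

```python
def respond(has_acl_access: bool, in_need_to_know: bool, doc: str) -> str:
    """Combine file-level access with the need-to-know boundary."""
    if has_acl_access and in_need_to_know:
        return f"Here is {doc}."
    if in_need_to_know:
        # The undersharing fix: don't hide the document's existence
        # from someone whose job function entitles them to it.
        return f"{doc} exists and is relevant; request access from its owner."
    # Outside the boundary: don't even hint that it exists.
    return "Nothing to show."
```

Note the asymmetry: lacking an ACL entry produces a hint and a request path when the topic is within the boundary, while being outside the boundary produces silence regardless of ACLs.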
Yeah.
And one thing I remember from many years ago was taking a course from Professor Lance Hoffman at GW University. I think he's still around as a professor emeritus.

(25:24):
And we had Dorothy Denning's book, and I think in the last chapter, on cryptography, they talk about trackers. And the idea being that you might have a database, for example the US Census, where I can't say, tell me my neighbor's salary. No, we're not gonna tell you that, but you can query it. And then they say, if I can query, give me everybody on this street, then

(25:45):
get everybody who has this type of thing, and it turns out that the intersection of that only ends up being one particular entity, and you may be able to tease out data. And so that's really somebody trying to game the system.
But you're talking about a more fundamental issue, which is just, let's go ahead and apply the reasonable approach to say, what do you need to get your job

(26:06):
done, as compared to the no, which is, you can't get it unless you've been given a permission slip. Now, our laws, however, and we look at disclosure laws and breach notification laws, those are all based on, you can't get at that. You may or may not have needed it for your job, but if it gets loose somehow, you've triggered some particular offense.

(26:28):
So if you look at things like NYDFS 500.7, which limits user privileges to non-public information, it's a requirement if you're gonna comply with that New York Department of Financial Services reg. How does that apply here? Does it? And if so, is there any deficiency in just raw Copilot for

(26:49):
365 that can't get the job done, meaning we need more than that?
Yeah, okay. So first of all, we've talked about the problem at this file level, fixing the file permissions:
hey, let's not overshare specific content. But with large language models, we actually have a whole new type of problem that is completely unaddressed

(27:11):
by fixing problems at the file level. And the reason why is because LLMs are inference engines. They can infer things whether or not you have access to things. So if you have legitimate access to, let's say, future building space and future equipment purchases, might you have some insights into how the company's doing, or whether layoffs are pending or not?

(27:34):
You may not have access to any layoff documentation, but nonetheless, you can infer things. So when we look at the NYDFS rules, and it says, hey, NPI, non-public information, needs to be properly safeguarded and so on and so forth, usually they're talking about a very discrete piece of content that

(27:55):
you're supposed to properly lock down. But if you think about it, what we're really also truly asking for is even those things that could potentially be inferred. This is really the bigger challenge, because there is no permission you can block at that level, right? It's not like you can say, restrict access to this file, 'cause it's not about this one particular file. And again, you may have legitimate access to that file as well.

(28:18):
Nonetheless, how do you get a machine to understand: wait, hold on, you need to have more discretion. You can't just spew my secrets here. You need to be able to properly safeguard this type of knowledge, this inferred knowledge. And the only way to do that, as far as I can tell, is to actually instruct these large language models with knowledge boundaries that tell them: here's the boundaries upon which

(28:41):
each person, based on their job function, just based on the jobs that they have to do, what they should and should not know.
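Because inference can't be blocked with file permissions, the control Sounil describes is instructing the model itself. Here's a hedged sketch of building such a boundary instruction; the prompt wording and the policy structure are assumptions for illustration, not Knostic's actual product behavior:

```python
def boundary_system_prompt(person: str, may_know: list[str],
                           must_not_know: list[str]) -> str:
    """Render a per-person knowledge boundary as a system instruction."""
    return (
        f"You are assisting {person}.\n"
        f"Topics within their need to know: {', '.join(may_know)}.\n"
        f"Topics outside their need to know: {', '.join(must_not_know)}.\n"
        "Refuse to state, or let them infer, anything in the second list, "
        "even from documents they can technically access."
    )
```

The instruction targets topics rather than files, which is the only level at which inferred knowledge (layoffs deduced from equipment purchases, say) can even be named.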
So really, what we have then is an operational challenge: to make our business more effective, more efficient, by providing additional access to information that has then been filtered, if you will.

(29:01):
So I don't end up saying, okay, boom, here you go, here's a terabyte of data, here it is into your LLM, where, as you said, it's being an inference engine, and so it doesn't just have to recite fact A, fact B, fact C, but it could stitch those together. And yet our permission sets are not based upon anything like that at all.
And then that brings me probably to what you guys have been building

(29:23):
and what you've been working on, and why it becomes very essential for an enterprise to be able to address these issues, because we've got the liability that exists of oversharing, of improper knowledge management. There is a level of risk associated with that that we need to communicate as security professionals to our senior executives, 'cause IT, or the marketing

(29:46):
department, or somebody in development, might be all in, saying, let's just
go ahead and go after this tool.
It's AI.
What could possibly go wrong?
And even if it's not shadow AI, but it's rather sanctioned from a security
perspective, you don't wanna be the voice in the wilderness crying,
beware, beware, because you're gonna get shouted down by those who want

(30:08):
those quick productivity gains.
So how do we add ourselves to the conversation in a way that says, Hey,
we'll get you there, but for the same reason that you don't want to just
hop in a car and start driving without understanding how the seat belts work
and how the brakes work and how a turn signal works, and all this essential
safety equipment that's there to help you get to where you're going,

(30:33):
how do we then communicate that effectively?
And that, then, I think is the introduction of a tool like
you're building there at Knostic, to say: but wait, there's more.
Yeah,
first of all, I wanna be clear that the foundation of what we're
doing is systematically capturing, defining, and managing need to

(30:56):
know rules within the organization.
Think of it as similar to how we do network segmentation.
What we're doing is knowledge segmentation.
Knowledge segmentation, or network segmentation, if you think
about it, is also the foundation for something like Zero Trust.
So, I hate using the term zero trust,
in a way that

(31:16):
isn't quite correct, but in some respects we are.
What we are doing is creating the foundation, establishing a
framework, to enable AI systems to also adhere to a zero trust model as well.
Okay.
And in that context, what we're building is a foundation that
supports a lot of other things.

(31:37):
And those other things include, for example, quickly
doing knowledge discovery.
Okay?
Finding things that are overshared, but also empowering potentially
the next generation of a whole bunch of other security tools
that think at the knowledge level.
Knowledge loss prevention, not just data loss prevention.
Think, again, knowledge classification, not just data classification.

(31:57):
So this whole perspective of moving up to this next tier
is what we're building out.
But the immediate use case that people gravitate towards is that oversharing
problem, because they're trying to roll out tools like copilot and they
don't wanna be the security guy that basically says, sorry, I can't
let the business move forward adopting a tool like AI systems,

(32:20):
'cause I'm concerned about security problems.
You can only hold the train back so long.
In fact, for many organizations, they really can't.
And so what you have to do is you basically have to go along for the ride.
But how can you go along for the ride in a way that helps you quickly understand
and assess your risks very proactively?
And in our view, the best way to do it is to do an end-to-end test.

(32:42):
And the end-to-end test here isn't saying, oh, make sure all your permissions
are correct, but rather: let's go and just fire away a bunch of sensitive
topics at copilot, see what it returns, analyze the results, and then
overlay our need-to-know rules on top of that to see what is overshared.
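The end-to-end test described here can be sketched as a small harness. Everything in this sketch is illustrative: `ask_copilot` is a hypothetical stand-in for however you actually query Copilot (an API call, UI automation, or a vendor tool), and the topics and keyword checks are placeholders, not a real detection method.

```python
# Sketch of an end-to-end oversharing probe. Assumes a hypothetical
# ask_copilot(prompt, as_user) helper that returns Copilot's answer text.

SENSITIVE_TOPICS = {
    "salary information": ["salary", "compensation band"],
    "unresolved legal disputes": ["litigation", "settlement"],
    "forward-looking revenue": ["revenue forecast", "projection"],
}

def probe_oversharing(ask_copilot, user, need_to_know):
    """Fire sensitive prompts at Copilot as `user` and flag answers
    that reveal topics outside that user's need-to-know set."""
    findings = []
    for topic, keywords in SENSITIVE_TOPICS.items():
        answer = ask_copilot(f"What do you know about our {topic}?", as_user=user)
        revealed = any(k in answer.lower() for k in keywords)
        if revealed and topic not in need_to_know.get(user, set()):
            findings.append((user, topic))
    return findings
```

The point of the sketch is the shape of the loop: probe by topic, not by file, then judge each answer against a need-to-know rule rather than a file permission.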

(33:03):
Again, we start with all the need to know.
And need to know becomes a foundation for satisfying a whole
bunch of other use cases that we've all struggled with for a long time.
Again, we've struggled with data governance for a long time.
Anyone here solve it?
Nobody, I'm sure.
Okay.
Even if we solve it to the point where people feel like, oh, this is perfect,
remember: even if it's perfect, you still have an inference problem.

(33:26):
And the challenge that we're seeing here is it does demand a new way of
thinking. But as I mentioned earlier, it's not a new way of thinking.
It's actually an old way of thinking, but applying the old way of thinking
to new technologies that allow us to actually now properly conceptualize
and capture need-to-know rules so that we can do the proper knowledge

(33:48):
segmentation inside the organization.
And the knowledge segmentation is both for the purposes of understanding what
might be overshared or what should be restricted, but more importantly, what
should be made available to others so that they can get their job done.
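The knowledge-segmentation idea can be pictured as a simple mapping from job function to permitted knowledge topics. This is purely illustrative: real rules would be derived from policy review and HR data, not a hard-coded dictionary, and the roles and topics here are made up.

```python
# Illustrative knowledge-segmentation rules: which topics each job
# function has a need to know. The unit of control is a topic, not a file.
NEED_TO_KNOW = {
    "hr_partner":   {"salary information", "org changes"},
    "finance_lead": {"forward-looking revenue", "salary information"},
    "engineer":     {"product roadmap"},
}

def allowed(role: str, topic: str) -> bool:
    """A topic is in bounds only if the role's rules list it;
    unknown roles default to no access."""
    return topic in NEED_TO_KNOW.get(role, set())
```

Note the dual use: the same table says both what to restrict (oversharing) and what should be reachable (undersharing).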
So as we look at that and we say, okay, fine, we are gonna go along for the
ride, and we're here trying to make

(34:10):
our decision makers aware of potentially some of the risks and benefits from a
security perspective, we might be able to say, Hey, how do we evaluate that?
If I can get a boost in productivity, if I can have a user ask 10 questions
and they get eight solid answers back, there's 80% right there.
But if an unprivileged user asks those questions and they get four

(34:33):
answers back they shouldn't, then you've got maybe a 40%
leak problem. Are those numbers meaningful?
And if so, how do we go ahead and validate that by
saying, Hey, here's a test script,
go run your raw copilot against your existing enterprise
and see how well we score?
And that, I think, would provide IT and IT security with a

(34:56):
pretty strong business case for why we need something like what you've got.
Is that a valid approach?
So we look at that.
So one of the things that's always challenging with any
system is how to measure it.
And measuring stuff in AI systems is somewhat challenging as well.
And one of the things that I've thought through is how to actually
measure both the improvements on the undersharing

(35:17):
part of the problem, but also the risk reduction in the oversharing part.
And I think we found a way to do that by creating essentially an upper bound
and a lower bound for what people should have a need to know for or not
have a need to know for, and to be able to probe for all these different
topics that can be unleashed or revealed through tools like copilot.

(35:40):
What we're doing is we're systematically, again, capturing a sense of what
is exposed, from somebody, for example, who has no permissions,
to somebody who has a full set of permissions, and everything in between.
What that allows us to understand then is, systematically, for a given person
with a certain job function, are they getting things that are overshared?

(36:04):
Because we can see that from one view, but also the other part,
which is: are they getting the things that they should get access to?
Again, you don't necessarily know what they should get access to, because
they may have permissions restricted such that they can't get access to it.
Copilot's not gonna come back to them with that.
But if we have the other upper bound of understanding what's the realm of
the possible that copilot can return, and let's say again on the upper bound

(36:27):
it only returns eight outta 10 things,
that just means that maybe for the remaining two things, either copilot
just can't get that for you, or it's just not there.
Okay.
So let's set that as an upper bound and say, look, this is the maximum
that we can get out of copilot for us.
That's just a really good value to know.
And then subsequently, how much of that is actually made available to our employees,

(36:51):
for the people who do have a need to know, so that they can get their job done?
Those are all measurements that we can capture now, and it helps
us really understand: is this providing both the value and the
risk reduction that we're hoping for?
So what we have then is a challenge: can I have some before and after metrics?
Is there something that I could measure that would allow me to go ahead and
take a look at my current performance in my environment and potentially the

(37:13):
risks, some sort of scorecard that says, Hey, if I just had raw copilot
365, this is what's gonna happen to me.
And then something I can compare the results to afterward, where I
go ahead and say, Hey, I've done a Knostic pilot program, it's running,
this is my reduction in risk, and now I can quantify that because I
know how much a breach report costs.

(37:35):
I know what the potential adverseeffect is of having lawsuits against
my organization, and this is howI can create that value sale.
Is this something Icould do going forward?
Yeah.
So first of all, let's be clearthat the, notion of measurement
is a tough challenge in all of AI.
but I've given a lot of thoughtabout this in terms of how do you
actually measure the, both, the,extent of the oversharing as well

(37:59):
as the extent of the undersharing.
One represents security risk.
The other one really represents business opportunity loss.
And the perspective that we have is we can actually understand the
upper and lower bounds of that with the methodology that we have.
Because what we can do from a security risk standpoint is systematically

(38:19):
understand, for someone, for example, that has no permissions,
what are they actually exposed to?
And that sets effectively a lower bound. If there is, let's say,
somebody with no permissions, and let's say we were prompting
for 10 different things that are sensitive, and four of them show up,

(38:40):
that gives us a lower bound for the security concerns, because it means
anybody can find these things. On the upper bound, of the business potential,
what we would do is the exact opposite:
set up a scanning account, for example, that has permissions to everything, and
then prompt for a bunch of things that people would need to look for to get their

(39:01):
job done, and then see what doesn't return.
Because what doesn't return represents either something that the
copilot system just simply doesn't have or doesn't have access to, or
is not capable of properly answering.
So you get a sense of what the maximum potential is and the minimum potential.
And from there, that sets a baseline for both the security risk that we're

(39:23):
trying to reduce, as well as the business opportunity that we can potentially capture.
And it's a combination of both that we can bring to the table here.
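The upper- and lower-bound measurement described here reduces to a couple of simple ratios. Here is a sketch using the episode's example numbers (four of ten sensitive topics leaking to a no-permission account, eight of ten reachable with full permissions); the function name and inputs are hypothetical, and the hit counts would come from running the topic probes under each account.

```python
def exposure_bounds(no_perm_hits, full_perm_hits, total_topics):
    """Lower bound: fraction of sensitive topics anyone can extract
    (security risk floor). Upper bound: fraction Copilot can return at
    all for a fully permissioned account (business-value ceiling)."""
    lower = no_perm_hits / total_topics    # leak rate visible to anyone
    upper = full_perm_hits / total_topics  # maximum Copilot can deliver
    return lower, upper

# Episode's example: 4 of 10 sensitive topics leak to a no-permission
# account; 8 of 10 are reachable with full permissions.
lower, upper = exposure_bounds(4, 8, 10)
```

Risk reduction work tries to push the lower bound toward zero; enablement work tries to bring each role's actual results up toward the upper bound.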
So we're looking at business opportunity, adding value to the
organization, which is always a good thing, and being able to reduce the risk,
which is going to potentially lower our cost for adverse events, which is
of course a very good thing as well.

(39:44):
So there's goodness all around.
So, listening to this as a user of Microsoft environments, and
I'm right now in the process of rolling out copilot, I'm thinking, okay, you've
given me an awful lot to think about as a CISO, and I'm thinking that for a lot of
our listeners or watchers it's the same thing.
So what are the next steps?

(40:04):
If someone said, Hey, this looks really cool, I wanna do something
with this, how would they proceed?
How does someone get a hold of you and the company and things like
that, and what's their next step?
Yeah, so again, the biggest pain point that usually people have from
a security standpoint is, Hey, is this thing gonna overshare and cause
a lot of embarrassing situations?
The best way to test that is to actually go and test it for

(40:25):
yourself by going through copilot and asking these questions.
And what we've done is just basically automated that process
so you don't have to do it.
And it consumes all the results so that the system
can quickly tell you when something is actually being overshared.
But you can do this for yourself.
And what we're gonna do is release a whole bunch of suggested prompts that

(40:46):
you try for yourself in the comfort of your own home or office.
If you find that these things are being divulged, then yeah, give us a call.
Let us know if we can help you do this more systematically. And again,
these are prompts that are pointing to sensitive business content,
things that the business, by the way, doesn't have to reason over.

(41:09):
They know immediately that if someone gets access to things like unresolved
legal disputes, or salary information, or
forward-looking revenue numbers, and the different things that you
should be properly safeguarding and segmenting to only certain individuals,
then the business already understands that and will be able to say, okay,

(41:30):
we need to take action on these things.
So this is not like trying to understand, Hey, should we
patch a CVSS 9.8 or something?
It's: Hey, the business understands this intuitively.
This makes that whole discussion with the business side much easier.
So that's what we are doing.
We're basically saying, let's go and systematically prompt for these
sensitive topics, see what evidence emerges around the oversharing that

(41:54):
you might be concerned about.
And it gives you that assurance to know: is it gonna actually happen or not?
And as you may imagine, in every instance that we've tried, it does happen.
You just wanna be able to understand how big of a breadbox this is.
And that's what we do.
What we help you do is also just do it on a systematic, repeated basis,

(42:16):
'cause there's always drift, there's always new content coming in.
You wanna do this not just for a small subset of topics, but
hundreds of sensitive topics.
And so we've scaled that out to be able to do that for a large number of different
user profiles and different topics.
but let's go back to my earlier point.

(42:36):
My earlier point: what undergirds what we're doing is having
a need-to-know rule set in place. And for all the results that you get back,
if you don't have that policy understood, then you won't really know what's actually
overshared, because sometimes things are appropriate for certain individuals
and they're not appropriate for others.
And that need-to-know policy, or that rule set, is what you need to overlay

(43:00):
to be able to really understand what to action, because there's a lot
of stuff that's legitimately shared,
but there are things that you don't wanna share with others.
And the only way to know that very systematically and repeatedly
is by starting with need to know from the very beginning.
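Overlaying a need-to-know rule set on raw probe results, so that legitimately shared answers aren't flagged, might look like the following sketch. The structures are hypothetical: `results` stands for a list of `(role, topic, revealed)` tuples produced by repeated scans, and `need_to_know` is the role-to-topics rule set.

```python
def overlay_need_to_know(results, need_to_know):
    """Split probe results into legitimate disclosures and true
    overshares, using the need-to-know rules as the arbiter."""
    legitimate, overshared = [], []
    for role, topic, revealed in results:
        if not revealed:
            continue  # nothing came back, so nothing to classify
        if topic in need_to_know.get(role, set()):
            legitimate.append((role, topic))
        else:
            overshared.append((role, topic))
    return legitimate, overshared
```

The design point is that the same revealed answer can be fine for one role and an overshare for another; without the rule overlay, a raw scan can't tell the two apart.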
Got it.
So that makes really good sense.
And so the need to know at this point is: how do I get in touch with you guys?

(43:20):
Yeah.
Knostic.ai
that's right.
And they just go to the website and they can find the information they need.
Do they need to contact somebody in particular?
Yeah. We have a form that you can fill out that queues you up for outreach.
And for those who know me, certainly if you're at RSA, come check
us out at the Innovation Sandbox.

(43:42):
And if not that, then we have other events and things
going on throughout the week.
We'd love to be able to catch up with you. Just hit us on a
DM over LinkedIn or some other place, and we'll track you down.
That sounds excellent.
Sounil, thank you very much for being part of CISO Tradecraft.
I found this absolutely fascinating when I first heard about what
you're working on at Black Hat

(44:03):
last year, and now I'm gonna get a chance to spend some time with you at RSA.
Plus, thank you for the kind contribution of your time and
your expertise to our show.
I think a lot of our listeners have a better sense of how AI creates all these
great opportunities, but of course they're not gonna tell you about the risks or
challenges. But that's exactly what we've addressed here, and what you've come up
with, the solution, with your tool set.

(44:23):
So that's awesome and I thinkthis is time very well spent.
So for our listeners and our viewers, if you like CISO Tradecraft, give us a thumbs
up or a five star, whatever feedback you can give, and let other people know where
you're getting your information so we can help them in their careers as well.
Until next time, this isyour host, G Mark Hardy.
Thank you very much for being part of our CISO Tradecraft family, and

(44:43):
until next time, stay safe out there.