Episode Transcript
Available transcripts are automatically generated. Complete accuracy is not guaranteed.
Speaker 1 (00:00):
Right now, nearly every major enterprise is talking about AI: how to deploy it, scale it, trust it. But according to a new report from S&P Global, more AI projects are failing this year than last year. In fact, nearly half are being scrapped before they even show results. It's not for a lack of ambition or investment.
(00:21):
It's the data, the condition it's in, the strategy behind it and the culture that supports it. On today's episode, we'll talk with two of WWT's leading data strategists, Bill Stanley and Jonathan Gassner. Both of them work directly with Fortune 500 companies, navigating the messy, complicated path from legacy systems and siloed spreadsheets to data maturity and meaningful
(00:44):
AI outcomes. We'll explore why so many organizations stall on their journey, especially in that awkward middle stage where things are better than before but still far from optimized. We'll also get into what being AI ready actually means, and why building a culture that trusts the data is just as important as the tools that process it.
(01:05):
This is the AI Proving Ground podcast from Worldwide Technology: everything AI, all in one place. In today's episode, we'll get to the heart of every AI investment right now. Can you really scale AI without getting your data house in order first? Let's get to it.
Okay, well, Bill, Jonathan, thanks for joining us
(01:32):
on the AI Proving Ground podcast today. How are you guys doing?
Doing well, thanks for having us here. Spectacular. We are here to talk data maturity and some of the initial things that our clients can do to get themselves along that path of data maturity.
I did read an interesting article a couple days ago. It came from S&P Global Market Intelligence that said more AI
(01:53):
projects are actually failing this year than last year. I think it was 42, 45% of organizations are scrapping AI initiatives. How much of that percentage, you know, without knowing the exact results,
Speaker 2 (02:10):
do you think might be attributed to data? Who wants to go first? I would guess most of them are related to data and not having the outcome in mind before they get started.
Speaker 3 (02:22):
Yeah, I think a lot of it is. When you don't have your data ready for any AI initiative and you just throw AI at it, it's going to magnify any data problems that you have. You know, we had mentioned during our AI Day that we had a presentation that says, once you put in an AI program or whatever software you decide to use, I think it's going to magnify any data security holes that you have, anything missing.
(02:43):
And I think that's probably a good reason why these initiatives are failing. They're kind of wanting that end result, and everybody wants the end result, but there's some prep work that needs to happen to get to that point.
Speaker 1 (02:54):
Yeah, and we'll get into what that prep work actually is. But I am curious. I hear so many leaders, so many organizations talk about we're AI ready or we're going to be AI first. Where is that disillusion coming from? Where perhaps the C-suite or executives are saying we are data mature, we are AI ready, but maybe kind of the rest of
(03:14):
the organization is saying, well, let's hold on here a second.
Speaker 2 (03:18):
I don't hear a lot of folks saying that they're data mature. Like, when we look at the data maturity curve and we ask folks to point out where they are, it's usually maybe a two or two and a half. I've really never encountered anyone. But to your point, though, maybe executives aren't aware. That I can't answer.
Speaker 3 (03:39):
If I was to take a stab at it, I wonder if it's more of a fear of missing out. You know, kind of like what everybody else is doing around them. So if you do not have, you know, any AI initiative in your company or something, you will be left behind the competition, and, you know, therefore it's going to take even longer to catch up. Yeah, I always go back to data strategy.
Speaker 2 (03:59):
right, because that's where I like to focus. But if you think about data maturity, we often think about it as linear, right, it's one step and then the next step and then the next step. But you're always going to encounter disruptors like generative AI. It's a huge disruptor. It disrupts everything, especially your data maturity journey. So often you don't want to wait until you're, you know, stage
(04:19):
four or five to take advantage of this new technology. As long as you have a strategy and you have the North Star, that guiding light that you know you can align those activities with, you're not going to get too far off the journey. But if you don't have a solid data strategy, yeah, generative AI can take you to a place you really don't want to go, and that has to do with the data, right, and not having some end
(04:42):
goal in mind as you develop your generative AI solutions. But when you say AI now, most of the time, I think about that and, like the article you mentioned, I'm imagining they're probably talking about generative rather than applied AI.
Speaker 1 (04:56):
Yeah, yeah, no, absolutely. Well, I'm glad you mentioned the data maturity model that WWT has come out with. A lot of what we talk about here today will be either included in or is complementary to that data maturity model. Bill, can you just give us an overview of that tool, why we developed it and just a pitch, so to speak, on what it is?
Speaker 2 (05:16):
Yeah, it's absolutely necessary, in my opinion. I always have opinions and I'm usually happy to share. It's very important to understand where you are and where you want to go. Then you can define the steps to get there, and this tool that we worked on really helps you do that. It's easily consumable. You can look at it and say, oh yeah, that's where I am. Now,
(05:39):
you might not fall perfectly at one level, you might be in between, you might be like a two and a half, but it's easy to see then what is the next step, and we actually broke it all out. These are the next steps.
Speaker 3 (05:52):
Yeah, and I think it provides a good common language between, you know, anyone that might be coming in or doing an assessment, and also within your organization, you know, such that you might have, you know, one person saying, I believe I'm here, let's say I'm at a four, but when you start kind of pulling back the layers, you're like, you're a little bit further behind. And at least we have this baseline tool
(06:12):
Yeah, for that as well.
Yeah.
Speaker 2 (06:14):
And there are a lot of tools out there, right, really cool tools, and you have to be cautious. Yeah, cautious, because there are lots of neat things to do and we always want to explore and try new tools, but if you don't have a roadmap, so to speak, or
Speaker 3 (06:33):
some kind of... And I like that. One more thing on the tool, we mentioned this in the article, is, you know, it depends, and it all depends on the organization applying this. So you might have a really small organization that may only have just a few sources, and, you know, they can apply it organizationally as a whole. But if you get a really large company, maybe, like, you know, Fortune 500 or something, you might have different business
(06:56):
units at different levels, you know, depending on what their individual goals are, what they're trying to do, and that's okay. You can use it that way as well. It's very, you know, multi-faceted in that way.
Speaker 1 (07:07):
Yeah. Well, I'll just articulate it a little bit. So we have, you know, level one through five, one being kind of the lowest, five obviously being optimized, and that's actually what it's called. But level one, initial: siloed data sources on disparate systems. Level two, developing: initial efforts to integrate data from multiple sources, beginning to standardize data. Next stage, defined, where you have centralized data platforms
(07:28):
that integrate data from various sources. Step four, or level four, I should say: comprehensive data integration, data automation and management capabilities. And then, obviously, level five, that optimized stage where you have an enterprise-level data fabric or data mesh. So that's one through five. Bill, where do we see, I know it's going to, you know, be kind of all over
(07:48):
the place, but are we seeing a bulk of our clients or a bulk of the industry kind of being in those, you know, lower stages? Are they starting to creep
Speaker 3 (08:18):
up. Two and a half is the most common. The most common, yeah? Why? I think a lot of it...
Speaker 2 (08:23):
You know, if you look at stage one, it's like everyone's kind of getting past that, yeah, right, because they have all of these different data sources and, as you were saying earlier, some of them may be more mature than others. But to do that at an enterprise level, it becomes very difficult, especially going back to data strategy, and now that's the last time I'll try to say data strategy, but if you don't have
(08:43):
that selected, it's very difficult to bring them all up together.
Speaker 1 (08:48):
Yeah, but why is it so hard to go from that two to three, if you're stuck at two and a half? Where is the stall happening?
Speaker 3 (08:55):
Yeah, I was trying to think about that last night, actually. I think what happens, you know, when you're in that level one, we'll start at level one, you're excited. You're like, okay, I don't like this, I don't like this, I don't like how this is working, I want it to be better. So you put that effort in and you get to level two, and, kind of like, your immediate needs are taken care of and things are fixed. And then you become complacent, because it's really kind of hard
(09:17):
to, like, future plan, and I think that might be, you know, just a natural roadblock that people experience. Like, okay, things are working now, it's better than what it was. I could see the benefit from going to level three, but maybe I don't need that right now. And they might kind of get stuck in that cycle and redo it.
Speaker 2 (09:35):
That was kind of something I was thinking about last night. I think another perspective is, when they don't have a business outcome to align those IT activities to, it becomes very difficult. It becomes a technology exercise, and without the outcome defined first it's really hard to meet the objective because
(09:56):
you don't know what you're designing toward. And then sometimes there becomes this overload of I'm going to get all the data, I'm going to fix all of the data, and, yes, you can't boil the ocean, you have to eat the elephant one bite at a time. All those expressions are true, and that's why we use them. But I think that's it. And if we can bring folks into a methodology like we leverage,
(10:20):
where, you know, we define what is the business outcome, what are we trying to accomplish with this data. Now let's get the data to accomplish that. What do we need to do with that data? And then, you know, I'm kind of stealing Jonathan's thunder, it really gets into the data engineering. What do we need to do? First of all, does it have the quality, the timeliness? Are there governance risks associated with using the data
(10:43):
the way we want to? And then, what do we need to do to combine, refine, enrich that data and prepare it to answer these business problems? So having that and making it a repeatable process to deliver business value, I think that's really the best way to leverage the maturity curve and to step through that process in
(11:04):
an iterative fashion, rather than I'm going from step two to step three. But why, yeah, why am I going
Speaker 3 (11:11):
there? And, you know, I think, you know, without that, without that North Star initiative, you know, that's where you get stuck at two. You fix, you know, kind of what I was saying, it fixes the immediate needs. So it's like, okay, well, that was, you know, my goal, or maybe my business unit's goal, and we accomplished it. There was no further thinking. And I'd hate to say top down, you know, because you kind of
And I'd hate to say top down,you know, because you kind of
(11:32):
want that.
You know you want that changeto happen from within the
organization.
But sometimes you might needthat, that higher overpressure,
like okay, as an organization weare going this way, how do we
enable that?
And that's kind of where thatyou know each business unit or
whoever's in, whoever's workingon that, might have a different
approach, but they all know thatthey're trying to work to that
common goal.
Speaker 2 (11:50):
So, you know, definitely, gen AI in an organization is a top-down thing, and that's a good common goal to go for, anything data, really, right? A center of excellence, or, depending on the size and shape of the organization, it could just be a team, a team of people, right. But, like you said, you need some kind of executive sponsor or a leader to champion those efforts, data efforts or AI
(12:11):
efforts, and then you need business people from across the enterprise, right, from the different business functions, to collaborate, and then you need the delivery folks included in that as well. So when you have a team like that in place, it's a lot easier to iterate through that and really define things that you can execute and deliver on.
Speaker 1 (12:39):
Well, we talked about kind of working backwards from, or, I guess, working forwards from, the business outcome that you want. How specific do you have to be? Do you have a prescription on how organizations should think about those outcomes, or is it really just thinking about what's best for the business? And then do you go find the data
(13:00):
that can help enable that.
Speaker 2 (13:01):
So for me, I think in terms of performance management. What are the metrics? How will we measure business success? And that's usually very, very specific, right. What is the number? How is the number calculated? What are the variables we need to include to calculate that? And that's where it comes from, but, yeah, it's usually very,
(13:23):
very specific.
Speaker 3 (13:23):
Yeah, and I think when you're talking along those lines, customer engagement is, because that's the thing, but you might not know. So you do a brainstorming session. I'm like, okay, here's really kind of what I want. Let's sit down a couple hours, let's just hash it out on a whiteboard, and then you kind of build your little prioritization matrix, and, like, okay, you know, I've got some high-priority ones, but these are kind of high efforts, these
are later.
But you kind of want toidentify that like that really
strong, impactful one.
That's kind of a low-hangingfruit, that that gets that
initiative, that gets that thatmomentum going forward.
Yeah, and that's and that'susually how you in all the
customers that I've worked,that's how they get that success
that continues on thosemulti-phase projects.
And you get that data buy-in.
It's like, okay, this works andit really works.
(14:06):
I solved a problem. It actually impacted the business in a positive way. Now we can tackle the other ones, because we kind of have a process that works.
Speaker 2 (14:14):
Yeah, that's a good point, and I didn't mean that they come to the table knowing what all of those metrics are and how they're calculated. That's a process. That's working with folks and iterating through that and really asking a lot of questions. Well, why is that important? Or what numbers do you look at to figure that out?
(14:34):
Is there a way to bring all those together? People usually don't walk into the room with those answers.
Speaker 1 (14:39):
Yeah. Well, I was just thinking to myself, in my head: once the business has an outcome defined, I would assume that they would want to then move, or think that they're in a position to move, rather quickly to go tackle that. But there's probably another part, or maybe your team saying, well, we have to go find the data, centralize the data, make sure it's all here, digestible, clean.
(14:59):
So what has to happen to that data, from, like, going to find it to making it usable, so to speak?
Speaker 2 (15:05):
Well, if we could back up just a minute, I like to think of it in terms of five steps. Right, there's the data strategy, there's management and governance, there's engineering, and then there's applied and generative AI, and then there are analytics and visualizations. Those things don't happen necessarily one after another, but they overlap to a certain degree. So often, starting at the
end, too, with the outcome in mind, the analytics and visualizations: if you start with a wireframe for the visualization, you say, well, what is the data that you need to see to make the decision? And then I can work backwards. Then data engineering: where is that data? And then the analysis of... Exactly.
Speaker 3 (15:44):
Usually in customer engagements that we have worked on, it starts with, you know, we have a critical business report. We'll use analytics, because that's a common thing, you know. I know the big topic is AI, but businesses also always need reporting. Yeah, and, like, this process is horrible and it needs to be fixed. I need a faster turnaround time on the report, or I need it to refresh quicker, you know, whatever. And you start working backwards and
(16:05):
identifying everything that impacts that report. And more often, like, say, you know, how do you find the data and stuff, that's usually an exercise in itself. You know, if we kind of look at that level one, that level two, there's a lot of maybe ad hoc processes, there's people using Excel sheets. They're using the tools that they had at the time to build this report. They're trying to make it better, but you may not have
(16:26):
that complete picture on who's who, you know, names on jerseys, essentially. Who owns the data? How can you access it? And how can you do it securely?
Speaker 1 (16:34):
We talked about how it's an exercise, a process. How long are we talking here? Days, weeks, months?
Speaker 3 (16:39):
I would say it depends on the size of the report. Usually when we do engagements, 12 to 14 weeks. Maybe you can go up to 16 or 18, depending on complexity. Yeah, complexity, or the requirements it takes to get into a customer's environment. I know that with, you know, several of our public sector, you know, federal clients, it's already locked down. There's background checks and stuff that have to happen.
(17:00):
So onboarding can take that extra time. But usually that initial phase, to really kind of wrap your arms around it, is about 12 weeks, and that's how we try to scope our engagements.
Speaker 2 (17:11):
And that doesn't mean that the next ones have to run consecutively. They can run concurrently to a certain degree, and you do build muscle memory, right. After you do that first project and you get familiar with that methodology of how you quickly deliver value, you should see
(17:31):
some return on that investment, right, and it shortens. But again, it depends on complexity, because some efforts are just more complex and they're going to run the full time that he described. But that's probably one of the most frequent questions we get asked: well, how long does it take?
Speaker 3 (17:41):
Yeah, and I think with that first phase, you know, one thing to keep in mind for anybody listening out there, and internally, is just that onboarding. It's like you're cutting teeth, you know, you're getting those initial things out there. And then once you get that process established and we're into the customer's environment and we kind of have a good picture of what it is, anything next that we can pipeline along with it
(18:02):
does significantly speed up, and time to value, you can usually get, you know, faster results there, because you've done a lot of that initial legwork up front. It's kind of like there's a really steep cliff you have to climb and then you can go up the gentle hill from there.
Speaker 2 (18:16):
So yeah, and the first efforts are foundational anyway.
Speaker 3 (18:19):
Yeah, exactly, that's always the hardest. Yeah, why is it the hardest? You can run into, you know, people might be, we'll take it, people might be on PTO. You're trying to get things kicked off. You may not have the right resources identified. Sometimes, unfortunately, you might run into individuals within an organization who just don't want to embrace that change. Yeah, and it can sometimes take some kind of stuff,
(18:42):
like, you know, we're not here to replace you or take your job. We're here to actually make your job better, you know, so you can actually, you know, use your individual talents and contribute to the organization, some things like that. But more often than not, it's just trying to locate, to get the right people in the room and find those names
Speaker 2 (19:03):
on jerseys. I think sometimes there are even tools and hardware that need to be put in place, yes, that they don't already have set up. So that could be part of that initial foundational exercise.
Speaker 1 (19:14):
Well, yeah, dive in there a little bit, because, you know, beyond just the data, there's the fact that you need to put in tools, potentially hardware, whatever else it may be. What types of considerations are there, and any, you know, details, or maybe not recommendations, but just how you should think about it in terms of implementing?
Speaker 2 (19:29):
I'll give you the consulting answer: it depends. It's a favorite, and I say that a little in jest, but really it varies greatly depending on the organization you're working with and what they're dealing with. Sometimes it starts with tools rationalization. They have every tool there is and they need to reduce that tool footprint and get to just what's manageable.
(19:52):
Sometimes it's out of control, but sometimes there's nothing in place, right? Sometimes you're starting from scratch. Sometimes it's tool rationalization and reducing all that overhead, and sometimes they don't even know all the tools they have.
Speaker 1 (20:05):
Yeah, yeah. So it's not just get your data ready to feed the beast, it's get the beast ready to eat the data. Yes.
Speaker 3 (20:12):
I like that. That's good. That's good, get that foundational piece, and that's that. Yeah, we've done a lot of tool rationalization, you know, and more often than not, you'll get something like, you have an individual, or a team of people, saying we're going to do it this way, we'll do it in Azure, but the company's an AWS shop. So now you have two different cloud environments, right, and which one takes precedence? That's, you know, real specific.
Speaker 1 (20:32):
Yeah. So I guess that would speak also to making sure that you have all stakeholders represented. Because, I mean, can you work both tracks at the same time? Can you be working to get quality data while you're getting your systems and IT stack in shape to digest that data?
Speaker 2 (20:54):
Absolutely, and there are some great tools out there, and, you know, NeMo, right, that NeMo framework. You can run that on the cloud and you can develop up there, and then when you get your hardware in place and you get it all installed and configured, you can bring that code back down and run on-prem. So you can start your development in different areas. There are tools like that that allow a lot of flexibility.
(21:16):
Absolutely, folks can take advantage of that. Some tools kind of lock you in a little bit, but yeah. Are you looking for flexibility?
Speaker 1 (21:23):
Are you looking for?
Speaker 2 (21:23):
power? Are you looking for...? Most customers are looking for flexibility and they don't want to be locked in. That's a very common one.
Speaker 3 (21:30):
I hear it: vendor lock-in. No vendor lock-in, they try to avoid that at all costs.
Speaker 1 (21:35):
Yeah, so, well, I do want to touch on the data part too. Right before we started recording, Jonathan, you know, we were joking around: what's your favorite type of data? You said clean data. Clean data, yeah. What is clean data? How clean does it have to be? What does that even mean?
Speaker 3 (21:59):
I would say clean data, in my view, is something that you can use and that does not cause you friction. You know, let's say that it's beneficial to you or the organization or whatever, and it doesn't cause any friction. You know, because without it you're just spinning your wheels. You know, without clean data, you don't know: is it right, is it trustable? Can I, you know, if I make a decision based on this, especially, let's
(22:20):
say, we're gonna financialdecision on this, is it right if
you, if I make a decision basedon this, especially, let's say,
we're going to get a financialdecision on this, is it right If
you get, if you give a reportthat has two conflicting numbers
, which one's, which one'scorrect?
And that's why you know cleandata is important to me.
Well, how do you, how do youclean it?
Speaker 2 (22:32):
Oh, I was... For me it's consistent data. That's like a pet peeve of mine and has been for a long, long time, the consistency. And a lot of times it goes all the way back to application design, and I feel like there's a disconnect a lot of times between different steps within software
development and just IT ingeneral.
Right, we have softwaredevelopers, then we have data
engineers and then we have datascientists.
But we don't often have a goodfeedback loop and so you may
have a source system, anapplication that's designed, and
when they create one of theinput forms, it's free form.
You could type anything inthere you want.
Now, that's great.
It allows a lot of flexibilityfor the end user.
(23:13):
But when you go to do analyticson that, it's extremely
difficult to make sense of whatwas in the mind of that end user
when they typed into that box,and sometimes it's, if it's a
required field, they might justtype just about anything in
there.
So, having that feedback loopfrom analytics and engineering
to say, you know, well, let'srestrict it to these couple
(23:34):
values, because these will at least provide some insight into the activity.
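To illustrate the kind of restriction Bill is describing, here is a minimal Python sketch. The field name and allowed values are hypothetical, not from any real system; the point is that a small enumerated set is far easier to analyze downstream than free text.

```python
from enum import Enum

class ContactReason(str, Enum):
    """Hypothetical allowed values for what used to be a free-form input field."""
    BILLING = "billing"
    OUTAGE = "outage"
    FEATURE_REQUEST = "feature_request"
    OTHER = "other"

def normalize_contact_reason(raw: str) -> ContactReason:
    """Map whatever the user typed onto the small allowed set; anything
    unrecognized lands in OTHER instead of polluting downstream analytics."""
    cleaned = raw.strip().lower().replace(" ", "_")
    try:
        return ContactReason(cleaned)
    except ValueError:
        return ContactReason.OTHER

print(normalize_contact_reason("Billing"))         # ContactReason.BILLING
print(normalize_contact_reason("my app is slow"))  # ContactReason.OTHER
```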
Speaker 1 (23:40):
So consistency, for me, is probably... Yeah, well, we talk about we want clean data, we want consistent data. I think, Bill, I've heard you say a bunch of times: AI fails silently and confidently. I think that's... Jonathan? Maybe it was Jonathan. It was Jonathan.
Speaker 2 (23:54):
Well, Jonathan, because I love what he says. Every time, it puts a smile on my face.
Speaker 1 (23:58):
Yeah, so if that was true, if that's what happens, if it fails silently and confidently, what are some indicators that organizations can look for, to at least see that happening as close to real time as possible, so they don't get down the road and have a big fumble?
Speaker 3 (24:14):
I think, just honestly, keeping the human in the loop, you know, does it make sense? You can use the common hallucination: how many r's are in strawberry? And you can physically see three r's and it'll say no, there's two. That was a common argument that everyone knew about. I think that would be kind of, just, you know, does this make sense? Does this sound like it makes sense? And, you know, if you're using something
(24:35):
like ChatGPT, for example, and you look at the sources it's pulling from... You know, I had this happen the other day. I was doing some research, and it made a sound argument, but I went to click on some of the links, the research, the articles that it pulled from, and I was like, this doesn't line up. So I think that when we use terms like AI will fail silently and confidently, that's what happens: you might just blindly trust
(24:57):
it. It is getting better, but I think, kind of keeping that, I'd say fact-checking, but making sure that you've got eyes on the final part of, like, does this make sense?
Speaker 2 (25:05):
It's so confident.
Speaker 3 (25:07):
Yeah, it's easy to fall for it too. Very recently...
Speaker 2 (25:10):
I love Google Colab, so I experiment with it a lot. I had a little data set, a little CSV, and I had created quite a lengthy Python script to go through and analyze this data, and I put it all into ChatGPT. I gave it my data source and my file and I said, validate all of my, you know, my recommendations and findings on this data. And it came back and was saying all these things, and it was
(25:31):
quoting attributes that weren't in my data, and I called it out. I'm like, you're making stuff up. And it was kind of like, yeah, you got me. Well, now I'll use your data this time. I'm like, wow, I mean, why not just put some disclaimer in there? But yeah, and seriously, I was so surprised at the response: oh, you caught me, yeah, I made stuff up. Yeah, wow, yeah. So yeah, you absolutely have to have a human in the loop and
(25:54):
validate it. Yeah, well, I know, you know...
Speaker 1 (25:57):
One of the questions that I was going to ask was: can AI help get our data in shape to make a more data-mature organization? But if we do still have that trust issue potentially with AI, how much can AI contribute to making this quicker?
Speaker 3 (26:16):
There are tools out there that can get you started. So, you know, there's tools out there that can do, like, you know, identification, like, what is this thing and, you know, what does this look like? Is this the right format? How do we think about maybe building our strategy and governance around this? You know, maybe it gets your data sets 75 to 80% of the way there to review, and then you can, if you're feeling confident about it and you
(26:41):
really trust the AI... But I would venture to say they can get you started. You know, kind of like a template, almost, you know, or a form, I say a form, of course, but a template that gets you started. Like, okay, this is kind of what we're looking for. It accelerates that initial time of the data discovery, building the governance framework, the whole nine yards. I'm like, okay, I've got a pretty good picture, a view of my data. Let's start here, yeah, and then go from there.
Speaker 2 (27:03):
I feel like a lot of those tools are just on the edge, though. They're just starting to appear. They're not very mature yet, but, yes, the future is extremely bright with using AI tools to discover and categorize and label and help us define our data assets and organize them. This episode is supported by DataMiner. DataMiner delivers real-time alerts on high-impact events to
(27:26):
help you stay informed and respond promptly. Gain actionable insights with DataMiner's AI-driven alerting platform.
Speaker 1 (27:35):
Well, as we start to implement some of those tools that'll help us theoretically move faster: faster isn't always necessarily better. If we start to see progress, or what we perceive to be progress, is there a caution that needs to take place with the business side of things? Like, oh, we've got progress, let's keep moving as quick as we can. What's the balance there with speed versus solid maturity?
Speaker 2 (28:00):
Yeah, I think you always have to be cautious. You have to be careful what you feed it. So, having it do some discovery, but even what you're going to let it do discovery on, you probably need some manual review of what you're going to let it consume.
Speaker 3 (28:16):
And I would say manual review, probably from any of your stakeholders, business leaders, someone that intimately knows the problem you're trying to solve, the business thing that it's trying to solve. So, let's say you're trying to do a financial report. Okay, from a tech side, it's really easy to let these tools say, yes, we're making progress. Ooh, this saved me a lot of time, I can go forward faster, like you said.
(28:36):
But if there's not anyone there, you know, that may understand the end reporting or what it means... So having that someone that's like, no, this is wrong, and here's why it's wrong. I'm like, ah, I can see that, and we can go back and fix that. A data engineer? Yes, well, I would say the data engineer would go back and fix the pipelines. Okay, and we would also work, you know, hand-in-hand with
(28:56):
Make sure anything that was inthe strategy and framework needs
to be updated.
But usually in your reportingit'd be like a BI, an analyst
executive, those kind of things.
Speaker 2 (29:07):
I think there's a delineation between applied and generative AI, and I put data analytics and applied AI kind of in that same general area. There's a lot more data engineering that needs to happen, right, to prepare and clean and prep the data. Where generative AI focuses more on unstructured data, you
(29:29):
still have to be cautious with what you feed it, but that's more document review and making sure that the documents that you're letting it have access to have quality. So there really is a different approach depending on where you're headed, if you're focused on applied or generative AI.
Speaker 1 (29:45):
Yeah, I mean, it seems rather intuitive about structured versus unstructured data, but can you give me a little bit of why each one might be important, or what the value is, moving forward?
Speaker 2 (29:56):
No, you know, there are absolutely use cases for structured data, and that's going to be more of your analytics and your applied AI, machine learning, deep learning, things like that. Large language models are going to focus more on your semi-structured data, I mean, if we wanted to really segregate that, or unstructured data, semi-structured and unstructured data,
(30:17):
semi-structured being documents and the definition really being that the data defines the structure, right. Whereas structured data, we build a data warehouse, we build tables and create relationships, and then we conform the data to that structure, we make it fit. Okay.
Where semi-structured is, the data itself defines the
(30:39):
structure, and large language models lean toward that, right, because those are more based on language and the way we speak and communicate. So when we feed it a document, really all it's doing is looking for the relationships between the words. That's why it fails with such confidence, because if you really dig deep down in there, it's just saying, oh yeah, this
(31:00):
word should come next.
It makes sense.
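A small Python sketch of the distinction Bill draws, with made-up column names: for structured data the table exists first and records are conformed to it, while each semi-structured document carries its own shape.

```python
import json

# Structured: the warehouse table is defined first; data is made to fit it.
CUSTOMER_COLUMNS = ("customer_id", "name", "region")  # hypothetical table definition

def conform_to_table(record: dict) -> tuple:
    """Force a record into the predefined structure: extra keys are dropped,
    missing keys become NULLs."""
    return tuple(record.get(col) for col in CUSTOMER_COLUMNS)

# Semi-structured: each JSON document defines its own structure.
doc_a = json.loads('{"customer_id": 1, "name": "Acme", "notes": "called about renewal"}')
doc_b = json.loads('{"customer_id": 2, "contacts": [{"name": "Kim", "role": "CTO"}]}')

print(conform_to_table(doc_a))   # (1, 'Acme', None); the 'notes' field is lost
print(set(doc_a) == set(doc_b))  # False; the two documents disagree on shape
```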
Speaker 1 (31:03):
Yeah, how should organizations approach their data estate in general? I mean, we're producing more and more data every single day. The number seems to only be going up. Do we need to prioritize maturing all of that data, a certain section of that data?
Speaker 3 (31:27):
Do we need to just prioritize it and get it mature in a prioritized fashion? How should we be thinking about going about the whole data, the whole thing? I mean, I'd still tie it back to the whole, you know, use case. I mean, it's a favorite word of everybody, you know. What is your end goal? Because, like I said, we generate a lot of data. Some of it might not be useful or even needed. So if you apply the whole concept of, we're going to get all my data mature, which is an admirable goal, and you can, you might use that data for something else.
(31:47):
you might, use that data forsomething else.
But if you're trying to get tolike I'm trying to train a model
or try to do something, youwant to kind of trim it as you
go along from raw to curated andthat way any of your models
something it's clean you're nothaving, just it keeps the size
down, it keeps your cost downbecause you're also having to
pay the store generally so, yeah, I would go as far as even it
(32:11):
might be controversial, but notnot all data has value.
Speaker 2 (32:14):
Yeah, I would absolutely say that. So focus on where the data value is. And, you know, applied AI, you know, machine learning, that's been around for a while, and I remember, when we first started having this data explosion a while back, there was this thought that, well, all of my
(32:36):
data has value.
So I'm going to just start sifting through my data, have my data science team do that and look for the value. Where are the patterns we don't recognize? And a lot of those projects didn't end well. Yeah, so again, having the outcome in mind is critical, and not all of the data has value.
Speaker 1 (32:52):
Well, yeah, what data doesn't have value? A couple of examples, maybe, that come to mind?
Speaker 2 (32:57):
So if the data doesn't have quality or consistency, I would say. Yeah, and you would hope that you could get some value out of that data, but you might not be able to. And then there may just be additional attributes that you really don't perform analytics on, that just don't add a tremendous amount of value. Like, if you think about the data science process, really the
(33:19):
first step is EDA, exploratory data analysis, and the first thing you do is trim off those excess attributes that aren't going to add value to the model, and that's more in the machine learning context.
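As a rough illustration of that first EDA pass, here is a short pandas sketch; the thresholds and the toy data are invented for the example, not a standard recipe.

```python
import pandas as pd

def trim_low_value_attributes(df: pd.DataFrame, max_null_frac: float = 0.6) -> pd.DataFrame:
    """First EDA pass: drop attributes unlikely to add value to a model,
    here meaning columns that are mostly null or that never vary."""
    mostly_null = df.columns[df.isna().mean() > max_null_frac]
    constant = df.columns[df.nunique(dropna=True) <= 1]
    return df.drop(columns=set(mostly_null) | set(constant))

raw = pd.DataFrame({
    "sensor_id": [1, 2, 3, 4],
    "temp_c": [21.0, 22.5, None, 23.1],
    "firmware": ["v1", "v1", "v1", "v1"],   # constant, so no predictive value
    "comments": [None, None, None, "ok"],   # mostly null
})
print(trim_low_value_attributes(raw).columns.tolist())  # ['sensor_id', 'temp_c']
```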
Speaker 3 (33:30):
I think if you look for, like, a good concrete example, you know, let's say you're trying to do, you know, like, AIOps or some type of IT asset monitoring and you're gathering logs. Most of those logs probably aren't going to provide any value. You know, you might need to know how many times someone logged in, but a lot of the system events that pop up are just normal.
(33:51):
Yeah, you know, you could imagine if you captured all the logs and had to sift through everything on your screen every time you logged into a computer, signed into a website, plugged a USB flash drive in, loaded a Steam game, you know, whatever. That's really daunting and wouldn't provide any value to you. But if you had a hardware crash, that's important and you want to catch that. So I think it's just filtering the noise.
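A minimal sketch of that noise filtering, with invented event names; a real AIOps pipeline would key off whatever your logging agents actually emit.

```python
# Event names here are made up for illustration.
ROUTINE_EVENTS = {"app_launched", "usb_inserted", "screen_unlock", "steam_game_loaded"}
CRITICAL_EVENTS = {"hardware_crash", "disk_failure", "kernel_panic"}

def worth_keeping(event: dict) -> bool:
    """Keep critical events and login records, drop the routine chatter,
    and keep anything unrecognized so nothing important slips through."""
    name = event.get("event", "")
    if name in CRITICAL_EVENTS or name == "user_login":
        return True
    return name not in ROUTINE_EVENTS

stream = [
    {"event": "user_login", "host": "ws-101"},
    {"event": "app_launched", "host": "ws-101"},
    {"event": "hardware_crash", "host": "ws-204"},
]
print([e for e in stream if worth_keeping(e)])  # the login and the crash survive
```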
Speaker 1 (34:10):
Who's determining whether or not it has value? Is that where that center of excellence kind of keeps coming in, and having all the stakeholders able to plug in when they need to?
Speaker 2 (34:20):
Yeah, absolutely. And what are we trying to... What is the end goal? What are the metrics? What are we really trying to measure? So, as you were talking, I was thinking of sensors, right. We were talking the other day, yeah, right, about sensor data, and you're collecting all this data. Do you really want to save every second of data coming from that sensor, or, if there's a failure or an anomaly, have a
(34:44):
certain amount of time leading up to that and a certain amount of time afterwards? Otherwise, it's herculean trying to process through all of that regular data.
Speaker 3 (34:53):
I mean, you might find a nugget of gold in there, but, I mean, yeah, even if you're trying to capture something like sensor data like that, so much of it, it's going to cost. You know, those sensors are producing stuff in milliseconds, yeah, just the sheer volume of it. If you're just looking at having to store it, if you just kept it in a raw form, with the attitude that all data is useful to me,
(35:14):
that's going to eat up your storage costs real fast, yeah. And then trying to process all your data, yeah. I mean, and that's a specific example, but yeah, that's a great example of something where things might be going fine. You might just look for the novel events, and you just need to kind of maybe capture five minutes before, five
(35:38):
minutes after. I'm not in manufacturing personally, but it's kind of that idea.
Speaker 2 (35:39):
Yeah, like your dash cam. Yeah, exactly, that's a good point. The dash cam is a good example of that. What's the dash cam? The dash cam in your car. It just writes over, it easily records over, yeah. And then if there's an impact it senses, or something, it saves, oh, I'll save the five minutes before and the five minutes after. Yeah, yeah.
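A toy Python version of that dash-cam retention idea, assuming a simple numeric threshold stands in for real anomaly detection; the window sizes are deliberately tiny for the demo.

```python
from collections import deque

class AnomalyWindowRecorder:
    """Keep only a rolling buffer of readings and persist a window around
    anomalies, instead of storing every reading the sensor ever produced."""
    def __init__(self, before: int, after: int, threshold: float):
        self.buffer = deque(maxlen=before)   # the "five minutes before"
        self.after = after
        self.after_remaining = 0
        self.threshold = threshold
        self.saved = []

    def ingest(self, reading: float) -> None:
        if reading > self.threshold and self.after_remaining == 0:
            self.saved.extend(self.buffer)   # flush the pre-anomaly window
            self.buffer.clear()
            self.after_remaining = self.after
        if self.after_remaining > 0:
            self.saved.append(reading)       # the "five minutes after"
            self.after_remaining -= 1
        else:
            self.buffer.append(reading)      # otherwise just overwrite, dash-cam style

recorder = AnomalyWindowRecorder(before=3, after=2, threshold=100.0)
for value in [10, 11, 12, 150, 13, 14, 15]:
    recorder.ingest(value)
print(recorder.saved)  # [10, 11, 12, 150, 13]
```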
Speaker 1 (35:51):
Yeah, is that starting to get into, like, an ROI on data maturity? Or are we talking about ROI in terms of, like, the actual use case of what the data is fueling?
Speaker 2 (36:01):
I think it's both.
Speaker 1 (36:03):
How do you articulate, then, data ROI on the data maturity, when it's just kind of sitting around ready to be used?
Speaker 2 (36:11):
Well, that's the key, right, is that not all data has value. But data ROI, I feel that data ROI can be difficult to measure, and again, I feel like I'm starting to repeat myself, but going back to that use case: what is the business value that I'm providing with that data and the solution I'm developing? If I can really point to it and say that metric is impacting
(36:33):
the business, it's impacting our place in the market, it's making us a leader in the market, and we can equate that to real business value, then I feel that that's almost what you have to do these days, rather than just say, oh, I've got some cool tools, I'm going to build something and see if it adds value.
Speaker 3 (36:51):
Yeah, I think that's how you get your ROI on your data, like, okay, now, because it supports... You know, I need ROI on this initiative, you know, and I have ROI on my data that supports that.
Speaker 2 (37:02):
Other statements that people would like to debate: I like to say that data is the most valuable asset, and, you know, it's something you have to leverage to be competitive in the market and remain relevant. There are so many companies that pop up, so many disruptors, and they're new, they're nimble, they're agile, they grow up in the cloud, they're able to leverage their data assets.
(37:23):
So I'm going to make sure we're competing with that.
Speaker 1 (37:47):
Well, another aspect of the data maturity model, at least to get from that one and two up: there are a couple of statistics that say, you know, perhaps, that, you know, organizations are not as data-driven as they may like to think. Where do we see kind of the industry in terms of a culture of data, and how do you start to drive that
Speaker 3 (38:07):
adoption? I think organizations are starting to realize that early on. You know, we've talked to key customers, you know, especially where it is like, okay, you know, my data is just not ready for AI, but they're realizing that. So, like, okay, to make this work, we need to change from within and we need to have a different approach to it.
You been collecting thesethings for years, collecting
this data for years, processingit, working with it, but to
really make it useful, we haveto kind of change that.
The data driven culture, yeah,and I think having, like you
know, your champions, you know,you know some of those, those
wins that we talked aboutearlier, is is kind of showcases
the importance of it and why,why it works.
(38:48):
You know, like, okay, whoo, this made my life a bit easier, my boss is happy, I want to do this again, kind of mentality. And is Gen AI enough to push that
Speaker 1 (38:58):
forward.
Speaker 2 (38:59):
Oh, I don't know that it's Gen AI. I think just data maturity in general. It's the analytics, it's the applied and the generative AI. You have to be able to trust the data. That's where I think generative AI is especially tricky. It really can help with your productivity, but you still have to check it. But for just data analytics and applied AI, to become a
(39:23):
data-driven company and have a data-driven culture, people have to understand they can trust the data, they can trust the numbers, they can trust the data source. And it doesn't have to be all of the data immediately, but when they understand, and you grow that, I can trust that data source, I can trust this one, and that kind of world of data is expanding for them. That's, for me, the most critical first step: that I can trust
(39:46):
that data and I can rely on it to make decisions. It's not going to come back on me.
Speaker 3 (39:51):
Yeah, I think if you were to expand that just a bit, if you take that first step and then you kind of start building, like, I can trust the data, and, you know, we talk later on in the article, it gets a bit more self-service. So, you know, anyone in a large organization knows, it's like, I need to find some information. I don't really know where it is, but maybe now I know I have a spot for it specifically, so they can do kind of, like, maybe their
(40:14):
one-off reporting or something that's necessary, without having a formal process. But again, they trust the data, the data-driven culture, and they're like, okay, no, this is how we're going to do it. You know, this is important.
Speaker 1 (40:26):
Yeah. Well, we're getting up the maturity model here. Bill, describe to me that stage five, the optimized state. I'm going to think that it's probably a ways in the future still, but what does that look like within an organization, and what's possible when you're at that state?
Speaker 3 (40:47):
Oh my gosh. I call that our data nirvana.
Speaker 2 (40:52):
It is, yeah. And I think it's a journey, it's not like you're going to get to stage five and you're like, oh, we're done, the journey's over, we've reached nirvana. It's continuous and it just doesn't end, right. You always have to, there's care and feeding, and there are always new tools. But that's when you're optimized to leverage your data assets, make
(41:14):
informed decisions, you can trust the data. People have access to the data when they need it, in the way they need it. You have a lot of automation that's providing that data in the timely fashion that it's needed, and you're able to leverage the most cutting-edge tools, the AI tools, to help you find additional value from your data assets and look more toward
(41:38):
leading rather than lagging indicators.
Speaker 3 (41:42):
Yeah, I think, with that, I mean, I don't think Bill meant it in a bad light, where it's like, you know, you're never done. You know, it would be nice to hit that final goalpost of, like, ta-da, we've made it, touchdown, yeah. But at that stage, you know, you really know that your processes work. You've ironed out a lot of the kinks. You know, as technology changes, because it will, you're ready to ingest the new data.
(42:04):
You've already got the teams in place, you've got the frameworks, you have it all nailed down. So there's communication, and it just kind of comes in naturally, and that enables you, like I said, you can look at leading indicators, you can adopt new technologies that come in, because your data's ready for it. So, in a sense, the goalpost is unfortunately always moving at level five.
(42:24):
You know, I wish it was just an aha moment: we made it, we can relax and take a vacation. Maybe we can do that for about a month and just enjoy ourselves, that we've worked so hard to get there. But you want to make sure that you kind of stay vigilant and your organization's data stays ready for any new technologies that come down the pipeline.
Speaker 1 (42:40):
Yeah, well, that's it. I mean, I love that you say that, you know, as new technologies come down the line, whether it's something as near-term as agentic or AGI or quantum computing. Yeah, exactly. How does that change how data strategists think about data readiness?
Speaker 2 (42:58):
Tough questions. How does that change how we think about...
Speaker 3 (43:05):
I can chime in if you want, I have an idea. I think it prepares their mindset. And bear with me for a minute. You know, if we kind of look back at, like, the 80s and 90s and how data was managed, you know, it's a lot different than today. But, you know, if we're coming down with, like, quantum computing or something down the pipes, way down the pipes, the data might look different.
(43:25):
But you, as an organization and such, have already been exposed to what that process has looked like. So you're really not starting from scratch. The data structure may be different, I have no idea, but you know at least that you are prepared to handle those challenges, because you've been through a lot of that legwork beforehand and you've kind of made a well-oiled machine, essentially.
Speaker 2 (43:45):
I was trying to envision how the data strategy might change. I guess that's the way I took your question, and I don't know that necessarily it would. I think there are really two main choices in terms of data strategy, at least right now, and that could change. But it's fabric or mesh, and I don't think that would change.
(44:05):
There'd be new technologies to help you get there, but the ultimate goal is that the data is available, it's ready to be consumed. But yeah, there may be new tools. I don't know that the strategic direction necessarily would change. Either it's, you know, a fabric approach, where we're getting everything in one place, or it's decentralized in a mesh
(44:26):
.
Speaker 1 (44:27):
Yeah, because I know this is something that we've brought up before during AI Day, which, you know, for listeners out there, AI Day is kind of a traveling roadshow where we're offering, you know, expertise around AI and practical AI and how you can start your journey. But from a
(44:52):
data fabric, data mesh standpoint,
Speaker 2 (44:53):
walk us through some of those details, why it's important and what it will offer, potentially, you know, that optionality.
Yeah, really, they both seek the same thing, and that's to bring the data to one place where it's governed and secured and controlled, and people can come and access the data. It's discoverable, it possesses the quality and consistency and timeliness. The difference in the approaches is that with data fabric, you seek to bring all of the data together in one place and
curate it and process it anddeliver it.
That way, where data mesh isdistributed, it's decentralized.
Think of a multinationalorganization with very different
business operations.
Those businesses are probablymature.
That company's grown throughacquisition over time.
They have the knowledge of thebusiness assets internally, they
(45:37):
manage their own data. But at the enterprise level you still need to be able to do analytics. So if they're serving the data up as a product to a mesh layer, there you can perform analytics across those disparate businesses.
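To make "serving the data up as a product" a little more concrete, here is a toy Python sketch; the catalog shape, field names and business units are invented for illustration, not a standard mesh interface.

```python
from dataclasses import dataclass
from typing import Callable
import pandas as pd

@dataclass
class DataProduct:
    """A toy 'data as a product' contract: each business unit owns and curates
    its data, and publishes it with metadata for the mesh layer to consume."""
    name: str
    owner: str
    refresh_sla: str
    read: Callable[[], pd.DataFrame]   # how consumers pull the curated data

# Two business units publish products to a shared catalog (the mesh layer).
catalog = {
    "emea_orders": DataProduct("emea_orders", "EMEA BU", "daily",
                               lambda: pd.DataFrame({"order_id": [1, 2], "amount": [100, 250]})),
    "apac_orders": DataProduct("apac_orders", "APAC BU", "hourly",
                               lambda: pd.DataFrame({"order_id": [7], "amount": [80]})),
}

# Enterprise-level analytics across disparate businesses, without centralizing ownership.
combined = pd.concat([p.read() for p in catalog.values()], ignore_index=True)
print(combined["amount"].sum())  # 430
```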
Speaker 1 (45:50):
Yeah, that makes sense. We only have a couple more minutes left. It's been a great conversation. Based on all the things that we have talked about, are there any, not necessarily small, but quick but very, you know, very impactful moves an organization can make right now to make sure they're at least on the right track, if not making very tangible progress?
Speaker 3 (46:10):
If I had to pick one, it would be extremely helpful, to get everyone started, to just start asking the data questions: who owns it? Where is the data located? Maybe the technology stacks it's supported on. And just getting a lot of that preemptive information up front. If you're in that stage and you just don't have that information, that could save you a lot of time both on an
(46:32):
engagement, and it can also get you to time to value a lot faster. And I would lean on probably what would be a data governance and data dictionary framework as well.
Speaker 2 (46:42):
I would say just get started with the use case. It's so, so very important, and I can't say it enough, that you have to know what is the outcome, what are you trying to achieve, and with that you can get started. But yeah, that's it. I love IT and I love tools and I love technology.
(47:02):
But just to go explore with tools, or build it and they will come, the field of dreams approach, I don't think necessarily works out very well.
Speaker 3 (47:14):
It, unfortunately, it doesn't. I've been on a couple of engagements where, you know, it's kind of like, we don't have what you mentioned, nor do we have the things I just talked about. We don't have any idea. It was like, well, we've bought this tool, we bought this platform, and we're going forth regardless. So we're building the train track as we're riding the train down the line, and eventually you just run out of track, you hit some show-stopping roadblock, and
(47:36):
you have invested all this time and money and resources and you just don't know what to do. And usually, unfortunately, sometimes like that, they just get scrapped, and that not only kills momentum for that project, but it likely kills a larger AI initiative within an organization. Yes, because they're left with that bad taste in their mouth. They're like, this didn't work for this small thing, how is this going to work for AI?
Speaker 1 (47:58):
Yeah, yeah. Well, Jonathan, Bill, thanks so much for joining us here on the show today. We'll have you back again sometime soon. This was a great conversation. Awesome, thanks for having us. Okay, thanks to Bill and Jonathan. Here's what we learned today. First, data maturity is a journey, not just a checklist. Most organizations aren't as far along as they think, and while the temptation to rush forward with AI is strong,
(48:20):
without a clear data strategy, it's easy to lose your footing and fall behind. Second, success starts with focus. A single, well-defined use case, one that ties directly to a measurable business outcome, can spark the momentum needed to go further. Without it, even the best tools can lead nowhere. And third, culture matters.
(48:41):
Becoming a data-driven organization means building trust in the data itself, in the teams preparing it and in the processes that govern it. The bottom line: AI doesn't fail because it's not powerful enough. More often than not, it fails because the data feeding it isn't ready. If you liked this episode of the AI Proving Ground podcast,
(49:02):
please consider sharing it with friends and colleagues, and don't forget to subscribe on your favorite podcast platform or check us out on WWT.com. This episode of the AI Proving Ground podcast was co-produced by Naz Baker, Cara Kuhn, Mallory Schaffran and Stephanie Hammond. Our audio and video engineer is John Knobloch, and my name is Brian Felt. See you next time.