Episode Transcript
Available transcripts are automatically generated. Complete accuracy is not guaranteed.
Speaker 1 (00:00):
Welcome to the IPX TrueNorth podcast, where we connect people, processes, and tools.
Hello everybody, and welcome back to another episode of the IPX TrueNorth podcast, where we break down the biggest challenges facing organizations in the digital age.
Today, we're talking about one of the most frustrating hurdles
(00:22):
that companies face when trying to adopt AI: data readiness. If AI is only as good as the data it's trained on, getting that data in order is critical. But why is it so difficult?
So to answer that question, I'm joined by Mr. Tony Cahill, the mind behind Crystal Onyx, a platform that removes the
(00:44):
bottlenecks in data management and helps organizations take control of their data and AI strategies. Welcome back to the show, Tony.
Speaker 2 (00:53):
Thank you very much for having me, Brandy. Great to see you again, and I really appreciate the time and the invitation to join you.
Speaker 1 (01:00):
I love it. I love having you back on. So, Tony, to help our listeners, walk us through a little bit of the AI project cycle steps. It'll get us grounded for the rest of the conversation.
Absolutely.
Speaker 2 (01:14):
You know, an AI project cycle is typically defined in five steps, and it starts with the problem. You know, problem identification: what are we trying to solve, what's our issue? And then, more importantly, what's the scope, and how do we break that down to a use case, right, so that it's solvable and also creates an outcome that we can track. Next is the data acquisition.
(01:35):
Where's the data we have? Is it in silos? How do we start to go through it, explore it, analyze it, and start to curate data sets that we can use for submitting into either the AI or the machine learning and operations, which is really
(01:58):
your modeling? And so from that comes the output tokens: what do we get out of it? And then there becomes the evaluation, right? That's the final, fifth step: evaluate what we've done. Did we hit the mark? Do we need to make changes, do we have to make tweaks? And at that point it becomes just an iteration. We iterate and repeat, and you could iterate and repeat all the
(02:20):
way back to, maybe, step one, okay, we have to redefine our problem or our use case, or back to step two. So you can iterate through any of those steps, tweaking all of the elements to really figure out and get to your optimum outcome. And this is a challenge I see facing a lot of companies, and as you go through this, there are going to
(02:44):
be breakdowns. One of the things that I thought would be interesting to talk about today was the idea of data readiness, because I see that as one of the most challenging issues that organizations really struggle with. How do you get the data structured? Where is it? Are we getting everything we need?
(03:05):
Are we getting the right information and the right data? And really, we see that this applies both to generative AI and to agentic AI projects.
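The iterate-and-repeat shape of that five-step cycle can be sketched in a few lines of code. Everything here is a hypothetical stand-in: the function names, the 0-to-100 scoring, and the stopping rule are invented for illustration, not part of any particular framework.

```python
# Illustrative sketch of the five-step AI project cycle described in the
# conversation; every function here is a hypothetical stand-in.

def identify_problem():
    # Step 1: problem identification, scoped down to a trackable use case.
    return {"use_case": "classify support tickets", "scope": "narrow"}

def acquire_data(problem):
    # Step 2: data acquisition, exploring silos and curating a data set.
    return {"rows": 1000, "for": problem["use_case"]}

def train_model(data):
    # Step 3: modeling (the machine-learning-and-operations step).
    return {"trained_on": data["rows"]}

def generate_output(model):
    # Step 4: output, e.g. tokens or predictions from the model.
    return {"predictions": model["trained_on"]}

def evaluate(output, iteration):
    # Step 5: evaluation. The score simply improves each pass here to
    # simulate iterative tweaking; a real project measures real outcomes.
    return min(100, 60 + 20 * iteration)

def run_cycle(target=90, max_iterations=5):
    problem = identify_problem()
    score = 0
    for i in range(max_iterations):
        data = acquire_data(problem)      # step 2
        model = train_model(data)         # step 3
        output = generate_output(model)   # step 4
        score = evaluate(output, i)       # step 5
        if score >= target:
            return i + 1, score           # hit the mark
        problem = identify_problem()      # iterate: redefine problem/use case
    return max_iterations, score

passes, final_score = run_cycle()
```

The point of the sketch is only the control flow: every step can send you back to an earlier one until the evaluation clears the bar or you run out of iterations.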
Speaker 1 (03:15):
You know, you and I have talked in the past about the data value chain. Do you want to talk a little bit about that? I think it kind of lines us up as well for this discussion.
Speaker 2 (03:24):
Sure, thank you for asking that. So the data value chain really runs through, and becomes part of, the whole AI project. Because one is: how do you capture, how do you identify this data? As we're going through it, every organization has just been storing all of their files, all of their data, for years.
(03:46):
There's value to it, right? There's value to the organization. If there wasn't value, why are you storing it, right? So, with that, how do you start to identify how we can convert this from something we just store into an asset that actually can be monetized or has an ROI component?
(04:07):
And so, as you're going through this and applying either machine learning or AI elements to it, you're creating inference, you're creating outcomes to better understand the data, so you can then have strategic value: how to create automation, how to create analysis, how to do
(04:30):
some different modeling and predictions. So, with that, the data itself, those files, have the ability to keep metadata with them, information, more about them, that enriches them and starts to really tell the story about what a file actually can mean. And then how it can be applied is where we start to look at, okay,
(04:53):
the data analytics, the data insights, how that applies and can then create value for the organization. And so it's an interesting concept, because everyone has all of the data, but it's how you go through that process to really create that value chain. One of the challenges with that process is: how do we
(05:14):
know that we have everything that's relevant over the years, 20, 30 years? I have just silos and silos and silos of different data that we are either actively using or have actively used in the past, and that has a lot of intrinsic value that can be pulled from it.
(05:35):
How do you make sure that you can easily get access to all of it? And so I see that becoming a challenge, a lot of times, for a lot of larger organizations.
Speaker 1 (05:44):
This is perfect. So, talking about the data silos: with the acquisitions that companies have, or just becoming global with the multiple sites that they have, it's really easy to see how quickly this data can grow and how complicated your environment can get. It leads us right into what we want to talk about today with the problem statement, getting back to AI: why do
(06:07):
organizations struggle to get started with AI adoption? It's really this core issue we're talking about: why is it so difficult for companies to get their data ready for AI? So tell me a little bit about that.
Speaker 2 (06:20):
Yeah, it's really interesting, because AI has this big mysticism, right? It's this shiny thing. Ooh, it's really cool. Everyone's like: we need AI, we want to do it, we know we need it. Okay, so then what? But what we're finding is, once you kind of get your mind around it, the biggest challenge isn't AI
(06:42):
itself, or being able to implement the AI project itself. It's making sure you've got the right data that you're feeding into it. So, you know, that whole idea about data readiness. For an enterprise business, it's typically a high-stakes environment. There could be a lot of great and good things that come from it, but there can also be a lot of caveats and a lot of
(07:05):
issues and challenges and possibly mistakes that you want to make sure you're aware of, so that as you go into it, you've got your eyes wide open. But we see that at the heart of any AI project cycle, of those five steps, it really comes down to three key things. First off: what is the problem you're trying to solve?
(07:26):
If you don't know the problem, you're not going to get a good output. It's this whole idea about when you prompt an AI: the clearer your prompt is, the better the outcome you're getting, and that's really, really important here. The next thing is: what are the data sets that you're going to use? And, within that data set, how do you know that it's everything
(07:48):
you need in order to be successful? And, more importantly, are we missing anything? Or are we just doing too much? Because you can throw in everything and then realize, well, we only needed maybe 50%
Speaker 1 (08:04):
of what we did. So it's really the idea of: is this what we needed?
Speaker 2 (08:09):
Is this the right data set? How do we curate it and start to get our arms around it? It's easier when you're smaller and have less data to go through, but you start compiling it and it just compounds. Once you start getting multiple systems, multiple disparate locations and different types of technologies, it can get really
(08:29):
challenging to figure out how to go through this. Then the other important piece, probably the last piece of this, is: how are you going to monitor, how are you going to test the outcomes, and then be able to make the changes, to iterate and tweak it so you can get to that final optimum outcome that you're looking for? What's interesting is that I've been researching
(08:51):
this now for a long time, and a long time in the AI project world is probably four months, maybe six months. What I find in all the recent studies, and these are articles from Forbes, IBM, and different consulting groups published on the web, is that 42%
(09:13):
of respondents across the board identified data availability and data quality as one of the top five challenges for implementing a successful AI project, and usually it's within the top three, depending on who you talk to. Once you start to add transparency, data governance, and oversight, and combine those together, it's 100%.
(09:36):
These are the challenges that everyone's dealing with, and if you don't have a good answer or a good approach for them, you're not going to have a successful outcome for your AI project, and the result is going to be lost time and lost cost. It behooves any company to create a plan for a successful
(10:08):
implementation.
Speaker 1 (10:11):
Makes sense. So, you know, many organizations struggle because they have old repositories that may be hard to access, or they have data in the cloud and they're struggling to utilize it if they're trying to pull it on a daily basis, or many other potential challenges. And so, to be clear, AI isn't the roadblock itself in any way.
(10:31):
It's the data disorganization, or silos, or access and understanding of what data they have, when you think about this holistically, that companies have to get their hands around before AI can really be effective and prove the value that it's promising. So, cleaning up your house, per se, right?
Speaker 2 (10:52):
Basically, yeah. Because if you look at it, I mean, data is the foundation of any AI system. If you don't have it, you're not going to get anything out of it, and it doesn't matter if it's structured, unstructured, or third party. But one of the biggest things we see that is a challenge is
(11:13):
dealing with that unstructured data, the big data. Companies have been buying a bunch of storage, whether it be cloud, libraries, archives, all silos, legacy systems. And AI projects, the technology, I'm just amazed at how fast it's been developing, and it really has gotten to the point where it has the potential to deliver some incredible insights, incredible automation. And some
(11:34):
of the new things that are coming out with the agents, I'm blown away with. I mean, I remember it would take us months to do something that now, with like five, ten lines of code, you can come up with the same result. So it's going to have a very significant effect on how we're going to use technology, to align this to how we operate on a daily basis within a company and how we deliver
(11:56):
value to customers. And it comes back to this: if that company doesn't have access to a clean, well-organized data set, then you're going to be sitting there trying to figure out how to deal with all the fragmented records. How do we get access to this? And it can get costly just for that aspect of it, and that's one of the things where we really identified that we could
(12:17):
provide value and be of help and service. We also identified another significant challenge that customers are facing: how do you integrate the data and the data sources, the data from various different sources? So, again, you know, siloed data stored in disparate systems, making it difficult to compile and create a cohesive data set.
(12:39):
So then you start adding IoT devices, you start adding remote systems, remote sites, and all of that just adds to the complexity of this data integration. And so some organizations will say, well, just give us everything and then we can sort it out later. Well, you could do that, but it's kind of like saying, hey, I want to clean up my house by buying
(12:59):
another house and moving all our stuff over. You end up with exactly the same problem. So, with the issue of data quality alone, you start to see a cost develop from this, and you start to identify: geez, as we're going through this, how much revenue have we been losing?
(13:19):
How many operational setbacks have we been having? How many gyrations have we had to do to make this work? And just that alone, not even including AI: it's estimated that enterprise companies are typically losing around 10 to 12 million dollars annually just due to poor data quality and how they're accessing it.
(13:41):
So, kind of cool: taking on an AI project actually gives us an opportunity to go back and clean the house, to really start to make things more effective. And we see that with Crystal Onyx, we have the ability to significantly improve the outcome, not just for the AI project cycle, but also for the organization in general.
(14:04):
And the value-add proposition that we're delivering is that we know we can ensure data quality, significantly decrease the time and cost it takes to do this, and enable the creation of the data value chain. So if you look at just that decrease in time and cost: if we go through and take everything and upload it to the
(14:26):
cloud, and I don't care what provider you want to use, you're basically going through and saying, okay, team, go find it, copy everything, move it up there and then we'll figure it out. Kind of kicking the can down the road, in some respects.
Speaker 1 (14:41):
For sure, and I know some of the numbers that we've talked about are quite staggering with regards to what people are paying today, or are willing to tolerate with regards to time, if they're utilizing, let's say, a third-party company or some additional resources to try to sort through their data problem, to try to pull this together, and it's
(15:03):
just super prohibitive. You know, the timeline and the cost just to understand what data we have, before we can even get to this point, is crazy staggering. I loved looking at your website and reading about some of the companies where you really streamlined this approach to AI adoption and got them there very quickly: potentially, from some of the
(15:47):
things we've discussed, over 10 to 12 months of data assessment and organization down to maybe even weeks for certain organizations. So talk to me: how does Crystal Onyx eliminate these obstacles?
Speaker 2 (15:57):
Well, it really comes down to the fact that we cut our teeth on this early on. Working with high-performance computing centers, we saw that there was going to be a challenge: in these organizations, which were kind of a precursor to AI, you had all of these different silos of data that never talked to each
(16:17):
other. So you had to have different systems to pull data out, find something, transfer it over to another system that would then transfer it to another system. It wasn't really complex, but it was very manual, right? And when you start doing this across large data sets, it just becomes very unruly. So we got an early view of what we saw was going to be
(16:40):
coming. We worked with organizations like Bosworth doing AI autonomous vehicle testing, and we saw the German Climate Center, which is the second largest supercomputing center in Europe, and how they were being challenged with all of the different types of storage environments and what they were currently trying to do to make
(17:04):
it work.
So we were able to come in and, first off, just scan everything. What's great about this is we saw that it applies directly as an advantage within an AI project for a company: we can go to any storage system that they have and scan it very quickly, meaning we're doing a million or more
(17:27):
files per hour, and start to create a global namespace. That global namespace is just a virtual representation of everything that we're seeing within their environment. So nothing's moved, nothing's touched; we just have a virtualization of it. Once that's done, we know everything about the files, we know everything about the storage.
(17:47):
It's all there, and you can start running reports. You're able to then query and, using our intelligent query system, start to analyze and identify everything that you have without ever having to touch the files. And what's great about that is it's not just a one-time shot. It's not like we're just uploading and adjusting it somewhere. The system actually goes back, based upon the business
(18:11):
rules set by the customer or by the company: is it once every hour, is it once every day? And it goes through and updates, making sure that we have all the latest and greatest files and everything that's been updated. So you're continuing to work with a live model.
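The scan-then-rescan idea can be sketched generically: walk a storage tree and record only file metadata into an in-memory catalog, a crude stand-in for a global namespace, without ever reading file contents. This is an illustrative sketch of the general approach, not Crystal Onyx's actual implementation.

```python
# A minimal sketch of a "touchless" metadata scan: record file metadata into
# a catalog (standing in for a global namespace) without reading any data.
import os

def scan_to_namespace(root):
    namespace = {}                           # virtual representation only
    for dirpath, _dirs, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.stat(path)               # metadata only; contents untouched
            namespace[path] = {
                "size": st.st_size,
                "mtime": st.st_mtime,        # lets a later rescan see changes
            }
    return namespace

def rescan(namespace, root):
    # Re-run on whatever schedule the business rules set (hourly, daily)
    # and merge in anything new or changed, keeping the catalog live.
    namespace.update(scan_to_namespace(root))
    return namespace
```

A scheduler would call `rescan` per the customer's business rules; queries and reports then run against the catalog, never against the files themselves.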
The great thing about that is we can start using custom metadata to create curation sets of
(18:36):
everything that's there. So what's primary? Really customize it and start to break down that data set: what's primary, what's most critical, what's second-level critical, what's third-level critical, and even start creating iteration groups, so when you're testing you can do A/B testing and compare outcomes. It's a really easy and quick way to do this.
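That metadata-driven curation step can be sketched as a simple tagging pass over the scanned catalog. The field names (`tier`, `ab_group`) and the extension-based criticality rule are invented for illustration; a real deployment would define its own schema.

```python
# Hypothetical sketch of metadata-driven curation: tag each cataloged file
# with a criticality tier plus an alternating A/B iteration group.

def curate(namespace, primary_exts=(".csv", ".parquet")):
    curated = {}
    for i, (path, meta) in enumerate(sorted(namespace.items())):
        tier = "primary" if path.endswith(primary_exts) else "secondary"
        curated[path] = {
            **meta,
            "tier": tier,                            # custom metadata: criticality
            "ab_group": "A" if i % 2 == 0 else "B",  # iteration group for A/B tests
        }
    return curated
```

Because the tags live in the catalog rather than in the files, refining a curation set, or splitting it into A/B groups, never touches the underlying storage.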
(18:59):
With the customers that we've done this with, where they were trying to go through and do it themselves, one great example use case is a customer, as you mentioned or alluded to earlier. They went through, and they had around three petabytes they were trying to create the data set on. It took them about 10 to 12 months, and they don't really know what they got; they got some of it. On their second pass, a different project but
(19:22):
using the same data set, we were able to go through and get absolutely everything, and in fact we found that they were actually missing files because they couldn't get access to them. So we found another petabyte's worth, so we actually had four petabytes. We were able to go through, do this, get a complete, curated data set, and do it within eight weeks.
(19:42):
Their whole effort went from 10 to 12 months, with, I don't know how many, like 12 people involved, down to eight weeks with three people. So just that cost alone: what does that mean for a successful impact?
(20:03):
And then does it mean that, gosh, we could even do two or three or four different projects within the same time period that it would otherwise take us to do one, right? So how much more impactful can we be? It's taking a different approach, it's thinking outside the box, and it's realizing that if I have all the data, I don't necessarily need to work with all those files and create all this translation up front. First let's just take a snapshot of it, virtualize it,
(20:26):
and then work with the virtualization, the reporting, the queries. Take that curation set, get to what we need, and then take our next step, which is, okay, let's approach the modeling. We enable that too, but that's a separate step. The first thing, the most important, is just getting your arms around this data readiness.
Speaker 1 (20:47):
It's just a huge shift, right? I just want to reiterate: that's just a huge shift from an experience where people are usually doing this manually, where they've pulled in 10, 15, 20 people, sequestered, to kind of manage it. That's what I typically see whenever there's a project where they need to migrate to a new tool, or move a bunch of data, or they're acquiring a company
(21:09):
or whatever this needs to be for. They just grab, they sequester, a large number of people, and it takes months for them to go through and sort this stuff. And so what I hear you saying again is that you can take an example, and you've lived this, from a 10-to-12-month manual scenario down to eight weeks, with just a couple of people
(21:31):
managing this process.
Speaker 2 (21:32):
And it's unheard of, right, how fast you can do it if you have the right tool sets.
Speaker 1 (21:37):
These are the kinds of efficiencies that organizations are really searching for, right? We usually spend so much more effort for much less of an improvement, so this one, to me, is just quite grand in scale. So I'm just really excited that we're talking about this one. Yeah, thank you.
Speaker 2 (21:52):
Thank you, and I really appreciate the opportunity and the invitation. What's cool about it is, even though it's a new tool, you're not actually changing anything. We're not saying, oh, you've got to do something completely different; you're just going about it a slightly different way. And that's what's really nice: we're not forcing anybody to change the way they're going to do business, or the way they're
(22:15):
going to, you know, change their organization or their approach. We're just going to help make it easier. And what we've found is, because Crystal Onyx is able to connect all the dots, and we create the virtualization abstraction, we start to enable that flow of data, and this extends across all these silos. So you've got your traditional NFS and SMB data shares, NFS for
(22:41):
Linux and SMB typically for Mac and Windows environments. You've got all the legacy storage environments that go back years, even back to mainframes. Also parallel file systems, so you've got Lustre, GPFS; these are all very, very fast file systems usually used for supercomputing and for AI, and for any tier zero
(23:03):
where you've got your critical storage. Then we also extend across to S3, object stores, clouds, multi-clouds, and we're even able to manage tape libraries directly.
We had one customer where we were able to go in and, over the course of four days, migrate the access of their entire 350
(23:23):
petabyte library archive and make it available to an open system. It was going from a legacy environment to an open-system environment, and it took us four days. Everyone else was coming in and saying that's going to take 18 months, 20 months. It's like, well, no, we can just think about it differently, and using what we can do with the virtualization, we can make
(23:45):
it available to you within four days. So those are really the cool things that we can get involved with, and really help make a successful and impactful difference. And the thing I like about it is, this is not a one-shot thing. You don't just upload it and be done. You keep everything where it is, you keep doing your business. We'll come in and just do a scan, and it's a touchless scan.
(24:08):
It's very easy, and it's deployed within the environment, so you don't have to rely on a third party or anything external. This can be deployed within a company, within their data center, in their environment, either on a bare-metal server, a VM environment or even, if you need to, a cloud compute
(24:30):
environment, or all of the above, all working in tandem together. It's so flexible, and because it's a multi-threaded, multi-process system, that's really the key in this: it's very fast and we're doing everything in parallel. So it's not just doing one thing at a time, it's doing 10, 15 different processes all at a time, and, as I mentioned earlier
(24:54):
, typically we see a three-node cluster will often be able to do this at a million or more files an hour.
Speaker 1 (25:01):
What I love about what you just said, and I think it's worth reiterating, is that Crystal Onyx gives people the ability to bring this in-house. It's a utility that they use, that they can run themselves. You don't have to have a third party do this. You bring the tool into your own facility, and your own people are doing this. They're able to use and access this data themselves.
(25:22):
You get them all up and running, and then the intent is that they've got this on their own.
Speaker 2 (25:28):
Yeah, yeah, and that's the thing, right? Once you have it, there's no reason not to do it on your own. If you want to have somebody else come in and do it, great; it's a great tool. You know, it's a force multiplier. You're not looking to replace anybody. We're just looking to help make things run easier and faster, because really that's what it comes down to: how fast can you do something, with the least amount of effort, in a way
(25:50):
that benefits the organization?
Speaker 1 (25:51):
Always, always.
[no transcript available for this segment]
Speaker 2 (26:27):
What that means, really, is that it's not like you have to clean the data up because it got dirty; it's that the data you have applies to what you need, right, so there's not a lot of fodder in it. The way we go about it: when you're looking at everything, you're trying to dig through these haystacks, right, and figure this
(26:50):
out. So you start looking at trends, you start looking at patterns. We get this interesting ability by running these intelligent queries, and part of the capability is not just that we look at the file system information, or the extended metadata, which is powerful to start with. Let's go even deeper. You start, you know, introspecting a file.
(27:10):
Okay, I'm going to open it, see what's here, start looking at the trends, and we start identifying, okay, what's relevant. It starts to become pretty interesting: we're looking for everything that has this attribute with it, and it could be an owner, who created it, what department it came from, what was the time frame, what
(27:31):
was the project. All of that now becomes relevant for how you can classify the data.
And that really is it: how do you classify, right, and then, with that, apply governance? How are we going to manage it? Is this in band? Is it going to be part of our project, or is it going to be out of range, something that shouldn't be part of the project because it includes
(27:54):
personal information? You know, we don't want anything that's going to include people's Social Security numbers, or their credit card information, or anything about their HR information, which sometimes can get included unknowingly. So if you have the ability to use filters, like we're doing, you can get very granular on how you start
(28:18):
to create this data. And as we're identifying it, we're grouping it, using the custom metadata fields to group it into a curated set, and then taking that curated set and fine-tuning it down even further. And it's all being done easily. It's using the power of metadata, creating unique
(28:41):
schemas and fields, so that we can create these granular, curated data sets and do it very easily. Again, it only takes one or two people to go through this and start to fine-tune it. And when you're applying it, you're applying it not just to one file at a time; you're applying it across hundreds to thousands of files
(29:04):
at a time to create that set. So that's where the quality really comes from: you're getting something that is part of answering the question, or solving the problem, that you've identified, to create the best possible outcome.
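The governance filter described here, excluding files that contain personal information from a curated set, can be sketched with a couple of pattern checks. The regexes below are deliberately simplistic illustrations (an SSN-shaped value and a 16-digit card-shaped value); real PII detection needs far more robust rules than this.

```python
# Simplified sketch of a governance filter: drop any file whose content
# matches a PII pattern before it joins a curated data set.
import re

PII_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),   # US SSN-shaped values
    re.compile(r"\b(?:\d[ -]?){15}\d\b"),   # 16-digit card-shaped values
]

def exclude_pii(files):
    """files maps path -> text content; returns the paths safe to include."""
    safe = []
    for path, text in files.items():
        if not any(pattern.search(text) for pattern in PII_PATTERNS):
            safe.append(path)
    return sorted(safe)
```

In the catalog-based approach described in the conversation, a check like this would run during file introspection, and the result would be recorded as metadata (in range / out of range) rather than applied per query.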
Once you've got your final data curation, and you've identified that we've got quality, we've got everything we need, now you can use the workflow engine of Crystal Onyx to say, okay, what are we going to do? We're going to start moving the files; we're going to copy them from where they are to somewhere else. Then we'll go in, and we have what we call an "out chest," which is really kind of a checkout. It's a copy-out, and you can copy out to any system.
(29:48):
If you're going up to Snowflake or Databricks, which is a popular use, other GPU providers, other cloud providers, or you have something you're going to move into a parallel file system and run through GPUs locally, it just gives you all the flexibility you need to take, within weeks, this curated data set that you're now
(30:09):
applying to your solution of choice. So it really streamlines that whole project from start to finish.
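The copy-out (checkout) step amounts to copying a curated set to a staging destination while leaving the originals in place. This is a generic sketch of that idea; the downstream hand-off to Snowflake, Databricks, or a parallel file system is out of scope here.

```python
# Minimal sketch of a "copy out": stage a curated set of files at a
# destination, copying rather than moving, so sources stay where they are.
import os
import shutil

def copy_out(curated_paths, dest_dir):
    os.makedirs(dest_dir, exist_ok=True)
    copied = []
    for src in curated_paths:
        dst = os.path.join(dest_dir, os.path.basename(src))
        shutil.copy2(src, dst)    # copy with metadata; source stays in place
        copied.append(dst)
    return copied
```

The copy-never-move design is what lets the catalog stay authoritative: the originals remain untouched in their silos, and the staged copies are disposable.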
Speaker 1 (30:17):
Got it. So you get to scan and identify all the data extremely quickly, in any repository whatsoever, and then you're using that metadata: you can change your metadata, you can categorize it, you can protect certain data, critical data. You can utilize that as your mechanism to do anything that
(30:37):
you need, and what that does is give you the power to make really good decisions with your data: where you want to move it, what you want to do with it, how you want to categorize it, how you want to organize it for AI or for any other tool whatsoever, for any kind of product that you need. It really gives you the ability to automate things, too.
(30:59):
So if you need to constantly move data that's coming in from one source, or from one site to another site, you can do that continually, and it's a lot less manpower than the way organizations are doing this today.
Speaker 2 (31:12):
Correct, and that even includes really sensitive information, right? Say we've got a bunch of different sensitive data and files, and we're creating a data room, which is kind of: we're going to isolate this, because we want to run it through an AI project but we don't want to create any impact, right?
(31:33):
Okay, so, wonderful: you can actually use the virtualization power of Crystal Onyx to export that as a file system, so you can create a whole virtual data room within your own place and not have to actually build or separate anything. So it just gives you a lot of capabilities to do things a little bit differently.
Speaker 1 (31:52):
You said we should think differently right there, and that's one thing I really want people to take away from our conversation today: why organizations need to rethink their AI approach, right? So, for companies looking to adopt AI, or starting to dive into this, what's your biggest piece of advice for these leaders, or even just managers within an
(32:15):
organization, who are looking to use AI?
Speaker 2 (32:17):
The biggest thing is: make decisions that are aligned with what you're trying to accomplish. Align to the business. Because everything, and I don't care what it is, every piece of technology takes a certain approach, and that certain approach is the way they approach a problem, which may or may not align with the way the company is operating
(32:39):
, and this includes us too. The way we're approaching it is very, very specific, and we see that it aligns for a lot of companies, a lot of customers. That doesn't mean it's the de facto standard for everyone. So what I do really quickly, when we first talk to somebody, is understand: okay, what are your top three challenges? How can we help and resolve them?
(33:00):
And really quickly I can just tell: either we're going to align very well, or it's only going to intersect, and at that point, if it's just an intersection, it doesn't make sense. I see this over and over again:
(33:32):
trying to put a round peg into a square hole. It takes a lot of effort, you break things along the way, and it might not be the best way. So take a step back and say, let's approach this a little bit differently. That's really the great thing: ensure alignment, and that includes not just the culture and the operation, but also the people.
(33:53):
And then how do you remove unnecessary steps?
So one of the big things that we look at, and part of this whole idea of virtualization, is why do we need to copy everything out first?
Why don't we figure out what we can do, what's needed?
How do we start to create the data set and curate it and
(34:14):
classify it before we start the move?
I mean, that saves so much time.
It gets away from this idea of data lakes, right, moving everything up and creating a lake so that you have the ability to do the same thing, so we can get away from silos, get away from disparate systems, and put it all together in one lake.
But the problem is, when you do that, your lake starts to
(34:34):
become a swamp.
And so what we decided is we have the ability to actually virtualize it and create a virtual data lake without ever having to move anything, and then get very clear on what this is, using the tools that we have, and then start to move that up so it becomes, you know, like crystal clear waters at that point, and
(34:55):
we found that just by doing it this way, companies actually start to move faster, start to leverage AI, leverage all the different projects and the things that they're wanting to do, and they don't even have to outsource their data to third parties anymore, right?
So if you're really sensitive and you're worried that these types of processes to move to third parties
(35:18):
and clouds and what have you, there's always a potential for a security breach, or that intellectual property or trade secrets can get out.
So this keeps it within the four walls.
So you have a lot of power and control over where and how you are implementing these third parties.
And I'm not saying don't use them.
I'm just saying let's use them, and use all of this better, to
(35:42):
really get to the outcome that everyone's wanting to get to.
Speaker 1 (35:46):
Yep, yep.
Use each for what its strong suit is.
Speaker 2 (35:49):
Exactly.
Speaker 1 (35:50):
And make sure you're focused on that.
So you know, we spend a lot of time on the front side of this, right, data cleanup and all of that, getting ready to use AI.
And you know, let's talk a little bit here, as we kind of come to the close of this discussion, about the final step, right, closing the AI project cycle loop.
And I think, you know, you've talked about step four of the AI project cycle, where we start the
(36:12):
modeling step.
Now that Crystal Onyx is involved, within a matter of weeks data readiness is completed, the data classification steps have been completed, a quality data set, or data sets, have been curated, and now AI training can begin.
So tell me, what's next?
Speaker 2 (36:33):
Well, before we go to
what's next.
I know we kind of touched on this, but I would say don't assume AI will solve all of your problems.
Speaker 1 (36:42):
Okay.
Speaker 2 (36:43):
Right, you know, if
you think it's a panacea, it's
not.
If your data is not ready, it's not quality, it's not set up and ready to go, and you're just focusing on the AI tools and the cool shiny things, it's going to fail, right? Because it really has to start with smarter data management, and that's really kind of what we were just talking about.
And so the faster you can efficiently clean up, classify,
(37:06):
curate, the better your AI outcome and the better your insights will be.
And there's a consultant.
Her name's Leah Pinzer.
I came across her about a month ago.
She's an AI advisor, she'sformer CDP of Microsoft and she
had a great quote.
I loved it.
She says that AI isn't a solution, it's a force
(37:28):
multiplier.
And you look at what's going on, and it really is.
It's just amazing, the outcome you can create if you do it right.
And she goes on to say, if your data goals and processes are broken, AI will amplify that chaos.
And the number one item on the AI project checklist that she
(37:52):
preaches is data readiness: make sure your data is clean, make sure it's structured, make sure it's accessible.
And so those are the things that we started to focus on originally.
We cut our teeth on it in high-performance computing, and we're working in some of the most challenging environments in the world, where we're going through and scanning hundreds of
(38:16):
petabytes to even exabytes in a single pass, and then being able to start taking action, copying and migrating and doing all the cool things with it, but doing it at scale.
And so I think, as you're going through and starting now to apply this from an enterprise standpoint, there's a lot of things that we've learned over the last number of years that
(38:39):
are perfect lessons learned for creating a successful AI project.
Speaker 1 (38:44):
You know, now you're speaking my language, right? We're talking about business processes first, and then the tools follow, right?
Yes, make sure that you've got good ways of working, good best practices, all of that good stuff, and then you want to focus, like we've talked about, on your pain points, your problem statements, and then make sure that we really understand those well before we go and expect any kind of a
(39:06):
magic wand to do anything for us.
So we just really need to treat it like a true project, really scope it out properly, and have clear expectations.
Speaker 2 (39:15):
Absolutely,
completely, 100% agree.
Speaker 1 (39:17):
I love it.
Tony.
We've talked about a ton of things here today.
We've had a lot of good takeaways.
First, you know, if people are interested in learning more about Crystal Onyx, or having a conversation with you about anything that we've talked about today, or just the capabilities of your tool and your solution, where can listeners learn more about Crystal Onyx?
Speaker 2 (39:36):
Absolutely.
Thanks, Brandy.
I'd invite them to come and visit our website.
It's crystalonyx.com, C-R-Y-S-T-A-L-O-N-Y-X dot com, or send an email.
Send it to info@crystalonyx.com, again, C-R-Y-S-T-A-L-O-N-Y-X.
(39:56):
That comes right to my email.
I'll see it and respond back.
Let us know what we can do to help.
Any questions, anything we can do.
We're here to help, and I just love talking with different people, different organizations, because every conversation is a chance not only to share what we're doing, but also to learn more.
Speaker 1 (40:17):
Yeah, I highly encourage everyone to do so.
I know Tony's been working on some rapid start programs as well, to help organizations get a taste of the benefits of what Crystal Onyx can provide.
And you can do that very quickly and get a little trial, a little pilot run, before you dive deep.
Yeah, thank you, that's a great way to start.
Speaker 2 (40:33):
I mean, it's low risk, right? Because we kept thinking, we have this great thing that we spent years working on, that we think we've really fine-tuned, and we wanted to open it up.
So, you know, whenever you start any type of new technology, the implementation goes through an evaluation.
We're not going to just risk everything we're doing, put something in and say,
(40:55):
well, okay, well, I hope it works.
So, thinking through this, I thought, well, let's come up with a different approach on this.
And so let's do a rapid start.
And so, for a very nominal fee, you get three months: download the software, install it on your system in your environment, whether it's a VM, on a bare metal server, or
(41:18):
even on cloud compute, whatever works for you, and then you'll be up and running within two to three hours.
And then, within another hour or two after that, you've now scanned probably a couple million different files within your system and can actually start taking steps through this.
(41:39):
So within half a day you're already on your way toward data readiness, and you can start that process of, okay, how do we apply it?
How do we take this, start applying our model, and even start playing with it?
I encourage everyone to try it out.
It's the rapid start program you can see on the website.
Send me an email at info@crystalonyx.com.
(42:01):
I'll send you a link to get started and then we can talk further.
Speaker 1 (42:06):
I love it.
All right, Tony, thank you so much, appreciate the time today as always, and we look forward to talking to you again here back on the podcast.
Speaker 2 (42:15):
Great, Brandy.
Speaker 1 (42:23):
Thank you so much.
Great to speak with you.
Take care. Bye.
Thank you for tuning in today.
Speaker 2 (42:26):
Don't forget to subscribe and review the show, and for more information on IPX, visit IPXHQ.com.