May 3, 2023 8 mins

In this episode, Brick and Caleb discuss the importance of preparing your data environment for the arrival of AI tools. They highlight the advantages of a data lake, a flexible repository for storing and integrating data, making it easier to analyze and report on. By having an established data lake, organizations can seamlessly leverage upcoming AI tools without having to scramble to access data in their transactional systems.
 
Click here to watch this episode on our YouTube channel.

Blue Margin helps private equity owned and mid-market companies organize their data into dashboards to execute on strategy and create a culture of accountability. We call it The Dashboard Effect, the title of our book and podcast.

Visit Blue Margin's library of additional BI resources here.

For a free, downloadable copy of our book, The Dashboard Effect, click here, or buy a hardcopy or Kindle version on Amazon.
#AI #datastrategy #datalake #BI

Episode Transcript

Available transcripts are automatically generated. Complete accuracy is not guaranteed.
Brick Thompson (00:04):
Welcome to The Dashboard Effect podcast. I'm
Brick Thompson.

Caleb Ochs (00:07):
I'm Caleb Ochs.

Brick Thompson (00:08):
Hey, Caleb. So today I wanted to continue our
discussion from last week. We spent some time talking about
ChatGPT and the likely arrival of AI tools to work with data. I
thought this week it would be good for us to talk about what
you would need in place in order to take advantage of that
when it comes.

Caleb Ochs (00:28):
Yeah, hopefully we can share our perspective on
what you should do if you're in charge of some data at your
company and want to get ready for this thing. It's going to be
important. This is coming. I think it's important to stay on
top of it and understand what you can do now to make sure that

(00:52):
you're not caught flat-footed when this technology becomes
really relevant for your data.

Brick Thompson (01:00):
Yeah. And if you have your data in a place where you can point these
tools at it, you're gonna be at an advantage, for sure.

Caleb Ochs (01:10):
I know you've been doing a lot of research on
these types of things, right?

Brick Thompson (01:20):
Yeah, it's pretty astounding. It's still
early days. There's a definite hype train rolling right now, so
some of that may moderate a bit.
I think there's a lot of advantages, in any case, for any
company to get data consolidated and integrated. Whether you're
doing it for these coming AI tools or not, there's a lot of

(01:43):
other reasons. It's been an interesting evolution over the
last five or six years. Five or six years ago, when someone came
to us to build a data warehouse or do reporting, we would build
a data warehouse, create eight ETLs to pull data in from
different systems and have it in a more traditional, agile,
Kimball-style data warehouse.

(02:08):
Over the last couple of years, data lakes have really come
to be the thing that we recommend, for a number of
reasons. We can talk a little bit about that. So why don't we
start with just describing what a data lake is? For some people
it's another one of those terms that's just sort of confusing
and conceptually fuzzy. How would you describe a data lake

(02:31):
conceptually to a layperson?

Caleb Ochs (02:34):
Sure. You can think of it as like a folder structure
on your computer. You've got folders, you've got files, all that
fun stuff on a computer. Inside of a folder you have other
folders, maybe files, whatever.
But inside of that folder, you can have subfolders. So let's
think about the data lake for a second. If you're up in your
data lake, you're gonna have that same structure. So you can

(02:57):
have a folder. Then underneath it, you could have other
folders. You could have files or whatever, but typically you have
other folders. So what that would look like is your data
lake folder, and then you double-click into that. Then you see
your source system folder.
So like HubSpot, and then you double-click into that and it

(03:18):
includes all your tables from HubSpot, so you can connect
to those files and consume them and use them for your analysis
or whatever you would want.
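
The folder layout Caleb describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the lake is modeled as a local temporary directory, the table names (`contacts`, `deals`) and the helper functions are hypothetical, and a real lake would live in cloud storage rather than on a laptop.

```python
import csv
import tempfile
from pathlib import Path


def build_sample_lake(root: Path) -> None:
    """Create a tiny lake: one folder per source system, one file per table."""
    # Hypothetical tables and rows, used only to populate the folders.
    tables = {
        "hubspot": {
            "contacts": [["id", "email"], ["1", "a@example.com"]],
            "deals": [["id", "amount"], ["10", "2500"]],
        },
    }
    for source, source_tables in tables.items():
        folder = root / source
        folder.mkdir(parents=True, exist_ok=True)
        for table_name, rows in source_tables.items():
            with open(folder / f"{table_name}.csv", "w", newline="") as f:
                csv.writer(f).writerows(rows)


def list_tables(root: Path) -> dict[str, list[str]]:
    """Map each source-system folder to the table names found inside it."""
    return {
        source.name: sorted(p.stem for p in source.glob("*.csv"))
        for source in sorted(root.iterdir())
        if source.is_dir()
    }


with tempfile.TemporaryDirectory() as tmp:
    lake = Path(tmp) / "datalake"
    build_sample_lake(lake)
    catalog = list_tables(lake)
    print(catalog)
```

Browsing the lake is then just walking folders: the top level names the source system and each file inside stands for one table.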

Brick Thompson (03:27):
So those files that you're connecting to, each
of those files represents a table from the source system.
It's in some kind of fancy format that makes it...

Caleb Ochs (03:38):
Well, you could store it really however you
want. Typically, you would put it into something specialized
for a data lake that compresses it well and makes it easy to
read off of the lake. You can store other things like XML,
pictures, videos, anything you want. JSONs, CSVs, anything.
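
To make the "compresses it well" point concrete, here is a small sketch that writes a table as CSV, compresses it, and reads it back. It assumes gzip from the Python standard library as a stand-in; the episode does not name a format, and lakes commonly use columnar formats such as Parquet, which need third-party libraries. The sample rows and function names are hypothetical.

```python
import csv
import gzip
import io


def rows_to_csv_bytes(rows):
    """Serialize a list of rows to CSV text, encoded as UTF-8 bytes."""
    buffer = io.StringIO()
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue().encode("utf-8")


def read_gzipped_csv(blob):
    """Decompress a gzip blob and parse it back into a list of rows."""
    text = gzip.decompress(blob).decode("utf-8")
    return list(csv.reader(io.StringIO(text)))


# Hypothetical table: a header row plus 1,000 repetitive data rows.
rows = [["id", "status"]] + [[str(i), "open"] for i in range(1000)]

raw = rows_to_csv_bytes(rows)
compressed = gzip.compress(raw)
restored = read_gzipped_csv(compressed)

# Print the size of the table before and after compression, in bytes.
print(len(raw), len(compressed))
```

The table survives the round trip unchanged while taking less space at rest, which is the trade a lake-friendly format is meant to offer.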

Brick Thompson (03:56):
I like that simple explanation. Obviously,
there's a lot of technical stuff around how to interact with it.
How do you load it every day so that you've got up-to-date data
in it? How do people access it?
Depending on your interface, it can be as simple as clicking
through. In most cases, you're also going to build some kind of

(04:18):
a semantic layer. Well, maybe not most cases, but if you're
making the data available to regular business report writers,
you're going to do that. So you're going to have a layer of
metadata that someone can hook their Power BI or Excel up to, or
Tableau or whatever. That makes it really easy for them to
access the data they want from the data lake and will have the

(04:39):
relationships set up, all that type of stuff.

Caleb Ochs (04:44):
A good example of that is some systems, one
in particular called JD Edwards Enterprise, or JDEE. The table
names are all these codes.
They're like six-digit codes.
It's F5005. No one's gonna want to look at that when you
go double-click into your data lake and you see all these

(05:07):
codes. That's where you would use something like that semantic
layer, where you've kind of abstracted that bad, hard
to understand, convoluted naming convention and now you've got
readable names that people can digest and know where to go get
things from. That's a good use case for something like that. A

(05:28):
tool that you could just connect directly to that data lake with
Power BI. The nice thing about, again, keeping things in the
Microsoft stack, the way that we do it with Azure, and then
using Power BI and Microsoft tools, is that there's just
connectors right to the data lake. Put in your credentials,
as long as you have access. You can also control access very

(05:49):
granularly. If you're in a data lake you can trudge through the
subfolders and stuff. Power BI just gives you a connector, you
connect to it, and you get to browse the file structure like
that and pull in your data and you're off to the races!

Brick Thompson (06:03):
So you don't have to have all sorts of fancy
modeling. But as you said, in the JDEE example, you're
probably going to want this in there. You can put other tools
on top of your data lake too.
Some companies are using tools like Snowflake, Databricks, that
type of stuff. In our case, since we're working in Microsoft
Synapse, Synapse has great tools. You can even layer

(06:24):
Databricks or Snowflake on top of that, if you need to. Access is
really easy. Also, initial loading is really easy and
inexpensive compared with having to get to, you know, a fully
thought-out Kimball-model traditional database. You can
load these tables quickly and get what we just call an ETL.

(06:45):
Data pipelines, updating these every day much, much more
quickly and for much lower cost.
You actually can have your analysts accessing data really
quickly.

Caleb Ochs (06:59):
Now you get all of your data relatively quickly. It
can be super powerful. You obviously need somebody to go in
and actually make sense of it.
But it's there and it's accessible. We see a lot where
Power BI makes it seem so easy to just go connect to
different data sources and stuff. In a lot of cases that's
true. Once you start getting down the line a little bit

(07:20):
further, like publishing the report up to the Power BI
service where you get a schedule-refreshed analysis, and it's not
coming from your desktop, it's trying to make a connection from
some cloud source somewhere and you're getting all these errors,
and you're not really sure why. Having a central repository gets
rid of all that.

Brick Thompson (07:41):
In the PE world there's other reasons for having
a data lake. Maybe we'll talk about that next week. How we got
to this was thinking about all these coming AI tools. If you
have that data lake ready, and you're using it to do analytics
and using it to do reports now, when those tools show up, you'll

(08:01):
just be able to point them at that and not sort of be trying to
figure out at that point, how do we get these tools access to our
transactional systems?

Caleb Ochs (08:12):
Right. I think that's really important. As you
know, there's maybe not a mainstream tool yet. I haven't
seen one anyway. I think a lot of these AI tools are
focused really more on the consumer, you know, B2C type
stuff. That's gonna fade. It's gonna be that B2B. I'm sure it's
being worked up somewhere right now as we speak. Exactly what

(08:36):
we're talking about. It'd be a good idea to get ready for that.

Brick Thompson (08:40):
Yeah. I mean, we're seeing early examples of
it that look promising, for sure. Okay. Well, we'll look
forward to continuing the discussion next week.

Caleb Ochs (08:48):
Good. Thanks, Brick.

Brick Thompson (08:49):
All right. Thank you.