Episode Transcript
Available transcripts are automatically generated. Complete accuracy is not guaranteed.
Speaker 1 (00:16):
Hi everybody.
Thank you for tuning in to today's episode of What's New in Data.
Super excited about our guest today. We have Everett Berry from Vantage.sh. Everett, how are you doing today?
Speaker 2 (00:26):
Doing great.
John, thanks for having me on the show.
Speaker 1 (00:29):
Absolutely,
absolutely, Everett.
Tell the listeners a bit about yourself.
Speaker 2 (00:33):
Sure, so I'm the head of growth at Vantage. I started out as an engineer about a decade ago working on AI and computer vision products, and from there I've transitioned more into the growth and developer relations side. One of the main things I do for
(00:54):
Vantage is maintain this site called ec2instances.info, which is a widely used Amazon pricing comparison site, free and open source. And when I'm not digging into Amazon pricing weirdness, I'm often doing analyses of interesting things happening in the data infrastructure world, and that's the genesis
(01:19):
of the show topic, which is Databricks and some interesting stuff that Microsoft has been pushing out.
Speaker 1 (01:27):
Excellent, and just a testament to your work. I first discovered your work on EC2 pricing when I was searching for the price of an instance and the equivalent Intel instance on AWS, and your site was the first search result that actually gave me the answer very quickly
(01:47):
. If I went through AWS' website, there's a little bit of extra clicking, and you have to go through a little bit of marketing on top. But your site was super easy to sift through and do comparisons with, and very discoverable as well. So definitely props to you, and I'm sure everyone's come across your EC2 pricing page as well.
Speaker 2 (02:09):
Yeah, I appreciate you saying that. The page has actually been around for almost 12 years at this point, so it was really developed by the community of AWS developers back in the day, because there aren't too many places where you can get the specs of an instance alongside its pricing. It's been a great privilege to take that
(02:30):
over and maintain it for folks, and hopefully improve it as well. So yeah, I appreciate the shout-out.
Speaker 1 (02:36):
Yeah, definitely. So everyone appreciates that page on AWS instances, and it's great to hear it's open source and that the community is helping maintain it. The reason you're on the pod today is you have a super awesome blog post that's making the rounds. It's titled Microsoft Fabric: Should Databricks Be Worried?,
(02:57):
and it's essentially a comparison between Microsoft Fabric and Databricks. I wanted you to break down that topic a bit more, but at a high level, can you just describe Databricks and Microsoft Fabric for the listeners?
Speaker 2 (03:09):
Yeah, absolutely. So Databricks really got its start as a managed Apache Spark provider, and there are many layers to this. But Apache Spark is a modern version of a Hadoop-style workflow, where you have a ton of data that you're trying to process and you distribute the processing. Let's say
(03:34):
it's a complicated SQL query: you distribute that out over a cluster of compute nodes, and Apache Spark orchestrates which nodes do what and how the results should get combined. That way of doing things was pioneered around, you might say, the 2008 to 2010
(03:56):
era, and was made open source as an Apache project. But actually managing the infrastructure to do that was pretty challenging, and so Databricks started as a commercial version of Apache Spark. Now it does an enormous number of things. It has notebook features, it has BI reporting, and it is
(04:22):
increasingly known for its machine learning capabilities. I think OpenAI recently talked about their massive Spark cluster. I don't know if they're a Databricks customer or not, but Databricks has become a go-to place for people to run very large data workloads and has a lot of enterprise
(04:43):
penetration. It's often compared, although not always correctly, to Snowflake as a place where enterprises can run very large queries and data tasks.
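The distribute-then-combine flow Everett describes, a SQL-style aggregation split across nodes with partial results merged, can be sketched in plain Python. This is a toy illustration of the pattern, not Spark itself; the "nodes" are just list partitions:

```python
# Toy sketch of the Spark-style distribute-and-combine flow described above.
# "Nodes" are simulated as list partitions; a real engine ships these tasks
# to executors across a cluster and merges the partial results.

def partial_count_by_key(partition):
    """Map phase: each 'node' counts rows per key in its slice of the data."""
    counts = {}
    for key, _value in partition:
        counts[key] = counts.get(key, 0) + 1
    return counts

def merge_counts(partials):
    """Reduce phase: the driver combines each node's partial result."""
    total = {}
    for partial in partials:
        for key, n in partial.items():
            total[key] = total.get(key, 0) + n
    return total

rows = [("us", 1), ("eu", 2), ("us", 3), ("ap", 4), ("eu", 5), ("us", 6)]
partitions = [rows[0:2], rows[2:4], rows[4:6]]  # 3 simulated nodes
result = merge_counts(partial_count_by_key(p) for p in partitions)
# Equivalent to: SELECT region, COUNT(*) FROM rows GROUP BY region
```

The orchestration burden Everett mentions is everything this sketch leaves out: scheduling the map tasks, moving data between nodes, and retrying failures, which is what managed Spark providers sell.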
Speaker 1 (04:57):
Absolutely, that's a great overview. Certainly a lot of enterprise representation among Databricks users. I did present at Data and AI Summit with American Airlines, for instance, and they power a lot of core operations through it, and there are so many examples of that. One other interesting thing is that Databricks is an Azure
(05:17):
first-party service: there's Azure Databricks, which runs in Azure but is also sold by Microsoft, which is interesting, and you break down that part in your blog as well. So, now that we know a bit about Databricks, what is Microsoft Fabric?
Speaker 2 (05:36):
Yeah, and just to hit on the relationship between Databricks and Microsoft, because it powers a little bit of the intrigue of the post: Databricks and Microsoft have one of the deepest integrations that I have seen between a cloud infrastructure provider and a cloud service. When you are on the Databricks docs page and you
(05:59):
click over to Azure pricing, it actually redirects you to the Microsoft docs page for Databricks. Similarly, at the company I work for, Vantage, we have a Databricks cost integration, and in order to access Databricks costs you have to connect Azure cost management APIs, which is very unique.
(06:22):
In AWS, for example, to access Databricks costs you access an S3 bucket that is provisioned by Databricks. So Databricks and Azure have this very deep integration and very long history, and some of the intrigue of the post is that Microsoft has, in my view, launched what amounts to a direct competitor to Databricks: basically an
(06:49):
offering that Azure users could consider if they were trying to do Apache Spark or Databricks-style workloads. It works very similarly to Databricks. It has some of the lakehouse features, which we can talk about, that Databricks has. It provides a unified compute runtime across notebooks and
(07:12):
SQL queries, and it integrates Power BI and Synapse, which, for folks that have been in the Microsoft ecosystem for a long time, are most likely the tools they're using for a lot of their data needs. The gist of the post is that Microsoft has enabled their users to have a Databricks-style experience.
(07:33):
There's a lot of technicality there. In many ways the experience is degraded from what Databricks offers. It doesn't have a lot of the same machine learning features, and it has a much different pricing model, which could be better or worse depending on your point of view. But the long and short of it is,
(07:53):
I do believe there's a real choice now between Azure Databricks and Microsoft Fabric, and it's surprising in some ways to see that, just because that partnership has been so deep for so long.
Speaker 1 (08:05):
Yeah, and just to dive into that point: for those of you watching this on YouTube, you'll see the feature comparison, but for those of you listening on a podcast, we're pulling up Everett's blog here, and essentially it's a feature matrix. There's a list of about ten or so features, and Fabric
(08:26):
has all but two of them. If you go down the list, SQL, Python, data science, notebooks, managed Spark, data engineering, serverless SQL, MLflow, at a high level Databricks and Fabric both support them. The ones that Databricks has that Fabric does not have at this point are just Delta Live Tables and model serving.
(08:46):
So this really illustrates what you're describing, and this is why you're saying that Fabric can be a competitor to Databricks.
Speaker 2 (08:56):
Yeah, that's right. There was some discussion on LinkedIn about the actual Spark implementation, and this is kind of an interesting side topic, where Databricks has all sorts of optimizations on top of Spark, including this thing called Photon, which is an optimized runtime that specifically executes on vectorized processors, like
(09:20):
Intel AVX, that Microsoft does not currently offer. But I do think, at a surface level, that the two services are directly similar, and what it may come down to is really just quality of the product offering. I think there are a lot of doubts about whether Microsoft, in the way that they kind of cobbled together Power BI, Synapse,
(09:42):
and Azure OneLake, as they're calling it, into this one platform, can actually deliver that same unified, powerful experience that Databricks users currently have. But our experience at Build, which is one of Microsoft's main developer conferences, where Fabric was announced, was that within the Azure ecosystem those folks may not
(10:07):
care as much whether some of the more advanced stuff, like Delta Live Tables, is supported if they get access to this Databricks-style way of doing things. So to me it's a real question, and you might even compare it to, and I don't want to go too far, but when Microsoft launched Teams,
(10:28):
that really did eat into a lot of Slack's market share, and it's possible that a similar thing could happen here. That said, the conclusion of the post is that when you look under the hood a little at each implementation and what features are available, Databricks, I think, is a pretty clear winner. But just the fact that the Databricks-style way
(10:54):
of doing things is now available in a native Microsoft fashion is a big step forward, and maybe enough to win customers that Databricks would otherwise kind of have had to themselves fully. And there's always nuance in enterprise software.
Speaker 1 (11:08):
I mean, you could take two products that seem like a total apples-to-apples comparison, and it turns out they're for different use cases. I think companies always have to directly evaluate which one is best for them. But another part that you dive into in your blog post, which is actually a pretty big topic, is the pricing. So how does the pricing of Databricks compare to Microsoft
(11:31):
Fabric?
Speaker 2 (11:37):
Yeah, so in both cases there's really not a great way to estimate the costs ahead of time. With Apache Spark, just by the nature of its execution, it's sort of up to the runtime how long things will take, what gets executed, and so forth. What Microsoft has done is combine all the pricing related to the infrastructure needed, with one exception, which is storage.
(12:00):
They've combined all this into a SaaS model where there are tiers of capacity that you can reserve. So you can reserve a certain number of what they call Fabric SKUs, which is basically an amount of compute that can be used for your jobs. Let's say you have a SQL query that runs for an hour; maybe it consumes, say, 48 capacity units or
(12:20):
something, and that's charged in a SaaS, consumption-based model. Databricks has an extremely different pricing model, which is that Databricks deploys within your own cloud infrastructure. So as a user, I pay for the instances or the VMs that I'm
(12:41):
running, I pay for the networking costs and other infrastructure associated with that, and then I have a DBU cost, which is a Databricks management fee, if you will, their fee for orchestrating the SQL query, and that I pay directly to Databricks.
(13:01):
And so Microsoft's argument is that their pricing is simpler. It's one tier, and you kind of scale up and down. What we believe to be true is that it may actually be too simple in some ways. In fact, there are some interesting quotes from Microsoft's
(13:23):
documentation on Fabric where they say the best way to understand your costs is to just run some workloads and estimate from there. They don't have a good calculator, or a good way to tell ahead of time which tier you're going to be in. And so it's my belief that many data teams would actually prefer to have a greater degree of control over which
(13:45):
infrastructure is used and where the costs go, which Databricks provides. But again, if I'm an Azure user, I may appreciate the simplicity of the Fabric model. It is one of those models, though, where I would say surprise costs, surprise bills, are potentially more of a concern
(14:08):
than in Databricks' case.
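The two pricing shapes contrasted above can be put into a rough back-of-envelope sketch. All rates below are hypothetical placeholders, not actual Microsoft or Databricks prices; the point is the shape of each model, not the numbers:

```python
# Back-of-envelope comparison of the two pricing shapes discussed above.
# All rates are HYPOTHETICAL placeholders, not real Microsoft or Databricks prices.

def fabric_cost(cu_hours, rate_per_cu_hour=0.20):
    """SaaS model: one blended charge for consumed capacity (storage billed separately)."""
    return cu_hours * rate_per_cu_hour

def databricks_cost(vm_hours, vm_rate, dbu_per_hour, dbu_rate):
    """Two-part model: cloud infrastructure in your own account plus a per-DBU fee."""
    infra = vm_hours * vm_rate                      # paid to the cloud provider (VMs, networking, ...)
    platform = vm_hours * dbu_per_hour * dbu_rate   # the management fee paid to Databricks
    return infra, platform

fabric = fabric_cost(cu_hours=48)
infra, platform = databricks_cost(vm_hours=8, vm_rate=0.50, dbu_per_hour=2.0, dbu_rate=0.30)
```

The trade-off is visible in the shapes themselves: the Fabric sketch has one lever, while the Databricks sketch exposes three (instance choice, DBU consumption, DBU rate), which is exactly the extra control, and the extra complexity, discussed here.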
Speaker 1 (14:11):
Makes sense. For those of you on YouTube, you see that we're sharing it, but on the podcast you'll just have to listen in. Databricks, as you mentioned, Everett, does have this granular cost per DBU on Azure across all these services, like model serving, Databricks SQL, jobs compute, et cetera. So, like you said, it's more flexible and gives you a little
(14:34):
bit more control around what services you want to use, and you can do some more fine-grained cost optimization. Whereas in Azure Fabric, they're just giving you these SKUs, and the SKUs sound like they have all the features included. They
(14:54):
tell you how many CUs you have, Power BI v-cores, Spark cores; all of that's kind of thrown in with a flat hourly cost. But it's up to the user to try it with their workloads and see, is it meeting the performance and data latency
(15:15):
requirements? I've evaluated a lot of cloud software as well, and ultimately I think that's pretty standard now. But I've done evaluations of several cloud data warehouses, and with everything we did with a self-serve compute model, the costs do change over time, over adoption.
(15:38):
It's hard for teams to estimate the one-to-three-year usage with these cloud usage-based pricing models, and every year you sort of have to course correct, renegotiate, and see what's actually being deployed. Now, if we were to just simplify this for the people
(15:59):
listening, would you say, okay, between the two, who has the simpler pricing model?
Speaker 2 (16:06):
So it's a harder question to answer. On the surface, Fabric is simpler, because there's a set of tiers and everything is combined into one. In my view, that actually makes it more difficult, because, for example, with Databricks, if I'm running compute jobs, I
(16:26):
know that that is separated out from the Delta Live Tables jobs I'm running, or I know that it's separated out from notebooks. With Fabric, you might have a data scientist who has notebooks running these complex queries, and you might have a whole separate team or set of users that is running individual analytical queries or Python
(16:49):
jobs or machine learning jobs. All of that is mixed into one. So if I'm sitting at the Fabric console, I'm just seeing one cost that's coming through, and for some teams that might be preferred. But it's a mixture of everything, and so there's less of the ability, which Databricks offers, to
(17:11):
say, look, 60% of our costs are machine learning, 30% are data science, and 10% are model serving; I can go in and further optimize, I can change the instances, and so forth. So Fabric's combination of everything into one set of SKUs and one tiered pricing system is simpler if, at the end
(17:35):
of the day, that is the preference of the company. But in my view, kind of living in a cost optimization and cost visibility world, I tend to prefer the Databricks model, where different types of workloads have different pricing and there's greater granular control over which instances and what infrastructure are used to run that stuff.
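The kind of per-team breakdown described above (60% machine learning, 30% data science, 10% model serving) falls out of a simple tag-and-roll-up. Here is a minimal sketch with made-up line items and team names:

```python
# Minimal cost-allocation rollup of the kind described above: tag each
# workload's spend with its team, then report each team's share of the total.
# Teams and dollar figures are invented for illustration.

def allocate(line_items):
    """Sum spend per team, then convert to whole-number percentages of the total."""
    totals = {}
    for cost, team in line_items:
        totals[team] = totals.get(team, 0.0) + cost
    grand = sum(totals.values())
    return {team: round(100 * spend / grand) for team, spend in totals.items()}

spend = [(600.0, "machine-learning"), (250.0, "data-science"),
         (50.0, "data-science"), (100.0, "model-serving")]
shares = allocate(spend)  # percent of total per team
```

The granularity argument hinges on the second element of each tuple: a per-workload pricing model gives you that tag for free, while a blended SKU gives you only the grand total.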
Speaker 1 (17:56):
It totally makes
sense.
The flexibility is always goingto be something you can
leverage to tune your costs overtime.
Now, that's comparing the pricing models, but what about the overall costs? Which one do you expect to be lower, if you were to pick one or the other?
Speaker 2 (18:14):
So again, it's hard to say. I do think that many Azure shops will find that Fabric probably offers savings versus Databricks. There are a couple of reasons this is the case. One main one is that you can use your Power BI Premium capacity, which is a little bit of the previous model of running
(18:37):
Power BI jobs, to pay for Fabric, and compared to those capacity reservations, Fabric is actually more of a finer-grained, consumption-based model. So for teams that are maybe ending every year with leftover
(18:57):
Power BI capacity, or have a lot of unused capacity for their existing Microsoft workloads, I actually do think they will get a lot of value out of Fabric and probably see cost efficiencies as a result. Another example of a Microsoft or Azure user in that situation would be a team that's running multiple
(19:18):
different types of Synapse workloads, Synapse being kind of the Azure data warehouse solution alongside Power BI. Those teams should probably see some good savings too. The folks that are going to see cheaper, more cost-efficient workloads on Databricks are people with the ability to
(19:39):
optimize the jobs that they're running, perhaps change the instance types that they're on, and really control the amount of data that's used. So teams with a detailed understanding of what their workloads are, and perhaps less of an inclination to run a lot of experiments, are going to see cost savings on
(20:02):
Databricks. So yeah, that point is one reason why I do see this as a little bit of a threat to Databricks, where they might otherwise be able to win some Azure workloads. Now they're competing with teams who are saying, well, if we migrate the Synapse and Power BI things that we're doing to Fabric, we're actually going to see cost savings and we don't have to change the existing tool set
(20:25):
that we're employing.
Speaker 1 (20:27):
Do you see a case for
using both Azure Databricks and
Microsoft Fabric?
Speaker 2 (20:32):
So Microsoft says yes. They have a couple of blog posts where they talk about using Fabric with Databricks, and there's one case in particular that they highlight, which is where you have data that's kind of siloed between your different tools. One great feature of Fabric is that it combines the data
(20:56):
storage from Power BI, from Azure Blob Storage, and from other data lake solutions available on Azure into one unified pool, and you can connect that pool directly to Databricks. And so the use case that Microsoft talked about, which seems very legitimate to me, is that by unifying your existing
(21:19):
Azure data workloads inside Fabric, you can then connect all that data to Databricks and use Databricks features to run workloads on it. So that makes sense to me. Outside of that kind of unified OneLake situation, I'm not sure that a lot of teams would use both.
(21:41):
I'll make one exception to that, which is the machine learning side of things. It definitely appears that Databricks has a strong lead here, and I would think a team that has a continuous data pipeline, or maybe is continuously training new models, would want to employ Databricks, and they may end up using Fabric for more of a traditional data warehousing
(22:04):
solution.
Speaker 1 (22:05):
It definitely makes sense. And Databricks itself, there's no question, is a platform in the true sense, where it's very flexible: you can do general-purpose data processing, compute, ETL, machine learning. Microsoft Fabric is also a platform, but with the Power BI aspect it seems closer to
(22:28):
the business intelligence side. Maybe it's business intelligence plus, where it's including ETL and simplifying a lot of the workflows with the BI use cases in mind. Would you say that's a good assessment?
Speaker 2 (22:46):
Yeah, absolutely. Power BI is strong there, and actually this is an area where Databricks is punching back, if you will. Their most recent major announcement, in my view, was an expanded set of BI and dashboarding capabilities, and in some ways those features make Databricks look a little bit more like a Tableau, just looking at the UI for it.
(23:07):
So the short answer is yes. For people moving from a more traditional BI and reporting place into a more modern distributed-processing Spark place, going through Fabric to do that transition seems very logical to me.
(23:27):
It'd be interesting to actually know from Databricks how they sell against Fabric in that scenario, but that does seem to be how Microsoft has positioned it.
Speaker 1 (23:39):
Totally makes sense. And who would you say the real winner is from the fact that Microsoft launched Fabric? Is it Microsoft?
Speaker 2 (23:50):
The thing to appreciate, even with Azure Databricks, is that the infrastructure to run Databricks is being run in the customer's account, and so that's generating Azure costs and revenue for Azure even while that's happening. With Fabric, not only is the infrastructure being run in the customer's account, although you don't see that in the pricing
(24:11):
model, but the equivalent of the DBU fee, the Databricks fee, is also going to Microsoft. So there's no doubt that this is a competitive offering. It's something that I think Databricks should at least look closely at. I don't necessarily expect Microsoft to win a lot of bake-offs
(24:34):
with Databricks right now, but over time, if they keep investing in the platform, and they have an entire microsite and a set of developer relations people dedicated to it, I think it does pose a real threat. I also think it may help Microsoft retain users who ultimately were going to be pulled to Databricks, just
(24:56):
because the Databricks way of doing things, which Fabric is mimicking, does seem to be kind of the new way of the world for these larger enterprise data stacks.
Speaker 1 (25:10):
Totally makes sense. And the other thing that I'm seeing is that in 2023 it's all about efficiency, whereas in 2019 and 2020 we were talking about the growth of the modern data stack and implementing the modern data stack: what are all the tools you should adopt? So would you say that right now we're on this wave of cost reduction
(25:32):
across the board?
Speaker 2 (25:34):
Yes, and the quarterly numbers from the major cloud providers, which we also do some analysis on in these cloud cost reports, seem to corroborate that. I think the modern data stack in particular is going through its own kind of mini optimization wave, the same
(25:57):
way that cloud infrastructure in general is going through that. I think Databricks actually may be more immune to it than a solution like Snowflake, because Databricks offers customers a lot of flexibility around how the workloads get run, which instances are chosen, and so forth. But still, you have the situation where one of Microsoft's main
(26:17):
selling points for Fabric is that it's a more cost-effective way to operate data workloads on Azure. So yeah, the short answer is yes. I don't think we're seeing as much of an expansion of the modern data stack suite of tools as we were. That said, there's one counter-trend to this, which is
(26:41):
the emergence of AI. Every AI workload is powered by huge amounts of data, particularly data that needs to be trained on and changed a lot. So if there's one maybe saving grace for the modern data stack over the next year or two, it's just the fact that clean data is almost everything with these models, and
(27:06):
Databricks seems very well positioned for that. They acquired, of course, MosaicML, which is part of that story. So, in general, for traditional workloads, optimization is king, but AI may be a counter-trend that lets the party continue, if you will.
Speaker 1 (27:27):
Absolutely. Cost reduction is one area, but AI is another big category that's getting a lot of attention and a lot of investment. I've seen this as well, where data teams are being asked to prioritize AI in their short-to-medium-term roadmaps. There are all these questions now in enterprises: who owns the AI?
(27:51):
Is it going to be its own separate function? Data teams are saying, look, whoever you bring in for AI has to rely on us to actually get the data and to actually operationalize it, whether that's capturing the embeddings, launching the vector databases, serving the models, et cetera.
(28:12):
A lot of data teams view AI as adjacent to the work they're doing, adopting it as more of an evolution than a revolution and extending their functions. I think cost reduction comes into play there too, because if you're spending so much money just on ETL, just processing
(28:36):
data with high-priced compute, where's the budget for AI going to come from? So I'm seeing a lot of teams optimize compute on some of the more basic data transformation tasks so they can actually invest in AI, which is its own significant area, both in terms
(28:59):
of people and the underlying cloud infrastructure that you have to deploy.
Speaker 2 (29:04):
It's kind of a budget question, right? If I'm doing my 2024 budgeting, which many people are doing right at this moment, do I just have a budget category that's for AI? And, as a data practitioner, maybe I can sneak some budget for data tools into that category, even if
(29:25):
those tools serve other functions of the business. I think it's actually a great opportunity for teams to, in some ways, reset and remake their arguments. There's no doubt, especially if you believe the foundation model idea, that many companies will rely on the data
(29:45):
they have more than on their ability to invent new types of transformers. In that sense, the data teams and the work they do with these tools are going to be the key, maybe even more so than the types of work that a machine learning researcher at OpenAI would be doing, which would be a little
(30:06):
bit more focused on the mechanics of the model versus the data that is provided to it. My sense is that a lot of companies will, in fact, make investing in AI a priority, and that budget should trickle down into the data teams, given that they play such a primary role.
Speaker 1 (30:27):
Absolutely, everett.
This is an area that you workin heavily.
Vantage is, of course, a leaderin cloud savings and automating
those savings.
What's your general advice todata teams on FinOps and making
sure that they're within theconfines of their budget but
still hitting their actual goalsin terms of business and data
(30:50):
projects?
Speaker 2 (30:52):
We have a couple of tools that look at this. One that's popular is top Snowflake queries by cost. On the data side, really, the key thing we're seeing is more around visibility and cost allocation. Data teams typically have a lot of internal customers, and
(31:15):
those internal customers generate different levels of costs. You can imagine the revenue reporting team, or the folks dealing with customer data requests; they may generate an outsized share of costs versus other internal stakeholders. I think the biggest thing is actually just being able to say, look, we have a six-figure spend with Snowflake or Databricks
(31:38):
this year, and portions of that are being allocated to these stakeholders. Generally, I've seen the argument made that if a certain stakeholder or leadership-type person is using up a lot of that data budget, they're going to be pretty okay with
(31:58):
maintaining or expanding it, versus the idea that the data team is just doing all these experiments that are generating cost with no direct tie-back. There are tools out there, and we have some of them, around optimizing the operation of the warehouse or of the Databricks installation.
(32:20):
But the number one thing that I've seen be successful for teams is just having greater visibility and allocation into where that spend is going. Then typically there's a clear ROI argument to make: there's a case that you really are advancing the business via this contract, and your Snowflake budget might be this, but the
(32:45):
results you've generated are ten times that. So we stress more of the allocation and visibility side. But it is an interesting question, and actually the Fabric and Databricks piece gets into this: maybe I do have more ability to optimize my infrastructure with Databricks versus Fabric, and therefore that's a decision point in which tool I go with.
Speaker 1 (33:08):
Is there anything exciting that you're working on at Vantage that you'd be willing to share with the listeners?
Speaker 2 (33:13):
Yeah, I appreciate the question. The major initiative we've got rolling as we sort of wrap up 2023 is what we call FinOps as Code. This is a set of tools and techniques to automate the process of reporting on costs and allocating costs across the organization. The brass-tacks, tactical explanation of what it is: it's a
(33:40):
set of Terraform modules which allow you to provision your cost reporting in the same way that you provision your main infrastructure. For example, you spin up a new data service for a new application, and right next to the Terraform that provisions that infrastructure, there's Terraform that actually creates cost reports in
(34:01):
Vantage and creates things like tagging and filters alongside the infrastructure being deployed. In general, one of the main challenges of FinOps is just keeping the reporting up to date with the changes that are happening across the company's infrastructure. Imagine a scenario where there are a thousand or more
(34:23):
engineers and things are being spun up and down every day. Generally, infrastructure as code, be it Terraform or Pulumi or CDK, is the way companies manage that. We're trying to bring some of the FinOps workflows that exist in Vantage directly into those modules, so that teams can
(34:47):
automate the bookkeeping that's associated with keeping their costs up to date. I think this is exciting for FinOps practitioners too, because it means they can spend less time chasing down tagging and trying to figure out just what's happening in the infrastructure, and more time on the actual optimization and executive-level reporting tasks that they have
(35:09):
to handle. We're very excited about that. It's also cool to me, and I haven't dealt with a ton of products where you can directly provision UI elements of the product from Terraform the way you can with this FinOps-as-code approach. I'm really excited to see that hopefully spread throughout the
(35:31):
industry and in some ways modernize the chores and a little bit of the drudgery of FinOps, and automate big swaths of that so people can spend most of their time on the value generation side, which is definitely optimization, and then allocation of costs and things like margins and showing
(35:55):
back ROI and things like that.
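The core idea described above, deriving the cost report from the same definition that provisions the infrastructure so reporting cannot drift out of sync, can be sketched abstractly in Python. The field names and structure here are hypothetical illustrations, not Vantage's actual Terraform schema:

```python
# Toy sketch of the "FinOps as Code" idea: one definition yields both the
# infrastructure spec and its matching cost report. Field names are
# hypothetical illustrations, not Vantage's actual Terraform schema.

def provision_with_report(service_name, team):
    """Return an infra spec and a cost-report spec derived from the same inputs."""
    infra = {
        "resource": service_name,
        "tags": {"team": team},           # tag applied at provision time
    }
    report = {
        "name": f"{service_name}-costs",  # report created right alongside the infra
        "filter": {"tag:team": team},     # scoped by the same tag, automatically
    }
    return infra, report

infra, report = provision_with_report("orders-data-service", "data-platform")
```

In real Terraform, the analogous move is a module that declares the cost-report resource next to the compute resource and shares the same tag variables, which is what keeps the bookkeeping from going stale as infrastructure changes.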
Speaker 1 (35:57):
Excellent. Yeah, that is super exciting, and it sounds like it really simplifies and adds a lot of flexibility to the cost savings process for data and ops teams. Everett Berry, growth at Vantage.sh, great having you on the show today. Where can people follow along with your work?
Speaker 2 (36:14):
Yes, I'm on Twitter, as many of us are; it's at RedX with three T's. This particular post did well on LinkedIn too, so you can find me on there. In general, I spend most of my days chatting or conversing about cloud infrastructure and really get a lot of joy out of
(36:35):
that. I would love to keep the conversation going with anyone who's interested.
Speaker 1 (36:41):
Excellent, Everett. Thanks so much for graciously sharing your insights through your blog post and hopping on the podcast today, and thank you to everyone who tuned in.
Speaker 2 (36:50):
Thanks, John.