Episode Transcript
Available transcripts are automatically generated. Complete accuracy is not guaranteed.
(00:11):
Hello, and welcome to the Data Engineering podcast, the show about modern data management.
Data teams everywhere face the same problem. They're forcing ML models, streaming data, and real time processing through orchestration tools built for simple ETL.
The result: inflexible infrastructure that can't adapt to different workloads.
That's why Cash App and Cisco rely on Prefect.
(00:33):
Cash App's fraud detection team got what they needed, flexible compute options, isolated environments for custom packages, and seamless data exchange between workflows.
Each model runs on the right infrastructure,
whether that's high memory machines or distributed compute.
Orchestration is the foundation that determines whether your data team ships or struggles.
(00:53):
ETL, ML model training, AI engineering,
streaming, Prefect runs it all from ingestion to activation in one platform.
Whoop and 1Password
also trust Prefect for their data operations.
If these industry leaders use Prefect for critical workloads, see what it can do for you at dataengineeringpodcast.com/prefect.
Are you tired of data migrations that drag on for months or even years? What if I told you there's a way to cut that timeline by up to a factor of six while guaranteeing accuracy?
(01:22):
Datafold's migration agent is the only AI powered solution that doesn't just translate your code. It validates every single data point to ensure perfect parity between your old and new systems.
Whether you're moving from Oracle to Snowflake, migrating stored procedures to dbt, or handling complex multisystem migrations, they deliver production ready code with a guaranteed timeline and fixed price.
(01:43):
Stop burning budget on endless consulting hours. Visit dataengineeringpodcast.com/datafold
to book a demo and see how they turn months long migration nightmares into week long success stories.
Your host is Tobias Macey, and today I'm welcoming back Nick Schrock to talk about building an AI analytical system that keeps data teams in the loop, in the form of Compass. So, Nick, can you start by introducing yourself for people who haven't heard any of your past appearances?
(02:09):
Yeah. Sure. And thanks for having me, Tobias. It's always a pleasure being on. So, yeah, briefly, I'm Nick Schrock. I'm the CTO and founder
of Dagster Labs, which is the company behind
Dagster, which is an
open source
data orchestration platform,
and Dagster+, which is our commercial hosted product on top of that, and now an additional product, which is called Compass,
(02:32):
which I'm super excited to talk about. Yeah. Before that, I cut my teeth at Facebook engineering, and the thing I was best known for was being one of the co-creators of GraphQL.
So that's kind of my story.
Founded Dagster
in 2018,
so a while ago now, but really got the company off the ground in 2019, when I hired my first employee. And we've been
(02:55):
working really hard for a long time and have an at scale open source project and a really healthy commercial business and looking forward to many more years
of success.
You've been running Dagster for almost as long as I've been running this podcast.
It's true.
Your podcast is actually one of the major ways I got up to speed on the domain. In particular, the episode you did about DataOps with Chris Bergh. That was, like, kind of a real unlock for me. So I feel like you and I have kind of been on the journey together in some ways.
(03:27):
Absolutely. And it's been a crazy ride over the past,
what, eight years now. Yeah.
So I guess the next stop in that ride is Agentic Systems.
And so because you're working in the technology space, you're obligated to build an Agentic system. So I'm wondering if you could just give a bit of an overview about
(03:47):
what your thoughts are on the application of
agentic systems to data analysis
and some of the ways that you thought about the approach to Compass that keeps data teams in the loop without just leaving them on the sidelines
and letting the AI run rampant over all of their hard work.
Yeah. It's been a fascinating
(04:07):
journey,
actually.
I think both I and
Dagster Labs as a company have been fairly conservative when it comes to
AI and agentic systems up until now. You know, last summer, I wrote this piece,
a blog post about
what I called the rise of medium code and the properties a software system
(04:29):
needs to have to be an amenable target for AI codegen. And it really focused
on minimizing slop
and limiting the technical blast radius so the AI can't do much damage to your system, and
all that. So I've always thought about it in those terms,
but I was always
a little skeptical about how good the agents could get. And the progress has really exceeded my expectations in the last year. I didn't realize it at the time, but I think a huge release was the one in February
(05:03):
where
Anthropic released Sonnet 3.7, I think it was, and Claude Code in the same release.
And those two things were a simultaneous
innovation at the model layer, but also, I think maybe even more importantly, at the application layer over that model layer. And that kind of moment
(05:23):
and the period right afterwards was a huge wake-up call for me that was like, oh,
these systems
are super ready for prime time now if you apply the right tools and techniques.
And that momentum has really been building. Actually, you know, in June,
this term context engineering kind of became part of the ether.
(05:47):
Tobi Lütke popularized it, and then it was canonized
by Karpathy.
It's kind of a rebrand of prompt engineering, but it describes how you can programmatically
inject the right context in the right place at the right time to the right model.
And that mentality
(06:07):
really clicked with me in terms of, like, oh, you know, me,
the lowly product and infrastructure engineer who doesn't know the higher-order math necessary to build a foundation model, I can really participate in this in a super first-class way. So that's kind of the context of my journey. At this point, I'm very much AI-pilled, I would say, in that I believe we are witnessing the simultaneous
(06:33):
disruption
of multiple layers of the stack
in a way that we've never experienced before. So AI is revolutionizing
the way we build software, the way we structure infrastructure,
the way our stakeholder relationships work,
and also
the consumption layer. Right? ChatGPT has been, like, a dramatic change in the way that we interact with computer systems.
(06:57):
And that has not really reached the enterprise at all either, which is a very interesting topic of discussion.
So I think we're in the midst of this massive
wave. And, you know, you kind of alluded to it: okay, you're in software, so you have to be thinking about agentic AI.
And, unfortunately,
kind of that is true. And I think some people are more obnoxious about it. But it's kind of like
(07:20):
ignoring agentic AI would be like ignoring the Internet
in the nineties, you know, and not really thinking about how that's gonna impact your system.
And, you know,
the Internet completely revolutionized all domains of computing. Right? Even the ones that weren't putatively affected, you know? It kinda started with consumer, but then all our infrastructure changed too. So I think this is a similar
(07:43):
wave. So I think it's extremely
exciting,
and I'm just really enthusiastic about the future.
And to help frame
the rest of the conversation,
can you describe
the scope and purpose of Compass and some of the problems that you're trying to solve with it?
Yeah. Without going into features or technologies,
(08:05):
I think the
the highest level the problem we are trying to solve in sort of human terms
is
to completely
restructure
the relationship
between a data platform team and their stakeholders.
Meaning that kind of right now,
I think that data teams feel like they are cogs in a machine,
(08:28):
that they are cost centers,
that they are there to do a job. Business stakeholders ask for data. Business stakeholders ask for dashboards. But then you're kinda disconnected from those business users because
your work is intermediated by these tools,
which are often not that pleasant to deal with. Like, you know, BI tools being an example. I often joke that the BI category feels like it was invented by Dostoevsky
(08:54):
because, like, all BI tools are terrible, but they're all terrible in their own ways. And so what we really want to do is, you know, you mentioned the term self-serve. Rather than think about it as complete self-serve, we wanna redefine the relationship between the stakeholders so that the data team is
collaborating with the business stakeholders in real time in a highly positive way, where instead of being viewed as a cost center, they are the face of the value. And so they're collaboratively working
(09:25):
with their stakeholders, and they're empowering way more of those stakeholders.
Now that is the problem we're trying to solve. And in the end, that means
much more accessibility
to data, and you can leverage your data platform to do more things in the organization, thereby increasing its value. Okay, so that was a whole lot of stuff that I just talked about. But we're redefining the stakeholder relationships such that the perceived and real value of a data platform is higher in the organization.
(09:54):
Now how do we do that? You know, Compass looks fairly
innocent
at first blush.
It is a Slack native
experience
where you can interact with your data in natural language.
It's processed by AI, which sort of acts like a junior analyst, and it can interrogate your data warehouse and do interesting analyses.
(10:18):
That is the user experience. But it ends up
being fairly, I would say, transformative, dare I say, revolutionary,
and it has been internally
because you have the stakeholders
interacting with both this agentic tool, the analyst, but then,
because there's an agent analyst, the data team members that are in that Slack thread with the business stakeholders,
(10:44):
they're no longer analysts.
They act much more like data stewards.
They're, like, guiding the user to do the correct analysis, and then they manage
the context.
And they manage the context store so they can govern the AI in a very scalable
fashion. And then the business stakeholders, they never leave Slack. There's data vis right there. The stakeholders
(11:08):
can make data requests.
They can
request context corrections. Often, the AI just figures out how to do that for them. So they plug into all sorts of workflows. They can schedule these analyses on a regular basis.
And in our demos, we almost brag:
you're not gonna see a web UI during this entire demo, and that is super deliberate. Because who wants to learn another web UI? You get bounced to a web UI, you have to auth into it, you have to learn a completely new information architecture.
(11:37):
Right? You have to learn completely new concepts, and it bounces you out of your collaborative zone. You can no longer, like, @-mention people, etcetera, etcetera. So, yeah, that's kind of the approach. It's a Slack-native
natural language analytics experience that is collaborative,
governed, and driven by AI.
(12:00):
So the common challenge when dealing with all of these agent-based systems, as you already pointed out, is this challenge of context engineering.
And the alchemy of turning
raw bits into useful information is
the entire purpose of data engineering, to be very reductive about it. And so those two things are
(12:23):
in tension, because
the data engineer doesn't want to
forgo
their purpose and hand it off to an AI,
particularly if they have any pride in their work, because they know that the AI isn't going to do an appropriate job of understanding
the business needs, the business context, all of the hard-won knowledge that they've already encoded into the data assets that they're building. Whereas
(12:46):
the consumers of that data, to your point, don't wanna have to deal with learning all the information architecture. They don't want to have to dig through all the docs or go through all the pipelines to really understand what it is that they're actually looking at. And so I'm wondering if you can just talk to some of the ways that you're thinking about that context management
and the handoff between
the data teams who are doing all the hard work of bringing all this information together and hydrating it with that business context
(13:12):
and the ways that the agentic analyst is able to actually retrieve and interact with that context to be able to understand
how to map the probably very vaguely worded request from the stakeholders
into a concrete plan of action and means of discovery and enumerating
all of the information that's required to be able to fulfill that request.
(13:35):
Right. So there's a lot there. Just trying to tease this apart here.
So I guess,
dispositionally,
and because I also think it's how the world
works, I think of the AI as a bicycle for your brain as opposed to a replacement
for it. And in some ways, actually, almost in every way,
(13:56):
these AIs make judgment and taste
that much more leveraged.
Because if you have good taste and judgment,
you can get the AIs to do an extraordinary amount of work on your behalf
that's high quality. But if you don't have that, then, you know, I call it a technical debt superspreader. It can copy bad patterns, and it can go off to the races and hallucinate
(14:18):
lord knows what. So that's kind of my starting place is that we build tools
that keep humans in the loop, that are governed, and that accelerate
and amplify
the work of subject matter experts rather than sort of eviscerating it. So that's kinda my starting point.
(14:38):
So where do you want me to take it next? That's, like, kind of the high-level philosophical take.
So I think the interesting bit is I'll admit to everybody listening, I've already witnessed the demo of Compass, so I already
know a lot of the details of how this is operating. So these are somewhat leading questions and a little bit of inside baseball. But my understanding,
the Compass utility relies on a repository
(15:01):
of context artifacts for being able to
understand how to map some of the semantics of these analytical requests onto the actual data assets that are available, with those data assets, at least in its current formulation, largely being restricted to a data warehouse environment for being able to create and execute SQL queries. And so I'm wondering if you can just talk to how you're thinking about the initialization
(15:25):
of that context repository
from that set of
tables and data assets that already exist in a manner that isn't just a lot of extra busy work on the behalf of the data team, but provides all of the necessary
information and guidance to the agentic system for being able to fulfill the requests of the business stakeholders.
(15:46):
Right. Okay. That makes total sense. So I think there is an initial step where we bootstrap the context store with as much information as possible. Setup is super easy. All you do is you plop in your data warehouse creds, and you're good to go. Now what happens
is that, for the tables that you allow us to see, we query the information schema. We get as much information as possible from there. We also
(16:13):
sample the data, and then we programmatically generate
context. I think where a lot of people go wrong with that type of step
is that they aren't thoughtful enough about producing the precise context that your application
expects.
They just, like, dump raw metadata into some context window and, like, hope and pray
(16:35):
that the agent figures stuff out. We're very much of the belief that you need to very deliberately
produce context programmatically
in a way that's guided and specific to your application. That means there's a level of precision and control. Like, I firmly believe that, increasingly,
we're gonna move from data pipelines to context pipelines,
meaning that context will be computed. It'll be computed from other context and other data, and that is data pipelining. Right? So that's one of the reasons we have an engineering approach to it. And the automatically generated data docs are kind of an initial step to do that. The second critical piece that's in the context store is... actually, I'll say this first: all of this is managed with Git, and we think that's very important.
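To make that bootstrapping step concrete, here is a minimal sketch of how a context store might be seeded from a warehouse's information schema. It is an illustration under assumptions, not Compass's actual implementation: the connection is any DB-API style warehouse connection, and the file layout and doc format are made up for the example.

# Minimal sketch: seed a Git-tracked context store from warehouse metadata.
# "conn" is assumed to be a DB-API connection to a warehouse that exposes
# information_schema (Snowflake, BigQuery, Postgres, ...); the output layout
# is illustrative only.
from pathlib import Path

def bootstrap_context_store(conn, out_dir="context_store/tables"):
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    cur = conn.cursor()
    # 1. Enumerate tables and columns from the information schema.
    cur.execute(
        "SELECT table_name, column_name, data_type "
        "FROM information_schema.columns "
        "ORDER BY table_name, ordinal_position"
    )
    tables = {}
    for table, column, dtype in cur.fetchall():
        tables.setdefault(table, []).append((column, dtype))
    for table, columns in tables.items():
        # 2. Sample a few rows so the doc carries concrete example values.
        #    (Interpolating the table name is fine for a sketch; real code
        #    would quote identifiers properly.)
        cur.execute(f"SELECT * FROM {table} LIMIT 5")
        sample = cur.fetchall()
        # 3. Render a compact, application-specific doc rather than dumping
        #    raw metadata into a prompt.
        lines = [f"table: {table}", "columns:"]
        lines += [f"  - {name} ({dtype})" for name, dtype in columns]
        lines.append(f"sample_rows: {sample!r}")
        (out / f"{table}.md").write_text("\n".join(lines))
    # The generated files are then committed to Git so they can be reviewed,
    # reverted, and bisected just like code.

The point of the structure is the deliberateness: each doc is rendered for the application's prompt format rather than passed through as raw metadata.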
(17:16):
Context occupies
this fascinating
space
that's sort of in between code and data, meaning that context is computed
just like data, but it also very directly
determines
system behavior just like code does. This is why we've kind of set on this path of having programmatically
generated context
(17:38):
checked in to a Git repository.
Because if it's checked in, you can track changes, you can revert things, you could write tooling over it to change it in place. It allows all sorts of flexibility
and precision. Right? So imagine you're really developing this thing at scale. You have evals so you can evaluate the performance of the agent. Right? You can do, like, a bisect against the context just like code, and that's a super powerful dynamic. The second piece that we capture in Git are these manual context corrections
(18:10):
that we get directly from the business stakeholders. So this is another critical piece
of the context puzzle. The first one, again, being programmatic generation
of context. And then the second piece is actually getting the information out of the brains of your stakeholders who actually know the domain and into some governed context store where the agents can utilize that to give correct results. Like, for example, at one of our early customers,
(18:36):
they use the term core
as a kind of special code word for a project that doesn't really mean core, and they flag this. You know? They're like, this has screwed up AIs before,
and they totally hallucinate because, obviously, it has its own idea of what core means in the foundation layer. So we created this context correction that very specifically laid out, like, okay, core actually means this and this, in these contexts,
(19:02):
etcetera, etcetera, and the system performed well. Now what's really magical about the way Compass works is that all of this is kind of captured either explicitly or in an ambient sense
from the interactions that are happening in Slack. So, for example, if the business stakeholder is presented within Slack with a data visualization that looks wrong, like the demo that we give is, like, some sales rep has a 90% win rate, and that never happens. So the demo is, like, you look into it, you do some investigation, you figure out that the sales rep is actually a customer success manager, and then that automatically submits that context back to the context store and says, like, hey. Don't count CSMs
(19:42):
as sales reps. And then that's checked into the context store. Knowledge is captured, and then the data team can take that and manage it very explicitly. And that's a very powerful model for bootstrapping. You know, generally, the alternative with more heavyweight systems like semantic layers is to do this upfront
process that is very burdensome and complicated, and you need to send the business stakeholder to a custom tool or a web app or something. And they never do that. So the knowledge stays captured in their head, which does no good to anyone except for them, maybe. But as we demonstrated, humans have limited context windows too. And what's good for the goose is good for the gander here. So it's important to get that out of your head and into a place where the agents can take advantage of it. And it's really this lightweight interface within Slack that's the magic in that part of the process.
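For a sense of what a governed context correction could look like on disk, here is a hypothetical sketch of the CSM example as a small record written into the Git-backed store; the field names and JSON layout are assumptions for illustration, not Compass's schema.

# Hypothetical sketch of a context correction captured from a Slack thread
# and checked into the context store; field names are illustrative only.
from dataclasses import dataclass, asdict
from pathlib import Path
import json

@dataclass
class ContextCorrection:
    term: str           # the business term being clarified
    correction: str     # what the agent should assume going forward
    source_thread: str  # where the correction came from
    approved_by: str    # data team member who reviewed it

def record_correction(c, repo_dir="context_store/corrections"):
    out = Path(repo_dir)
    out.mkdir(parents=True, exist_ok=True)
    path = out / f"{c.term.lower().replace(' ', '_')}.json"
    # Writing a small reviewable file means the data team can diff, approve,
    # or revert the correction through normal Git workflows.
    path.write_text(json.dumps(asdict(c), indent=2))
    return path

record_correction(ContextCorrection(
    term="sales rep",
    correction="Exclude customer success managers (CSMs) from win-rate calculations.",
    source_thread="slack-thread-placeholder",
    approved_by="data-team",
))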
(20:34):
One of the other aspects
of data teams who are responsible for the care and feeding of analytical systems, particularly when you're dealing with business intelligence,
is that it's often very difficult to gain any real insight into
how
the stakeholders are interacting with those systems. You might be able to have some audit logs to see how frequently people are running certain queries or dashboards.
(21:00):
But beyond that, you don't know why they're going to those dashboards, what they're doing with the information once they retrieve it. And I'm curious how you're thinking about the ways
of bringing more visibility
both to the data teams and at the organizational
level of how the overall
company is interacting with the data assets that you have and are creating and some of the ways that that can create some feedback loops to the data teams either to prune unused data assets or to understand what new data assets they need to generate to be able to fulfill the needs of the organization,
(21:35):
and in particular, how using the conversational
system of record for the company helps to provide some of that visibility?
Oh, it's a great question.
And I think I could talk for hours on this subject because it's not just what's currently happening, but I think the roadmap on this front
is super, super bright. You know, we're still early stages
(21:56):
here. So, you know, I guess, like, just the first thing that happens
is that when
the business stakeholder is in the same channel
as the analyst, and the analyst can literally see how what the business stakeholder wants gets literally compiled to SQL in the data warehouse,
(22:16):
that gap between business language and, like, your column and table names communicates
so much about what is actually happening and what people actually want. With a traditional BI tool or exploratory data analysis tool, that translation
does not exist in a format that is discernible by the analysts.
(22:37):
So I think that is the dynamic. Just the social dynamics end up really producing a ton of insight. And we are just at the beginnings of being able to use, I like your term, the conversational system of record to drive more value out of that. The initial
kind of, like, feature we have that I think demonstrates that is that we have the ability
(22:58):
to create data requests in a ticketing system automatically based on what's been happening in a specific thread. So let me give you an example of how this works. So I was asking specifically
Compass
about product analytics, and this expands beyond that, but I was asking what I thought was a very simple question, which was: how many of our customers use declarative automation, which is one of our scheduling features? It turns out our warehouse didn't really have that particular feature
(23:28):
explicitly modeled well. So what the AI did, and this was awesome and terrifying to watch, is this: we have Gong
transcripts,
meaning Gong is a system that records and transcribes
sales calls. So the AI decided
to use Snowflake's AI-based analysis features.
(23:48):
It couldn't find the exact information, but what it did do is it found
all the customers that had mentioned declarative automation in one of their sales calls, which was obviously
imprecise, but gave me a hint or at least a floor of how many people use it and interesting customers that use it. It was also terrifying
because those capabilities are extremely expensive in Snowflake. But as a result of this experience, I was like, make a data request ticket so that we have this information first class in the data warehouse. And what the system does is it scoops up that entire conversation. And think about all the context you have: you have who asked for it, you have all this SQL that was generated that is navigating the warehouse and trying to figure out where things are, you have follow-up questions, you might have a conversation. Maybe our analyst jumps in and says, like, oh, yeah, we don't have that because of this historical reason. It scoops all that up, synthesizes it, and creates a data request. Right? And that's an LLM-assisted process. So we're really building what, you know, Databricks calls compound AI applications.
(24:51):
But what it means is kind of injecting where appropriate
LLM augmentation
and processing
into parts of the workflow. And I think we're just at the beginnings of this. As we develop this product
over time, we will be leveraging the conversational system of record much more. You can imagine
doing post hoc processing on threads
(25:12):
that allow you to discern in more detail, like, what context corrections to suggest, for example, scooping up all that information.
You can imagine having observability and insights tools
across all the conversations happening in all different channels across your business so you can understand,
like, what's happening, what data is being requested the most frequently, what data is not being requested.
(25:37):
You know, my vision is kinda like, it's almost, you know how
during COVID,
with Google Trends, you could figure out where COVID was spreading, and it was ahead of any reporting
because people would start asking about, you know, oh, I can no longer taste anything. You could kind of see that go through the country and
(26:00):
be a leading indicator of the true metrics that get reported. And I'm imagining a future where, at a company, you can really get a sense of, like, what people care about in aggregate. If the analytical queries are shifting in the business, that's actually a good insight in a large organization about what people care about, what they're worried about, etcetera, etcetera. So I think there's a huge
(26:20):
space here
to get broad-based intelligence from this conversational system of record and how the context is being accessed. So I think it's a very perceptive question, and there's just a huge amount of greenfield around there.
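To ground the data request workflow he described a moment ago, scooping up a thread and synthesizing a ticket, here is a rough sketch of what such an LLM-assisted step might look like. The call_llm and create_ticket callables are hypothetical stand-ins for a model provider and a ticketing system; none of this is Compass's actual code.

# Rough sketch: synthesize a data request ticket from a Slack thread.
# "messages" is the thread already fetched from Slack; call_llm and
# create_ticket are hypothetical callables standing in for a model
# provider and a ticketing system.
from typing import Callable

def synthesize_data_request(
    messages: list,                            # [{"user": ..., "text": ...}, ...]
    call_llm: Callable[[str], str],
    create_ticket: Callable[[str, str], str],
) -> str:
    # Scoop up the whole conversation: who asked, the SQL that was generated,
    # follow-up questions, and any analyst commentary about why the data
    # isn't modeled today.
    transcript = "\n".join(f"{m['user']}: {m['text']}" for m in messages)
    prompt = (
        "Summarize this analytics conversation into a data request ticket. "
        "Include the original question, why the warehouse could not answer it "
        "directly, and the tables or columns that would need to be modeled.\n\n"
        + transcript
    )
    body = call_llm(prompt)                    # the LLM-assisted synthesis step
    return create_ticket("Data request from Slack thread", body)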
It's interesting too. I was actually, just earlier today having a conversation with somebody who's building an agentic
(26:43):
coding
platform for
doing software engineering in an autonomous fashion, and
it brought up the whole idea of Conway's Law about how the structure of the software is defined by the communication patterns of the organization.
And once you start introducing these agentic systems, that changes the communication patterns to also incorporate those LLMs,
(27:05):
which by necessity
modifies the structures of the software that gets created. And I'm wondering how you see that analogy play out in the context of these agentic analytical systems and the role that it plays in terms of the design and orchestration
of the data assets that you're building and the ways that people are interacting with those data systems. But because we have these LLMs
(27:29):
in play, it is no longer human to human interaction or human to deterministic machine interaction. The LLM then plays a role in that communication system
and modifies the ways that people are interacting with it.
That is really interesting. The Conway's Law analogy, I hadn't thought about, but it makes total sense. Because one of the things I think is happening in the AI era is that nearly every stakeholder relationship is going to be reimagined for this era. And I think part of that is because the new consumption layers facilitate new team organizations, because of Conway's Law. So just as an example that's not in data platforms, what I mean by that is that there's, like, an "all software engineering is going away" boomlet.
(28:13):
Right? That was kind of a big conversation, which I thought was a complete load of, you know, whatever.
It's a family show, right, Tobias?
But, you know, PMs vibe coding does not mean that software engineers are going away. However,
the ability
of PMs to prototype
and build things in the native system of the engineers
(28:36):
fundamentally and completely transforms their stakeholder relationship. And I think this is partially one of these Conway's-Law-esque effects. And that's also what's happening in Compass between the business stakeholder
and the analyst, where the business stakeholder can now kind of do vibe analytics in their own way and communicate directly in the native medium that the
(28:58):
analyst
can understand. And this Conway's Law effect
is why I am incredibly
bullish
on this
UI interaction
of multiplayer
agentic chat
in B2B contexts. So, you know, single-player agentic chat, like ChatGPT
(29:19):
and its competitors,
have completely
remade consumer software and are in the process of doing so. And I think that this multiplayer collaborative chat is gonna be the same order-of-magnitude change in the enterprise.
You know, we're seeing it right now in Compass because, you know, if you stack that on top of the data platform, you effectively don't need reporting functionality
(29:41):
across all of your vertical SaaS apps. It's just in this one spot, which is super, super exciting. And I think this, like, agentic chat is what does it, because
you're bringing in people. You can bring in the random stakeholders. And I think a lot of people's mental model of the agent is, like, someone alone
and talking to the agent in an enterprise context. That doesn't make sense. What the Slack modality does is that the agent is a participant
(30:08):
in a collective
conversation that incorporates workflows,
and that is a super powerful dynamic that also kind of changes the communication structures here. So I think people have the wrong mental model of this. There's also a boomlet about, like, oh, there's gonna be one-person startups that are billion-dollar companies. And, like, I don't really think that's true
(30:31):
either, because I just don't imagine a world where, like, one human is talking to n agents
and building a company like that.
I think it's more, like, there could be fewer people, more hyper-empowered people, but it's always gonna be hybrid, where there's lots of humans and lots of agents and the humans are sort of up-leveling their work. At least, maybe it's just the way I want the world to work, but I think it is the way the world will work.
(31:01):
I think it's also
indicative of just the overall tendency for people to take a proof of concept and extrapolate
to a larger scale where it is not true. I mean, as with anything in software and technology, it's, oh, I built this system in a weekend, so therefore, I can build an entire production company by the end of the week. But the factors of scale are something that nobody ever properly accounts for, where you're dealing with
(31:27):
exponential complexity but logarithmic
capability. And so you're going to diverge sooner rather than later in terms of what you could actually feasibly maintain. And so I think similarly
with that idea of the one-person company where I just have 50 different agents, it's like, your head's going to explode trying to keep up with them. And, eventually, you're going to hit the law of diminishing returns, where the inaccuracies of the agents are going to start compounding, and it's going to drive your multibillion-dollar company into the ground before it ever takes off. And I think it's also indicative of the hype cycle that came out with the initial release of ChatGPT, saying, oh, well,
(32:03):
AGI is now just around the corner. We're going to have it by the end of next year, and now it's in the end of two years or five years or it keeps getting pushed back.
Yeah. And what is even AGI? You know? Like,
it's a very difficult thing to define.
You know, I use the term
technical debt superspreader and things like that. I actually think that's a specific instantiation
(32:27):
of a more general trend that's gonna cut across multiple domains. Because, like, I think with AI, we're going to be entering a complexity crisis
effectively. Like, the ability of agentic systems and humans empowered by agentic systems to produce complexity,
junk content, interrelated concepts that you fundamentally don't understand
is very, very high. So I think that the ability to manage and model complexity will only become more and more leveraged. You know? And that's what I think about when I'm doing agentic engineering: really compartmentalizing complexity in a way where the agents can contribute the right things at the right time. But, yeah, it's going to be a very
(33:09):
complicated world with all these agents running around.
And now bringing us back around to Compass and these agentic systems for AI and the role of the data infrastructure and the data teams in that landscape,
what are some of the ways that the
requirements of the data infrastructure change
(33:30):
when you have to support these agentic systems,
and what are some of the aspects that can remain the same and the agent is able to just use systems as they exist today?
You know, it's such a broad
question.
And, you know, the agentic systems
are so
new, and not that many people have deployed them at real scale, that I think it's actually very difficult
(33:54):
to understand
at this time how exactly it's going to impact everything. You know? A lot of people are like, oh, there's gonna be more unstructured data. I don't even know if that's true, for example, because, actually, for these AI systems to operate over your data and do super-leveraged things, you actually want tons of structure and tons of metadata, tons of context. You know, I think that real time,
(34:19):
more complex
workflows
are going to be incredibly
important.
I'm quite bullish on systems like Temporal,
for example,
to manage
the agentic workflows and the complicated agentic workflows that go on because the ability to pause and resume compute
(34:40):
will be very important. Because, like, one of the interesting things happening is that agentic workflows are so high latency.
Right? Users are now trained
to expect a computer to think for minutes on end on your behalf, which is very different from, say, the web era where, like, every millisecond
counted. And utilizing
(35:00):
computational resources efficiently
in those contexts, I think, is actually quite challenging. There's any number of things I could spout off about, but I think anyone who gives you definitive answers about how all this stuff is gonna impact infrastructure
doesn't really know what they're talking about. Because, like I said, every layer in the stack is getting disrupted,
(35:21):
and the consumption layer is changing.
That implies changes to the compute that's actually running, but also the AI is impacting the way these infrastructure things are built. So there are multiple dimensions of variability right now, and I think it's very difficult to project beyond pure conjecture what's gonna be changing.
One of the other aspects of
(35:41):
bringing an agent into the equation is obviously
cost because LLMs have very unpredictable
cost patterns.
And so
you don't want to route every request through the LLM, especially having it do a huge body of work, if it's something that you already have a stored data asset for. And I'm wondering how you're thinking about some of the methods around
(36:06):
taking some of the common questions and interaction patterns with that agent and being able to then either cache them for quick retrieval
or materialize them into a more durable asset so it's not something that gets recomputed every time or just some of the ways to enforce
the interaction patterns
(36:26):
of the
stakeholders to say, you don't have to ask the LLM this question every time, you can go here for it, or it's going to deliver this to you without you having to take any action, and just some of the means of mitigating unbounded cost.
Yeah. No. You know, I felt this very personally
because I went from zero to, like, a 100 on agentic coding this summer,
(36:51):
and I hadn't signed up for the Claude Max plan. I just used our corporate account,
which doesn't have that sort of high usage limit. And in my first two weeks of Claude Code usage, I cost the company $3,000.
So we were able to get it under control, but, you know, you can consume
a lot of cost doing that. I don't even wanna think about, like, how much natural gas was burned to produce those, you know, 10,000 lines of code or whatever.
(37:20):
So the cost, you know, I mentioned a complexity crisis before. I also think there's a cost crisis coming. And I think the first answer here is that, earlier in this episode, I mentioned that I think context pipelines are kind of the new data pipelines.
And part one of that is that you wanna be precise
about
when and how you recalculate
(37:41):
context.
And that means it's a data pipelining problem, like doing event based computation,
crafting the computation in a very specific way, and then producing it in a highly tailored way so it's perfect for your application.
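A minimal sketch of that kind of event-based recomputation, where a table's context doc is rebuilt only when the upstream data's fingerprint changes; the fingerprint query and regeneration function are assumptions for illustration rather than anything Compass-specific.

# Minimal sketch: recompute a table's context doc only when the upstream data
# actually changes. fingerprint() might run something cheap like
# max(updated_at) or a row count; regenerate() renders the tailored doc.
# Both are illustrative stand-ins.
import hashlib
import json
from pathlib import Path
from typing import Callable

STATE_FILE = Path("context_store/.fingerprints.json")

def refresh_context(table: str,
                    fingerprint: Callable[[str], str],
                    regenerate: Callable[[str], str]) -> bool:
    state = json.loads(STATE_FILE.read_text()) if STATE_FILE.exists() else {}
    current = hashlib.sha256(fingerprint(table).encode()).hexdigest()
    if state.get(table) == current:
        return False  # upstream unchanged: skip the expensive regeneration
    doc = Path(f"context_store/tables/{table}.md")
    doc.parent.mkdir(parents=True, exist_ok=True)
    doc.write_text(regenerate(table))          # recompute only on change
    state[table] = current
    STATE_FILE.parent.mkdir(parents=True, exist_ok=True)
    STATE_FILE.write_text(json.dumps(state, indent=2))
    return True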
So writing data pipelines that become context pipelines, and then matching that with context engineering, meaning taking those produced artifacts and feeding them to the right model at the right time, the combination of those two techniques, I think, is going to be essential
(38:09):
for controlling
costs. You know, because, like, the larger the context window, the more expensive the compute is. Like, prefill is quadratic with respect to context window size, and it determines a ton about model performance. But I think the cost crisis coming is real.
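On the prefill point, the quadratic term he is referring to comes from self-attention: scores are computed for every pair of prompt tokens, so that component of prefill compute grows roughly as

\[ \text{attention cost per layer} \;\approx\; O(n^{2} \cdot d) \]

with n the prompt length and d the model width. As a rough, hedged rule, doubling the context stuffed into every request can up to quadruple that part of the work, which is one reason trimming context is a cost lever and not just a quality one.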
I think the chickens will come home to roost
(38:30):
for a lot of these firms
who aren't
passing through enough of the compute cost to their customers,
and their customers will have a rude awakening and churn. And I think, you know, some of the coding startups are encountering that challenge right now. So, yeah, I think there's gonna be a huge amount of techniques,
and those techniques will stay
(38:51):
extremely relevant even as the models
get better and even as they get cheaper
too. Because some of this context management,
I view it as kind of almost like big-O notation or algorithmic complexity,
meaning that no matter how good Moore's Law is, an O(n²) sort algorithm
can only go so far, no matter
(39:14):
how fast the processor gets. And I think the same thing is gonna be true with context engineering. You know? Like, we're even seeing this now. We're getting to a million tokens and even enormously larger context windows, but they have enormous amounts of diminishing
returns. And it can even be a negative thing if you pollute the context with contradictory
(39:34):
information. You know, this is famously called, like, context poisoning or context rot and all this stuff. So I think, like,
context engineering
is gonna be more and more expansive. I think that is gonna be
a common theme to control cost. And then, beyond that, having more control over fine-tuning. I think there's a whole undiscovered country in terms of democratizing
(39:57):
fine-tuning and then having the model providers build in capabilities so you can do fine-tuning over their closed source models. But it is early days. We are going to burn a lot of money and energy along the way, and it's gonna become increasingly important to control it.
As you have been going through this journey of building Compass,
(40:17):
testing it out, and getting it in front of some early adopters,
What are some of the most interesting or innovative or unexpected ways that you've seen teams apply these agentic capabilities on top of their existing data investments?
That is a good question.
You know, we're still on the order of dozens of users in our funnel, and this is the week of October 6, and we're opening the floodgates a little bit. We have hundreds of people on the wait list. I think the thing that has really stuck out to me is that, effectively, once people connect their data warehouse to the system, we have a 100% retention on the platform,
(40:58):
which is crazy. So
people start using it, and the usage is intense, and they get
tons and tons of stakeholders
in the system. Right? And, internally,
you know, we actually purchased some datasets that we're gonna make public that are effectively the moral equivalent of, like, the PitchBook data. So companies and their fundraising histories and their revenue numbers and all this sort of stuff. And then a people database, which is kinda like the LinkedIn dataset. And just Compass
(41:28):
plus those things
make it, like, the best
prospecting (meaning salespeople finding customers that will be open to purchasing the product) and recruiting tool. It's, like, more powerful than LinkedIn Sales Navigator.
It's crazy. You know, SQL is so powerful. And natural language on top of SQL, doubly so. So we've already seen every single ops role use this tool
(41:52):
effectively
across
recruiting,
across
HR,
across
FinOps,
RevOps, sales ops. There's lots of ops these days. Product ops queries,
doing these sort of things on our own data platform,
in fact. So the breadth of use cases has been pretty awesome.
(42:14):
And,
yeah, you know, it's been great. A lot of our early product-market fit is actually with investment firms, and they use it for interesting stuff. We thought they would use it for trying to find new companies to invest in, but they have a sales pipeline just like anyone else, except their "sale" is investing in something. So they kind of know what stage they're looking at a company in. They have a pretty formalized pipeline. And
(42:40):
generally, there's one investment ops person who kind of manages that, and they have to field requests from the partners, which is often very time sensitive and stressful. But they've actually gotten their partners, the people who run the firms, to use this tool directly
which has been both efficient and an incredible stress reducer, which is literally why, on our marketing site, we can have a pull quote that says, quote, unquote, Compass saved my life, which is always something you wanna hear as a founder.
(43:08):
But the reason why that person said that
is because we saved her not just time, but enormous amounts of stress dealing with time-sensitive requests from very important people. So I think this investor use case has been pretty interesting to see.
In your work of building the system
and
understanding
(43:28):
the
capabilities and use cases and limitations
of an agentic analytics
platform and how to tie it into
existing data infrastructure and data assets, what are some of the most interesting or unexpected or challenging lessons that you learned in the process?
I mean, it's still early days. It's amazing how
once you go from
(43:50):
one person on the go-to-market team being able to interact with the data warehouse to 80% of your team being able to interact with the data warehouse, you really start to see how many gaps there are, both in understanding
and in your data model, but also, like, these gaps in understanding of what people actually care about. So
I think that has been
(44:12):
super interesting to watch roll out in real time.
And what are the situations where you would advise against going down the agentic path for
these
exploratory
or analytical use cases?
Yeah. So, you know, we don't call it a BI tool. We call it exploratory data analysis because it's actually a very distinct use case. Because
(44:38):
BI tools often
drive
absolutely
mission critical
things,
like revenue reporting that is subject
to regulatory scrutiny,
or comp decisions,
or
pricing
decisions.
And Compass is explicitly
not designed for that use case. It is for exploratory,
(45:00):
rapid, directionally correct data analysis, which is a very different use case. So we have no desire to be a replacement
for those core BI assets. We think those should be managed by the BI tools. Kind of one of our principles here is what designers call truth in materials. We don't want to pretend like it's not an LLM. We don't want to pretend
(45:20):
that it's 100% accurate or bulletproof. That's not its purpose. Right? We want it to be rapid, directionally correct, and eventually correct. And by eventual correctness, I mean that the context store gets added to, and then the queries get more and more accurate over time to some kind of asymptotic level. So, you know, there are domains where
(45:41):
absolute precision in all cases is absolutely required. Compass is not for that; it is for facilitating,
as I said, directionally correct, rapid analyses.
And as you continue to
invest in and iterate on this agentic
exploratory
analytics use case, what are some of the things you have planned for the near to medium term or any particular projects or problem areas or capabilities that you're excited to explore?
(46:06):
Yeah. So one thing I'm super interested in, I think for obvious reasons, is deep integration
between Compass and Dagster+ and Dagster.
You know? And this comes in many, many different forms, both using data pipelines to produce context and manage context,
integrating the context store with our operational system of record, and then also
(46:28):
using this tool. You know, we have this ability to create data requests,
which can be, like, very detailed, and then using that as a basis of AI agentic authoring workflows,
which we actually have kind of working already and is very, very effective. So I'm very excited
for that dimension, kind of integrating Compass even more first class into data platforms. I'm also very excited to work on a more enterprise SKU of Compass. I think these kinds of organizational observability
(46:58):
features
will be part of that, as well as sort of on-prem versions, which will have their own challenges but will really unlock usage in a ton of places, deliver a ton of value, and, we feel, be very successful in terms of being a healthy business. And then, just the way this is set up, you know, we can attack all sorts of interesting use cases
(47:20):
one by one by one. You know, just in the initial stages, we view every dashboard in every vertical SaaS app as our opportunity, in effect.
And that's very exciting to see. And I'm excited because so much of the information and knowledge work that happens today
is still so much drudgery.
(47:41):
Manually
fielding a request to add such and such to this Salesforce dashboard and then, you know, hooking this and that up. And I think people are a little too pessimistic about, like, AI taking all of our jobs.
I don't think that will happen. I think people will move up the stack and have to deal with much less drudgery. And that's kind of the way I approach this and what I seek to do in participating in and helping with this product. I think the future is bright. You know, it kind of always comes up, and I'm maybe anticipating a question you might ask, but:
(48:14):
should my kids study software engineering? Is software engineering gonna have a future? And blah blah blah. And I couldn't be more bullish
about the future of software engineering. It's just gonna change the definition of what software engineering is. But the core foundations of learning how computation
works, learning
how to think about this stuff from first principles will only become more leveraged.
(48:36):
Are there any other aspects of this space of agentic analytics, the work that you're doing on Compass, the
leveraging
of existing
data infrastructure and data assets into this more
AI driven
interaction pattern that we didn't discuss yet that you'd like to cover before we close out the show?
No. I think we've done a pretty good job. We've covered a lot of ground, so I think we'll leave it here.
(49:01):
Alright. Well, for anybody who wants to get in touch with you and follow along with the work that you're doing, I'll have you add your preferred contact information to the show notes. And as the final question, I'd like to get your perspective on what you see as being the biggest gap in the tooling or technology for data management today.
It's always
extremely unfair when you ask vendors this because we're morally obligated
(49:21):
to talk our own book. But I am super interested
in, I'm, like, obsessed with, this context engineering
notion. And I think it's gonna be a defining discipline for the next ten years. I think it's super, super early days. I actually think about it a lot because my other kind of passion project right now is figuring out how to deploy
(49:42):
AI and agentic authoring in real and large software systems. And I am very interested in the problem of keeping,
and this sounds simple, but I think it's a big problem,
keeping markdown files checked into a project
up to date with the underlying code. Because I think of these markdown files, generally, that are computed by agents, as, to me, just token caches. Right? The LLM has, like, evaluated a bunch of tokens in the code base and then materialized that knowledge in a more condensed form. Right? And I think that's actually gonna happen recursively in large software projects. But keeping it up to date is actually
(50:17):
another instance of a data pipelining problem because you can't recompute it every time because it ends up being too expensive. So how can you do that intelligently
and keep it up to date? I think it's just one pillar of what is gonna be needed to do AI accelerated
software engineering at scale. That's my term, by the way. I despise the term vibe coding and hope we don't talk about it here.
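For the markdown-as-token-cache idea, a hedged sketch of one way staleness could be detected is to fingerprint the source files each summary was derived from and regenerate only the summaries whose inputs changed; it is the same incremental-recompute pattern as the context pipeline example earlier, and the manifest layout and regenerate hook here are assumptions, not any particular tool's behavior.

# Hedged sketch: treat each agent-written markdown file as a cache keyed by a
# hash of the source files it summarizes, and regenerate only stale ones.
# The manifest layout and regenerate() hook are illustrative assumptions.
import hashlib
import json
from pathlib import Path
from typing import Callable, Iterable

def source_digest(paths: Iterable[str]) -> str:
    h = hashlib.sha256()
    for p in sorted(paths):
        h.update(Path(p).read_bytes())
    return h.hexdigest()

def sync_doc(doc_path: str, sources: list,
             manifest_path: str,
             regenerate: Callable[[str, list], str]) -> bool:
    manifest = Path(manifest_path)
    cache = json.loads(manifest.read_text()) if manifest.exists() else {}
    digest = source_digest(sources)
    if cache.get(doc_path) == digest:
        return False  # token cache still valid: no LLM call needed
    Path(doc_path).write_text(regenerate(doc_path, sources))  # re-summarize only stale docs
    cache[doc_path] = digest
    manifest.write_text(json.dumps(cache, indent=2))
    return True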
Yeah. An alternative term that I heard recently is AI native engineering.
(50:42):
That's pretty good. I will take it. I will take it. Agentic engineering is pretty good too, but I don't know. Agentic is like one of these words now, which I, like, only use as a last resort.
Absolutely.
Well, thank you very much for taking the time today to join me and share the work that you've been doing on your agentic analytics system and the lessons that you've learned there. So I appreciate that, and I hope you enjoy the rest of your day. Alright.
(51:09):
Thanks, Tobias. Thanks for having me.
Thank you for listening, and don't forget to check out our other shows. Podcast.__init__
covers the Python language, its community, and the innovative ways it is being used, and the AI Engineering Podcast is your guide to the fast moving world of building AI systems.
(51:32):
Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. And if you've learned something or tried out a project from the show, then tell us about
it. Email hosts@dataengineeringpodcast.com
with your story. And to help other people find the show, please leave a review on Apple Podcasts and tell your friends and coworkers.