Episode Transcript
Available transcripts are automatically generated. Complete accuracy is not guaranteed.
(00:05):
You are listening to the Breaktime Tech Talks podcast, a bite-sized tech
podcast for busy developers where we'll briefly cover technical topics, new
snippets, and more in short time blocks.
I'm your host, Jennifer Reif, an avid developer and problem solver
with special interest in data, learning, and all things technology.
I think summer is supposed to be a quieter period for work and activities and events
(00:29):
and such, but I haven't found that to be the case for me this year, anyway.
This week, I'll catch you up on what I've been working on the last few weeks.
First, I gave a GraphRAG workshop last week where I needed to learn some
Python and the Neo4j GraphRAG library.
That was a little bit of an adventure, but I'll talk about my experience
there and what I accomplished.
Then I've had a busy few weeks making progress on lots of different content
(00:53):
that I'll be using throughout the rest of the summer and fall for presentations,
live trainings, and much more.
I also have mentioned in previous episodes that I was struggling to
do GraphRAG with Spring AI advisors.
There's a few implementations out of the box that they provide using their
advisors, and I wanted to be able to at least do the vector search component with
(01:15):
that, and I'll catch you up on what I've learned and where I'm going with that.
I was finally able to hash out some kind of solution there for using
their Spring AI advisors for GraphRAG.
I'll talk more about that.
That is actually what I'm going to start with for today: the Spring AI advisors.
Now they have a couple of advisors that come out of the box in
(01:35):
Spring AI for vector search and retrieval augmented generation.
So it's their question answer advisor and their retrieval augmentation advisor.
I've talked about this a bit in previous episodes, but it provides a
customizable implementation of retrieval augmented generation in generative AI.
(01:56):
I wanted to be able to use the default, out-of-the-box implementations of
these advisors and then tack on GraphRAG so that I wasn't doing a
full GraphRAG pipeline manually.
Previously, when I've done it manually, I've done the vector search
and set those results to a variable.
And then taken that, passed it into a graph retrieval query, and returned those
(02:19):
results into a variable, then stuffed the prompt with that variable and sent it
over to the LLM to generate a response.
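That manual flow (vector search, then a graph retrieval query, then prompt stuffing) can be sketched roughly like this. This is a toy Python sketch with hypothetical stand-in functions and data, not the actual Spring AI or Neo4j API calls:

```python
# Hypothetical sketch of the manual GraphRAG flow: the function names and
# toy data are illustrations only, standing in for real vector-store and
# graph-database calls.

def vector_search(question, documents, top_k=2):
    """Stub similarity search: rank documents by words shared with the question."""
    scored = sorted(
        documents,
        key=lambda d: len(set(question.lower().split()) & set(d.lower().split())),
        reverse=True,
    )
    return scored[:top_k]

def graph_retrieval(similar_docs, graph):
    """Stub graph query: pull entities connected to each similar result."""
    related = []
    for doc in similar_docs:
        related.extend(graph.get(doc, []))
    return related

def build_prompt(question, similar_docs, graph_context):
    """Stuff the prompt with both the vector results and the connected graph data."""
    return (
        f"Question: {question}\n"
        f"Similar results: {'; '.join(similar_docs)}\n"
        f"Connected graph data: {'; '.join(graph_context)}\n"
        "Answer using only the context above."
    )

# Toy data standing in for a vector store and a graph database.
docs = ["graph databases store relationships", "llms generate text"]
graph = {"graph databases store relationships": ["Neo4j", "Cypher"]}

similar = vector_search("how do graph databases work", docs)   # step 1: vector search
connected = graph_retrieval(similar, graph)                    # step 2: graph retrieval
prompt = build_prompt("how do graph databases work", similar, connected)  # step 3: stuff prompt
```

The stuffed prompt would then go to the LLM; the point is that each step's output is held in a variable and handed to the next step by hand.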
I finally have the default question answer advisor working.
I will leave a GitHub repository link.
It's a separate branch of the Spring AI goodreads repository that
I have under my GitHub account.
(02:39):
And then I also have a separate approach for using all completely custom advisors.
So I used their default question answer advisor for one GitHub
branch, and then tacked on a graph retrieval advisor that's custom.
But then I went completely custom and built both a vector retrieval
advisor and a graph retrieval advisor from scratch as well.
(03:02):
And I have a separate branch on my GitHub repository for that side of things.
The whole implementation of the advisors, plus the custom approach,
is a bit more complex and more code than what I had hoped.
But everything does work and seems to operate very well.
It adds a little bit more context going to the large language model
(03:23):
as well, which is gonna make your overall message length a bit longer.
In a real case, I would probably trim out some of the unnecessary
data that I was adding at each step in order to pass that context along.
I've mentioned this in previous episodes, but I'll walk through it briefly here.
Typically you do a vector similarity search.
You take those results, and Spring formats them as a generic Document
(03:48):
entity type for the similar results that come back for the input query.
And then you take those and you pass those similar results into a graph
retrieval query to pull related, connected entities for those similar results
in your graph database, in our case.
And then you retrieve those and you send that to the large language model and say,
(04:10):
Hey, what are the recommendations, or what is the best answer based on the similar
results and the graph data, the connected data that I've retrieved for this input?
And I've done this all manually, as I've mentioned.
With the advisors, you tack those onto your chat client call, and it will
run the vector similarity search, for instance, and add that as a parameter to
(04:34):
your context for when it goes and sends the message to the large language model.
That worked really well just doing a vector search.
But then tacking on that graph piece, I had previously had trouble
formatting and passing along those results from the vector search
into the graph retrieval query.
I have now figured out a way to do that.
You just add a separate parameter to the context and then pass
(04:57):
that into the next thing, and then you can read that back out.
Now, I had mentioned that I was struggling to convert from the
type of format that the context was coming in as, and then reading that
and parsing it and getting the graph retrieval advisor to read that as well.
I was able to do that, but it is a forced conversion from one
(05:18):
entity type to another, which is not really my favorite approach.
But again, it's working and I'll try to see if there's a
cleaner, nicer way to do that.
But that's where we are right now.
And again, it is working.
The other thing that I had been missing in previous iterations of
this is passing from one advisor to the next.
Because I have multiple advisors going, I need to call the advisor chain and call
(05:41):
their next call method on that in order to pass along and go on to the next advisor.
If you have multiple advisors, and you're customizing that, you do need
to use that advisor chain's next-call method in order to go ahead and pass
that on and move on to the next step.
That was a piece that I had been missing, so it wasn't moving on into that graph
(06:02):
retrieval section because it wasn'tgetting that next call method triggered.
So I've added that in.
Now I am able to pass from one to the next.
I'm able to read from one thing to the next, and everything is working again.
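The chain mechanics can be illustrated with a small chain-of-responsibility sketch. The class and method names below are hypothetical Python stand-ins, not Spring AI's actual advisor API; the point is that each advisor writes its results into a shared context as a parameter, and each advisor must invoke the chain's next call or the later advisors never run:

```python
# Hypothetical advisor-chain sketch (chain of responsibility). Names are
# illustrative stand-ins for the Spring AI advisor concepts described above.

class AdvisorChain:
    def __init__(self, advisors):
        self._advisors = advisors
        self._index = 0

    def next_call(self, context):
        # Hand the context to the next advisor in line, if any remain.
        if self._index < len(self._advisors):
            advisor = self._advisors[self._index]
            self._index += 1
            advisor(context, self)
        return context

def vector_search_advisor(context, chain):
    # Pretend vector search; stash the results as a context parameter.
    context["vector_results"] = ["doc-1", "doc-2"]
    chain.next_call(context)  # forgetting this call stops the pipeline here

def graph_retrieval_advisor(context, chain):
    # Read the earlier parameter back out and build on it.
    docs = context["vector_results"]
    context["graph_results"] = [f"entities-connected-to-{d}" for d in docs]
    chain.next_call(context)

chain = AdvisorChain([vector_search_advisor, graph_retrieval_advisor])
result = chain.next_call({"question": "recommend a book"})
```

If `vector_search_advisor` skipped its `chain.next_call(context)` line, `graph_retrieval_advisor` would never execute, which mirrors the missing-next-call symptom described above.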
I will share the advisor code in the show notes so that you can check
out the two different branches.
One for using the default question answer advisor and a
(06:23):
custom graph retrieval advisor.
And then another approach for creating separate, completely custom
vector search and graph retrieval advisors.
That's one thing.
Then the next thing, I had mentioned that last week I gave a workshop that was
exploring the GraphRAG Python package.
Now this is a Neo4j GraphRAG Python package, and it creates a composable
(06:45):
pipeline of components for knowledge graph construction to use in GraphRAG.
It has components for a data loader,a text splitter, a chunk embedder,
a schema builder, lexical graph builder, entity and relation extractor,
graph pruner, knowledge graph writer, and then an entity resolver.
Again, there's a lot of pieces there.
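Conceptually, a composable pipeline like this chains components so each one's output feeds the next one's input. Here's a toy sketch with simplified, hypothetical components, not the actual neo4j-graphrag classes:

```python
# Hypothetical composable-pipeline sketch. The component classes are
# simplified stand-ins for the kinds of pieces listed above (text splitter,
# chunk embedder, entity extractor), not the real neo4j-graphrag API.

class TextSplitter:
    def run(self, text):
        # Split the source text into fixed-size chunks.
        size = 20
        return [text[i:i + size] for i in range(0, len(text), size)]

class ChunkEmbedder:
    def run(self, chunks):
        # Fake embedding: represent each chunk by its length.
        return [{"text": c, "embedding": [float(len(c))]} for c in chunks]

class EntityExtractor:
    def run(self, embedded_chunks):
        # Fake extraction: words starting with a capital become entities.
        entities = set()
        for chunk in embedded_chunks:
            for word in chunk["text"].split():
                if word and word[0].isupper():
                    entities.add(word)
        return sorted(entities)

class Pipeline:
    def __init__(self, components):
        self.components = components

    def run(self, data):
        # Feed each component's output into the next component's input.
        for component in self.components:
            data = component.run(data)
        return data

pipeline = Pipeline([TextSplitter(), ChunkEmbedder(), EntityExtractor()])
entities = pipeline.run("Neo4j stores graphs. Cypher queries them.")
```

The design only works when adjacent components agree on data formats, which is exactly where the customization issue described below comes in.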
(07:06):
Everything is documented, though, and it's
a bit complex to sift through.
However, I found that the API documentation was really nice.
There was good explanation of what needed to be passed into certain
methods, what methods were available, what things were being expected,
and what the output of that was, and so on.
(07:27):
I struggled a little bit with kind of the overall conceptual thing.
I was pinging a colleague back and forth on some of that.
But the API, I was able to sift through and understand what needed to be done
and what needed to be passed where.
I did run into an issue with that package where I wanted to customize two components
in the pipeline and then pass those two custom components to a full pipeline
(07:52):
and just have it run the full pipeline.
One piece of the pipeline requires a different input data format and doesn't
seem to convert automatically, so I ended up needing to run each component
separately rather than the whole pipeline.
So just something to be aware of.
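The workaround looks roughly like this: run the components individually and convert the data in between, rather than handing everything to one full-pipeline run. The component names and data formats here are hypothetical illustrations, not the real package API:

```python
# Hypothetical sketch of running pipeline components separately when their
# input/output formats don't line up. These classes stand in for the idea,
# not for actual neo4j-graphrag components.

class ChunkProducer:
    def run(self, text):
        # Produces a list of plain strings.
        return [part.strip() for part in text.split(".") if part.strip()]

class EntityWriter:
    def run(self, records):
        # Expects a list of {"id": ..., "text": ...} dicts, not bare strings,
        # so it can't directly consume ChunkProducer's output.
        return [f"wrote node {r['id']}: {r['text']}" for r in records]

chunks = ChunkProducer().run("Neo4j is a graph database. Cypher is its query language.")

# Manual conversion step that a full-pipeline run would not do automatically:
records = [{"id": i, "text": c} for i, c in enumerate(chunks)]

written = EntityWriter().run(records)
```

Running each component by hand gives you a place to insert that conversion; a single full-pipeline run would fail at the format mismatch.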
Depending on which pieces and components of that pipeline you
want to run, you might not be able to pass custom components into the
(08:15):
full pipeline, based on your input/output data structures there.
If you wanna check out the GraphRAG Python package, I will send a link to that in
the show notes and you can check that out.
It was a really nice package.
I think there's a few things that will continuously be improved and updated
as time goes on, but this was a nice start and I was able to sift through it.
(08:36):
Especially as a non-Python developer, I was surprised and impressed
that I was able to work on this.
Then I have some updates on just general developer advocacy
stuff that I'm working on.
I mentioned that I gave the workshop last week.
I worked really, really hard to help people be successful and understand
how to work with Neo4j and generative AI and GraphRAG all together.
(08:59):
Overall, the workshop went really well. The setup piece for just
getting infrastructure up and running,
for instance a database and data loaded, is always the longest pole in the tent.
We are making some progress on some updates, and I'm working hard.
I'm gonna give this again here in a few weeks at another event, so I'm trying
to make some improvements there with some colleagues' help and see if we
(09:19):
can streamline that infrastructure setup and make it a bit more efficient.
It was also really nice to connect with some colleagues that I either
hadn't met at my company or I hadn't seen in a really long while.
That was lots of fun, but the week was super exhausting as well, and
I needed a bit of recovery time.
I also have a bunch of content that's upcoming.
(09:39):
There's lots of conferences and events going on in August, so keep
an eye on my social channels if you're interested in seeing what
I'm working on and what I'm doing.
I have some online training that I'm starting to put some
things together for.
There's a tech book that possibly will be upcoming in the fall.
And then finding good data sets for all of this content has always been a challenge,
especially when you're looking at built-in connections for dealing with graph data
(10:03):
plus AI capabilities on top of that.
I'm hoping too that I can add in some tech blogs here and there as I
go throughout the rest of this year.
We'll see where life takes me.
I also wanted to mention that I'm playing with some AI tools
and I'm shifting my approach in the way that I'm doing learning.
I've just noticed this over the course of the last few weeks.
(10:25):
I spun up Claude Code and have been playing around with that, and
pulled down VSCodium and was able to install most of my extensions
that I had been using in VS Code.
So I'm exploring and playing around with this IDE and the functionality there.
It has been a really nice combination using Claude Code in the terminal.
This is where you'll have to have a subscription if you're looking at Claude
(10:47):
or Claude Code, but you can pop up Claude inside one of your projects or your
folders, and it's very good about asking, are you sure you want to give access to
this folder or these files in this folder?
And so on.
But it's really nice for getting second opinions or brainstorming
ideas as I'm operating through content or development work or just social
(11:09):
ideas and where I should go and what pathways I should follow and look at.
It is also very interactive on what you allow it to apply and change
within those files and directories.
I found this to be really nice. Before it makes any changes,
it says, hey, here are the changes and the diffs that we're looking at.
Are you sure you want to apply these?
Or no, hey, I want to tweak this and ask for something else.
(11:30):
So it is very interactive, and I found this to be a really nice experience,
especially if you like working in a terminal interactive chat structure.
The content piece I wanna look at this week is graphrag.com.
Now, I knew this had existed earlier this year, but I haven't
really had a chance to really dig in and look at all that is there.
(11:53):
But I have looked through a few things, and I've been providing this as
a resource for a few different groups that I've talked to lately.
And it is a really comprehensive, really nice resource for what GraphRAG is.
It's everything you need to know about GraphRAG. It talks about what it
is and what knowledge graphs are.
How you approach chunking, what are retrievers, and what do the different
(12:17):
types of retrievers look like.
How you approach named entity recognition.
So basically reducing duplicates and consolidating duplicate data.
And then different types of graph shapes, things like hierarchical and
memory structures, and how do you deal with all those different sorts of
graphs and the formats and kind of the architecture that you put in there.
(12:39):
I will link graphrag.com if you wanna check that out.
This is not just a Neo4j resource.
It's something that the broader community is contributing towards and updating
to help educate people on what GraphRAG is and how to do it more successfully.
There are so many fast-paced updates going on in this space right now, and it's hard
to stop and review the accomplishments.
(12:59):
I explored outside my comfort zone with Python and the Neo4j GraphRAG
Python package. Then I have a lot of upcoming projects that will
likely have a longer timeline.
I'll keep you posted on releases as they come.
I finally had a breakthrough in implementing GraphRAG using Spring
AI's advisors, and I mentioned how my learning is starting to shift over time.
As always, thanks for listening and happy coding.