
September 5, 2025 • 13 mins

Explore the latest challenge with Neo4j vector indexes, demystify Model Context Protocol (MCP), and hear insights on vibe coding and Retrieval-Augmented Generation (RAG).

What's Inside:

  • Confusion around Neo4j vector indexes - models and dimensions
  • Why knowing the embedding model matters for vector similarity search
  • The limitations of current Neo4j vector index metadata
  • What is Model Context Protocol (MCP) and why it matters for generative AI
  • Real-world analogies for understanding MCP (microservices, snack choices, Docker containers)
  • The power of MCP servers for secure, modular data access
  • Article highlight: “From Gimmick to Game Changer – Vibe Coding Myths Debunked”
  • How AI coding tools and generative AI are lowering barriers for developers and business users
  • Risk mitigation vs. risk avoidance in adopting new technologies
  • YouTube livestream: “RAG Was Fine, Until It Wasn’t” – lessons from Neo4j Graph Academy’s evolution
  • The importance of focusing on goals over syntax in development

Links & Resources:

Thanks for listening! If you enjoyed this episode, please subscribe, share, and leave a review. Happy coding!


Episode Transcript

Available transcripts are automatically generated. Complete accuracy is not guaranteed.
(00:05):
You are listening to the Breaktime Tech Talks podcast, a bite-sized tech podcast for busy developers where we'll briefly cover technical topics, new snippets, and more in short time blocks.
I'm your host, Jennifer Reif, an avid developer and problem solver with special interest in data, learning, and all things technology.
I went from feeling that there wasn't much to discuss this week

(00:27):
to feeling like I had a lot to say.
I discovered something new about the Neo4j vector index this week.
You can tell me if you think it needs updating as well.
And then there was some recent confusion over not just the definition of MCP, but how it's used and why it's useful.
This has me revisiting the topic a bit to give my take on the subject.
Finally, I caught up on some content that challenges the way we

(00:50):
think about vibe coding and RAG.
Let's jump in.
I was exploring more on the Neo4j vector index, creating a project that integrated with an already existing public database, a Neo4j graph database that I could access in order to utilize the data and the vectors that were already in there.
However, I found that I couldn't figure out exactly what model was being used

(01:15):
for the particular embedding field on the entities that I was looking to search.
So for instance, I'm connecting via an app.
Spring AI requires you to specify a model that you're going to be using for the embeddings.
Because if you send in a question and it creates an embedding using one type of model, but you've created embeddings in your dataset

(01:36):
using another type of model, those two won't line up.
You won't be able to do a vector similarity search on that.
I talk about this in a previous episode of this podcast.
I will try to link that as well: how you need to stay at least within the same model family, if not the exact same model, for when you embed your data and create the vectors, as well as when you send in questions and

(01:59):
create embeddings for those as well.
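To illustrate the point, here is a minimal Cypher sketch of a vector similarity search in Neo4j; the index name, property, and parameter are hypothetical, not the actual setup from this project. The embedding you pass in has to come from the same embedding model (and therefore the same dimension count) as the vectors stored on the nodes.

```cypher
// Hypothetical index name and query parameter; this assumes the stored
// embeddings and $queryEmbedding were produced by the same embedding model.
CALL db.index.vector.queryNodes('chunk_embeddings', 5, $queryEmbedding)
YIELD node, score
RETURN node.text AS text, score
ORDER BY score DESC;
```

If the query vector's dimensions don't match the index configuration, the search will typically fail outright; if the dimensions happen to match but the model differs, you get scores back that don't mean much.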
Spring AI requires you to specify a model, but I went out to the public database with all the data ready for me to use, and I couldn't figure out which model was used to create the embeddings in the data.
I asked a colleague that I knew had helped and contributed in putting together the data.
And I was like, hey, can we figure out what model is being used,

(02:23):
just looking at the vector index?
I can run a show indexes or show vector indexes in the Neo4j database and see the list of vector indexes, but I still can't see which model is being used.
And the person responded that the only thing that you can see is the number of dimensions that are being used for the vector, for the embedding.

(02:43):
The number of dimensions is how long that array of floats is that's created for the vector.
Now this might seem like a pretty good indication; however, there are a few models that overlap.
For instance, there's more than one model that uses 1,536 as the length of the embedding.
This could be really confusing. Even looking at the dimensions, I

(03:04):
wouldn't necessarily know exactly which model was being used.
And also it's a little bit cumbersome, too, to have to look at a number and try to either remember or do a quick search on which model uses this number of dimensions.
When you create a vector index in Neo4j, you can set the configuration for the number of dimensions as well as the similarity function that you want it

(03:26):
to use, either cosine or Euclidean.
As an example, you could create a vector index for a particular node or label on a specific property, and you could set some index configuration for the dimensions and the similarity function.
However, nothing else gets set when you create the vector index, so then after you create it, you could do a show vector indexes, yield star.

(03:49):
If you just show vector indexes, it'll show basic information about the vector index.
But if you yield star at the end of that show vector indexes, it'll pull back everything it has on the vector index.
But still, you're only getting the dimensions and the similarity function that you've set in the configuration.
You're not getting any extra data on that.
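To make that concrete, here is a minimal Cypher sketch of creating and then inspecting a vector index; the label, property, index name, and dimension count are hypothetical examples. The index configuration only carries the dimensions and the similarity function, with no field for the embedding model.

```cypher
// Hypothetical label, property, and index name; 1536 is just an example
// dimension count (one that more than one embedding model uses).
CREATE VECTOR INDEX chunk_embeddings IF NOT EXISTS
FOR (c:Chunk) ON (c.embedding)
OPTIONS {
  indexConfig: {
    `vector.dimensions`: 1536,
    `vector.similarity_function`: 'cosine'
  }
};

// Basic listing versus pulling back everything stored about the index.
// Even with YIELD *, the options only expose the dimensions and the
// similarity function, not which embedding model produced the vectors.
SHOW VECTOR INDEXES;
SHOW VECTOR INDEXES YIELD *;
```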
This would be nice to add when you create a vector index, at least in Neo4j,

(04:11):
to include the model as well, so that somebody who may not have created the data and the database, and is not intimately aware of how it was done, could still figure out the information they need in order to connect applications to it.
I've recently discovered there's some confusion around Model Context Protocol.

(04:31):
Now, I think a lot of us know what Model Context Protocol is, at least the general definition of it.
But how does it actually function within a generative AI system?
And I have mentioned this before on the podcast, but again, I've had a little bit of recent confusion, and I wanna discuss it a bit more here now that I have a better understanding of it as well.

(04:53):
I've started thinking of an MCP server like a microservice, where you've bundled up a set of functionalities and created an entry point or more for an application to access that information.
As an example, you might want to create a customer API that can retrieve customer information, things like recent orders, contact information, and location.

(05:15):
However, you might not want someone to access the amount that customer has spent with your company or any PII, personally identifiable information, that's associated with that customer account.
Instead of creating a database role for an application that locks down certain fields but allows that role to access the database however they see fit, you

(05:35):
might create a microservice, something like a mini application that sets up an endpoint or two that pulls basic customer information and recent orders.
This provides a defined path into the data without trying to limit the broad access that you might have to deal with using a database role.
It also makes a microservice very modular and consistent.

(05:57):
Different projects or applications don't have to craft the exact same queries to retrieve customer information.
They can access the predefined service that's already available that will give them what they need.
I also have another example.
I'm a parent of young kids, so instead of providing a whole snack cabinet for the child to choose from to pick a snack for the afternoon,

(06:18):
you could offer a choice between two or five or 10 or however many options you want to, and let them pick from that list.
It provides some set of limitations while also giving some options and some flexibility there.
MCP is a service, or a tool, for a large language model to access system resources

(06:39):
in a consistent and predefined way.
The large language model doesn't need to craft a query to a database, which potentially could be inaccurate or access information it shouldn't have access to, but instead uses an existing tool that is provided to it.
This combination of large language models and MCP proves powerful and customizable.

(06:59):
Neo4j and so many other vendors offer verified MCP servers that you can integrate into your project.
Think: instead of creating a custom integration to pull recent listings, whether that's projects, customers, what have you, you use a trusted MCP server to access those predefined methods that the vendor has already set up.

(07:20):
You can also think of MCP servers like Docker containers that are provided by vendors.
They are trusted bundles of technology that you can spin up and work with out of the box.
Again, providing simple, basic entry points that you can access to utilize the information the way you need to.
I'll provide a link to Neo4j's MCP servers.
There are several now, and they provide a lot of different options

(07:41):
for ways you might want to connect to Neo4j and use the data within it.
For the content I wanna talk about, there are two different things.
The first is an article called “From Gimmick to Game Changer – Vibe Coding Myths Debunked.”
This is from my colleague Michael Hunger, and I really liked the exploration here of looking at vibe coding as a tool in your toolbox, but also needing

(08:05):
to set up some sort of boundaries and guidance for how to use it well.
Vibe coding often gets a bad rap.
There definitely do need to be some guardrails and processes put in place around it, but it also opens the door to opportunities that never existed before.
There are a couple of brief things I'll cover from my own experience.
I was asked a few weeks ago to step in and cover a workshop that

(08:29):
was based on a Python project.
I'm not a Python developer, as many of you might know.
I'm a Java developer by trade.
I needed to come up to speed on some of the Python things and tweak a few things for some of my preferred approaches to instruction.
I was able to go in and, with some coding tools, adjust things.

(08:50):
I focused on what it was I wanted the attendees to get out of the training or how I wanted to present the information.
And I let the AI coding tools decide what that Python syntax needed to look like.
Another option is looking at media content.
For instance, video and podcasting tools, where I'm able to produce content

(09:11):
without learning all of the ins and outs of media and video and audio editing.
It allows me to participate in certain ways that maybe I couldn't before.
Now, that's not to say that I'm not learning the ins and outs as I go along, but it definitely sets me up for better success in the long run because I'm able to get off the ground and running faster by incorporating some

(09:35):
of these tools into my tool belt.
The article goes into how business users might not have been able to explain their business requirements in a technical environment, or might've had to deal with multiple conversations back and forth to produce what they were actually looking for.
Now, with some of these generative AI tools, they might have a seat at the table to help create better-quality proofs of concept or visuals that can

(10:00):
better communicate what's needed and what the results should look like.
Also, think about those in communities who might not have some of the resources that other communities might have, think levels of instruction, technology, expertise, finances, formal programs.
They might now have access to learn and explore almost anything with a laptop and a few LLM credits.

(10:22):
The article also digs into an analysis of risk mitigation versus risk avoidance.
If you think about the major changes that have happened to society over time, things like going from horses to cars as modes of transportation, land to air travel, even letting kids try something new, or any major change in society,

(10:43):
there are ways to approach each of those things that can help minimize risk, or at least not let you take risks unnecessarily.
And that's a wiser approach that we should take with AI as well.
We can consider how to minimize the risk or make the outcomes potentially less painful by setting up some guardrails and some good

(11:05):
processes that help us use it safely while still helping us move forward.
The first point in the article talks about something that I mentioned earlier in regards to this article, and it's where you can focus on the goals of what you're trying to do versus how to do them.
Oftentimes as a developer, I feel like I spend a lot of time fussing with the

(11:26):
syntax or clean code or getting something just to work, when it would be a far better use of my time to focus on how the process should flow or the objective that I'm trying to get to on the other side.
I think this is also a really good case for the AI tools and integrating them into our workflows, because we can focus on the goal and let these

(11:48):
tools help us provide the syntax and the specific steps along the way.
The other piece of content I came across was a YouTube video from a colleague and a former colleague of mine, Will and Adam, and it's a livestream called “RAG Was Fine, Until It Wasn't.”
I found this really interesting because it's showing a real-world use case on the

(12:09):
progress of the solution behind the Neo4j Graph Academy online training platform over the course of the last several years, going from non-generative AI solutions to now incorporating generative AI and using it to help people, and developers specifically, learn technology better.
It uses data points and feedback to help influence the opportunities and

(12:32):
the way that they have designed and built the system over time to provide some really unique, custom-built training opportunities for people coming in and trying to learn Neo4j and graphs.
So if you're interested in something like that, I would highly recommend you check out the livestream.
I will link the recording in the notes.
From vector index features to metaphors that help explain Model Context

(12:54):
Protocol, or MCP, I hope you enjoyed this week's dive into what I've learned.
As always, thanks for listening and happy coding.