
March 26, 2025 57 mins

Ever wondered how companies like Amazon or Pinterest deliver lightning-fast image search? Dive into this episode of MongoDB Podcast Live with Shane McAllister and Nenad, a MongoDB Champion, as they unravel the magic of semantic image search powered by MongoDB Atlas Vector Search!

🔍 What You’ll Learn:

  • Why semantic search is a game-changer for AI-driven applications (spoiler: it’s all about meaning, not just keywords!).

  • How to transform images into vectors (embeddings) and store them alongside your data—no clunky ETL pipelines required!

  • A live demo showcasing real-time image search on a 40,000+ product dataset (spoiler #2: it’s blazing fast and shockingly simple to build!).

  • Pro tips on choosing embedding models, optimizing indexes, and avoiding common pitfalls.

  • Why MongoDB’s unified data model crushes traditional databases for AI/ML workloads.


    👥 For Developers & Data Engineers:
    Whether you’re a JavaScript developer, an AI enthusiast, or just tired of JOINs, this episode is your roadmap to building smarter, faster apps. Plus, learn why companies like Toyota and Verizon trust MongoDB for mission-critical AI workloads.


Episode Transcript

Available transcripts are automatically generated. Complete accuracy is not guaranteed.
(00:06):
Welcome everybody to MongoDB Podcast Live. I'm Shane McAllister, one of the leads here on the developer relations team at MongoDB. Today I'm joined by Nenad, who wears many hats, and we're going to dive into that during the introductions; I'll let him explain that himself. So, without further ado from me, let's get started. Nenad, you're very welcome to the show.

(00:27):
How are you? I'm good. Thank you for having me. Very good, very good. I'm certainly looking forward to this. Nenad, as I said in the intro, you wear many hats. I've met you in person before; I was delighted to have done that at our MongoDB .local session in London last year, where you were interviewed live on our MongoDB TV channel with Harshit, one of the community

(00:50):
managers here at MongoDB. But tell us a little bit about how you came into the fray to be what you are today, which is a MongoDB Champion for us. Yeah, so I'm Nenad. I work in flat field as a senior engineer, and I'm also a MongoDB Champion and the MongoDB MUG (MongoDB User Group) leader for Serbia.

(01:12):
I basically started in, you know, the freelance world, where I had to learn everything myself in order to be competitive on the market. And, as probably everyone, I also started with relational databases in the beginning, because that is what they teach us in college. But when I was learning things on my own in the freelance world,

(01:34):
I discovered MongoDB. And since I'm coming from the JavaScript ecosystem, I found MongoDB to be the perfect fit for me, because the MongoDB document model would directly map to my JavaScript objects, and the schema was completely flexible. Because when you are developing applications, especially as a

(01:56):
freelancer, it's mostly start-ups, so the application is usually evolving as things are getting discovered. You don't always know everything that has to be done in the beginning, so there are a lot of changes to be made, and that is where you always have to remodel something in your code, redesign the schema, or refactor the code base.

(02:19):
But MongoDB made it perfectly simple: you can just add a field, remove a field, or change the type. The MongoDB schema was a perfect fit for me, and since then I have been using MongoDB exclusively. Excellent. That's a perfect pitch; you did my job for me there.

(02:39):
Nenad, thank you so much. So you started to use MongoDB, and you started to, you know, contribute on our forums as well and get involved in our wider community there. And then I've seen your name a few times on our MongoDB Developer Center as the author of some of the articles. Tell us a little bit about that. Yeah. So, since I'm practically

(03:02):
a developer, I was visiting Stack Overflow every day, multiple times. And after I gained enough knowledge that I was pretty comfortable with, I started answering questions on Stack Overflow, mostly about MongoDB and JavaScript, especially Angular and Node.js, which is my stack. And eventually people saw

(03:24):
my answers there, and I was one of the main contributors. I also started contributing on the MongoDB forums, which are practically a Stack Overflow only for MongoDB. And that is when Veronica invited me to join the MongoDB community officially. I was not a Champion back

(03:45):
then; I was only a Creator. But as a Creator I was able to write blog posts on the official MongoDB blog, and that was a big exposure for me. I was also writing on my own: I started my own personal blog, and I was writing about MongoDB there too. After that, I was invited to help the MongoDB

(04:10):
certification team by helping them create and review questions for the official MongoDB certification, and, thanks to Jordin, the lead for the certification team, who nominated me for the MongoDB Champion title, I got it last year. Excellent.

(04:30):
Well, listen, we're very lucky to have you. I see you're wearing some of the swag that the Champions have as well too, so that's brilliant. And the Champion's job: what do you do? I know you mentioned the meetup groups, the MongoDB MUGs. Tell us a little bit about that, and in particular what you do in Serbia, please, Nenad. Yeah, so in Serbia I'm leading

(04:52):
the Serbian MUG with my friend Alexa. We are organizing meetups about MongoDB topics, and we have had three meetups so far, with interesting topics about RAG systems and vector search in general. And we are scheduled to organize the fourth meetup in February.

(05:14):
So, for everyone from Serbia listening to this live stream, I would invite all of you to join us there. Other than that, as Champions we are basically trying to contribute as much as possible: with the certification team, with writing blogs, and with helping the community by

(05:35):
answering their questions on various community platforms, like the Stack Overflow that we mentioned and the MongoDB community forums. So basically, we do everything we can to help the community. Excellent.
But listen, I couldn't have said it better, and I don't think Veronica or Harshit on our team could have done it better

(05:56):
either. Nenad, that's brilliant. And as you mentioned, it's Serbia where you're running the MUG. If you go to community.mongodb.com (I had the banner up a little bit earlier), you can go and see where we have the MongoDB meetup groups, and hopefully there's one near you and you turn up. It's a very easygoing community. You're going to learn something.

(06:16):
You're going to maybe get some beers, get some pizzas, and meet like-minded developers as well too.
your area, we're always looking for meet up leaders and we're
looking for people who are quitehappy to do exactly as you've
done. Nina is is kind of say, put
their hand up and say, look, I've used MongoDB, I want to
meet other like minded people and, and you can create your own

(06:40):
mug and we've got resources and facilitators to help you build
that and send you out some swag and T-shirts and, and everything
that you might need as well too.And indeed, in kind of in
conjunction with that, I mentioned it in the introduction
that you had joined us on the live stream at MongoDB dot local
dot local are Mongodb's events that we tour around the world.
We used to just do marquee events in London and New York,

(07:02):
but last year we took them to over 20 cities around the world. So if you go to mongodb.com/events, it's probably not populated for 2025 just yet, but if you go there in a while you're going to see our full calendar of MongoDB .locals, which is another way to meet like-minded individuals and get involved in the wider MongoDB community as well too.

(07:28):
That's a huge introduction. You've got so many hats. As I said at the beginning, what we're here to talk about today is semantic image search with MongoDB. And Nenad, we've got a wide audience; I'm looking here in the comments, and people are joining from Portugal, Atlanta, Israel, India, Afghanistan, et cetera, too.
And it's great. Please drop into the comments

(07:49):
and say hello. So talk to us a little bit. We might have heard about semantic search, we might have heard about generative AI, we might have heard about vector search. Tell us a little bit of the background before we dive in because, as I said, you've got an exciting demo to show us and I'm thrilled to see that. But why is semantic search important, and how do vectors help us do that?

(08:12):
So guys, we are about to cover semantic search in general, but especially with MongoDB. As a high-level agenda, we are going to introduce embedding models and what vector embeddings are, and how we can do vector search in MongoDB Atlas.

(08:35):
And in the end, we will go through the code walkthrough, and we will showcase the application demo as an outcome. By the end, you will be familiar with both the theoretical and the practical concepts of how to implement semantic search in any application.

(08:57):
This, on the right side, is actually the application that we are about to build, but we will cover that more at the end. So, an introduction to embedding models. What are embedding models? Embedding models are models whose purpose is to embed unstructured data in a way

(09:22):
that captures the semantic meaning and, as the output, produces an array of numbers. So basically, all of the semantic meaning is captured in that output vector, which is practically an array of numbers. The inputs can be any unstructured data, like images,

(09:45):
documents, audio, videos, PDF files or whatever.
So the input unstructured data, once it goes through the embedding model, will produce the embedding vector, which captures the semantic meaning of that unstructured input data

(10:07):
and that's, I think, a really key point. So we all know about searching; we've been searching for as long as the Internet has been around as well too. And when we search, we're searching for a match, or perhaps a fuzzy search,

(10:28):
which is kind of a 'like' or 'similar to', et cetera. But only recently, with semantic searching, are we able to derive meaning from the data. So, as you said there, we can put in an image, we can put in audio, or we can put in larger documents, and the vectorization, the embeddings that happen, allow us to categorize the meaning of that, essentially. Isn't that true? Yes, exactly.

(10:50):
So both the meaning and also the relationships. The main purpose of the embedding model is to capture that semantic meaning and the relationships, and produce the output in a way that can be understood and processed by machines, which is the embedding vector.

(11:10):
OK. And when you say relationships, you're considering there that similar items or similar content are stored in an adjacent area in the vector database, so that they can easily be paired or matched together. Right, exactly. Excellent; I got ahead of myself. You got it? Yeah. Yeah. So what are the vector embeddings?

(11:31):
So, as we said, embedding vectors are the output of the embedding model, which captures both the semantic meaning and also the relationships. That is the most powerful point. In this example here on the right side, which is of course a simplified example, we have words like cat, kitten, dog, or houses, and

(11:55):
a vector with 7 dimensions is created for each of them, for demo purposes. Each item in the embedding vector represents one feature, and the number represents how much that feature contributes to the

(12:19):
whole context. When we then want to plot that (for plotting, dimensionality reduction is used so that we can see it), we can see that cat and kitten would be really close to each other, because they are semantically really similar.

(12:40):
Dog would be close to cat and kitten, but much further away than kitten is from cat; yet it is still closer than houses. So we can see that these would be the animals, and this would be something else. The next step is that it captures the relationships.

(13:00):
The example below is a good representation of that, where we can see that, for example, man and woman are close to each other, while king and queen are far away from man and woman; but the distance between king and queen is about the same as the distance between man and woman.
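To make that concrete, here is a minimal JavaScript sketch with made-up numbers; real embedding vectors have far more dimensions (1,408 in the demo later), but the geometry works the same way: semantically similar words end up closer together.

    // Toy 3-dimensional "embeddings" with invented values, purely for illustration.
    const vectors = {
      cat:    [0.90, 0.80, 0.10],
      kitten: [0.85, 0.90, 0.10],
      dog:    [0.70, 0.40, 0.20],
      house:  [0.10, 0.10, 0.90],
    };

    // Euclidean distance between two vectors of the same length.
    const distance = (a, b) =>
      Math.sqrt(a.reduce((sum, ai, i) => sum + (ai - b[i]) ** 2, 0));

    console.log(distance(vectors.cat, vectors.kitten).toFixed(2)); // 0.11, very close
    console.log(distance(vectors.cat, vectors.dog).toFixed(2));    // 0.46, further
    console.log(distance(vectors.cat, vectors.house).toFixed(2));  // 1.33, far away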

(13:21):
So the relationship part is maintained. An embedding model captures both the semantics and the relationships. So, where did this all begin? In the 1990s, we had something called RNNs and LSTMs.

(13:42):
That is, Recurrent Neural Networks and Long Short-Term Memory networks, the second being a subset of the first. They were basically used to process sequences, but they didn't work that well until 2017, when Transformers were introduced in a paper called

(14:04):
'Attention Is All You Need'. And that Transformer model, which is actually this image on the right side representing the core architecture of Transformer models, is what opened the door for future improvements. And it is the base of today's LLMs that we can see,

(14:26):
like GPT, Gemini, Code Llama from Meta, and others. So the embedding model is actually based on the architecture of the Transformer, because the Transformer consists of two parts: the left part is the encoder of the Transformer, and the whole

(14:49):
embedding model is actually based on this encoder part of the actual Transformer. So that is basically how we got to embedding vectors and embedding models. Perfect, yeah. So what are the applications of embedding models? There are a lot of use cases.

(15:10):
First is natural language processing, where we can have sentiment analysis, for example on social media, where you would like to see whether some comment is positive or negative. Then you have text classification; that is on the right side, where we have a visualization of book embeddings, where we can see that

(15:34):
even in two-dimensional space, as in this image: these ones are non-fiction, and the ones on the far right side are the science fiction ones. So there is separability between the classes even in two-dimensional space. So it is really good for search and also for information retrieval.

(15:55):
There is the semantic search that we are going to cover today with our demo application. And it can also be used, for example, for recommendation systems, where maybe Netflix would like to suggest you a good movie based on what other people similar to you like. So that is how

(16:15):
recommendation systems would be built. And we can also use it for clustering and similarity analysis. Excellent, excellent. Great to see. So let us start with the introduction: what are vector databases?
Vector databases are databases that can efficiently store high-dimensional vectors and search through these vectors

(16:40):
as a whole. We have two types of databases, as I called them here: traditional databases and dedicated databases. Dedicated databases are the ones that store only the vectors and are specialized only for vector search. On the other side, we have

(17:01):
traditional databases that also have vector search capabilities, among which is, of course, MongoDB. That allows us to store the general data of our application together with the embedding vectors, with the ability to execute the vector search, which in my

(17:21):
opinion is much more intuitive and useful, because you don't need to maintain two different data sources; you do everything in one data source. For example, in MongoDB you would execute the vector search, but all the metadata about the things that you are searching for is already there, so it's easy for you to fetch and process in your application.

(17:44):
Yeah, and that's a really good point. And I suppose if people want to learn a little bit more, they can just go to the MongoDB site, use the Products drop-down there, and search for Atlas Vector Search as well too. We launched it in June of 2023 at our .local in New York, and for us it's been a game changer.

(18:06):
It really allows MongoDB to leverage, you know, the data that's already in Atlas, or the data that's not yet in Atlas but that you're going to bring there, and to use it in your generative AI applications. Because, as you said, Nenad, quite eloquently: your data's there, and your vectors are stored alongside the data in the same document. Whereas before, with traditional vector-only databases such as

(18:29):
Pinecone, for example, you were doing a big round trip: you were managing two data sources, and you were essentially doing a lot more extract, transform, and load of your data, a lot more ETL pipelines, right? Exactly, exactly. So now, how do we measure the similarity between two

(18:49):
vectors? There are basically four main functions. One is squared Euclidean, which is basically the straight-line distance between two vectors. We have Manhattan, which is similar, but only allows us to move along the coordinate axes. And

(19:13):
we have cosine and dot product. Cosine uses the angle between the two vectors, and dot product uses both the angle and the magnitude. It turns out that using the angle is much better for high-dimensional vectors than using Euclidean

(19:33):
and Manhattan. That is why, in most applications, you will see either cosine or dot product being used.
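For reference, here is a minimal JavaScript sketch of those four functions. With Atlas Vector Search you never implement these yourself; you only pick a name such as "cosine" or "dotProduct" in the index definition, as shown later in the walkthrough.

    // a and b are plain arrays of numbers with the same length.
    const squaredEuclidean = (a, b) =>
      a.reduce((sum, ai, i) => sum + (ai - b[i]) ** 2, 0);

    // Manhattan: distance travelled only along the coordinate axes.
    const manhattan = (a, b) =>
      a.reduce((sum, ai, i) => sum + Math.abs(ai - b[i]), 0);

    // Dot product: combines angle and magnitude.
    const dotProduct = (a, b) =>
      a.reduce((sum, ai, i) => sum + ai * b[i], 0);

    // Cosine: the dot product normalized by both magnitudes, i.e. the angle alone.
    const cosine = (a, b) =>
      dotProduct(a, b) / (Math.sqrt(dotProduct(a, a)) * Math.sqrt(dotProduct(b, b)));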
So now, this is a little more complex, but I wanted to quickly cover how the vectors are stored for fast retrieval.
Basically when we want to execute the vector search, we

(19:57):
can have really a lot of vectors in our vector database, really a lot of embeddings inside; it can be millions and millions, or even billions. So, given one input embedding vector, we cannot go and try to compare it with all the other vectors within our database, because it would take so much time and it is also

(20:20):
computationally expensive. So what we want to do is search only approximate nearest neighbors. On the left side is a technique developed by Spotify, and its actual name is Annoy: Approximate Nearest Neighbors Oh Yeah. That is how they named

(20:41):
it. So. I didn't know about the 'Oh Yeah' piece; I knew ANN. Brilliant. And practically, how that works is: you would first divide the vector space into two subspaces; we can see this line here. And that would produce for us, in a tree structure,

(21:01):
two paths. Then you would divide each of the subsets in half again, and then again in half, until you have a meaningfully small data set. And then, when you have an input vector, you would see in which leaf subspace it fits, and then

(21:24):
you would search the nearest neighbors there. That turns out to be much faster and computationally much less expensive. On the other side we have HNSW, which does a similar thing, but with a layered structure, where, for example, this is our starting

(21:44):
point, our input. On the first layer, we will try to see what the closest vectors are for us; we figure out it is these three. Once we project these three onto the second layer, we will try to find their closest matches there, and then project again. It is kind of similar to the tree approach, but instead of

(22:07):
trees we have layers. And these two approaches are the most used ones under the hood when we are executing vector search.
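For contrast, the naive exact search that these structures avoid looks like the sketch below: one similarity computation per stored vector, so the cost grows linearly with the collection. The embedding field name is an assumption for illustration.

    // Exact (brute-force) nearest neighbors: compare the query against EVERY vector.
    // Fine for thousands of vectors; far too slow for millions or billions.
    function exactTopK(queryVector, allDocs, k, similarity) {
      return allDocs
        .map((doc) => ({ doc, score: similarity(queryVector, doc.embedding) }))
        .sort((x, y) => y.score - x.score) // highest similarity first
        .slice(0, k);
    }

ANN indexes like Annoy and HNSW trade a little accuracy to skip almost all of those comparisons.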
Perfect, perfect. So, what is semantic search now, and how does that whole process work? As we said, we have unstructured data initially as a starting point. It can be any kind of unstructured data: images, PDFs,

(22:29):
audio, video, whatever. We put that through the embedding model and we get embedding vectors as an output. Then we store these embedding vectors in our database, and after that, for each user query input, we create an embedding vector from that input and try to find the

(22:53):
most similar vectors in our database from the unstructured data that we stored before. And that basically results in us having the whole semantic search implemented.
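In code, that whole pipeline collapses to a couple of steps. A high-level sketch, where embed() is a hypothetical stand-in for whatever embedding model you choose; as stressed just below, the same model must serve both arms.

    // 1) Ingestion: embed each item once and store the vector next to the data.
    async function ingest(collection, products, embed) {
      for (const product of products) {
        product.imageEmbedding = await embed(product.image);
        await collection.insertOne(product);
      }
    }

    // 2) Query time: embed the user input with the SAME model...
    async function search(collection, userTextOrImage, embed) {
      const queryVector = await embed(userTextOrImage);
      // ...then find the nearest stored vectors (e.g. $vectorSearch in Atlas).
      return collection.aggregate([/* vector search stage goes here */]).toArray();
    }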
And it's key to point out, because I know we've missed this, or we got questions on this in previous similar talks around vector search, that your query needs to be

(23:16):
vectorized using the same embedding model as your data was originally. Yeah, the same embedding model, yeah. OK, I didn't stress that here; it is really important. So, for the practical example, this is the application that we are going to build. I found a good data set of e-commerce products on a

(23:38):
website called Kaggle. For those that don't know, Kaggle is a really good website to find any kind of data set that you would like to play with. They also organize competitions about finding the best machine learning model

(23:59):
for classification on a given data set. So I took the data set from Kaggle, and then we will use the embedding model from Google Vertex AI. And this is interesting, because I'm going to use the multimodal model, which practically allows us to search either by text or by

(24:25):
input images. Because this model is a multimodal model, it produces the embeddings for both images and text in the same semantic space and with the same dimensions. So that is really important, because whatever the input of our query is, either text or image,

(24:45):
it will be able to search in the same semantic space. Which is ideal. And that's why, yeah, you see the multimodal stuff coming to the fore now. And you know, it wasn't always the case, but it's great to see. So many things. Yes. Yes. Before we get into your demo, this has been a superb overview of everything. Actually, I'm interrupting; your tech stack is up there.

(25:06):
Talk us through that, Nenad, before we get into the demo. Yeah, I think this is the last slide, probably. So the demo is created with Node.js on the back end and Angular on the front end; that is the software being used. MongoDB is the vector database, and Vertex AI multimodal is used as the embedding model. And as I said, the data source

(25:26):
is downloaded from Kaggle, and that's it. I love that quote. Perfect. Just before we get into the demo, which I'm really excited to see, two questions really from our comments; and please throw any other questions you might have in there and we'll take them on board if we can. So, from Michael: when we had

(25:49):
the nearest-neighbor diagram up there, he asks: what if the nearest neighbor is in an adjacent part of the index, so in one of the different colours, perhaps? So that is the thing with the nearest-neighbors approach: we are not guaranteed to always get the perfect results. That is kind of the trade-

(26:09):
off: for the speed and cost, the trade-off is accuracy. These models are actually really, really good. But sometimes, if you want to fetch the first 10 results, it can happen that the 10th result is not the one that you would get if you actually executed a full scan

(26:34):
throughout the whole data set. Yeah, and I think if anybody's played with LLMs and tried to build, you know, search apps, semantic search apps, even chat apps, et cetera as well too, they're familiar with that. It's not always perfect, but as you say, it's a compromise perhaps as well too. But Michael, thank you for that question. One for me really, but a

(26:57):
compliment for you, Nenad. First of all, Yusuf was inspired by your career journey. His background is JS, and he wants to add more skills; he's looking to sustain and boost his career. Yusuf, we have MongoDB University if you want to learn more about MongoDB; it's a super resource. It's entirely free, and you can

(27:18):
go in there and take all the courses at your leisure. You can go in there and not even register and start courses, but if you register, it'll store your learning plan for you. And the only thing, if you want it at the end, is to get MongoDB certified; there is a fee for that certification. But if you're a student or a startup, et cetera as well too, we do various packages and reduced fees.

(27:38):
So from a MongoDB perspective, certainly go to our University. Nenad, where do you go to learn, generally? So, Yusuf, you said you would like to add more skills. In my opinion, well, I chose my stack about six or seven years ago, and I didn't want to learn anything

(27:59):
in addition. I wanted to be an expert in one thing, and not to know many things a little bit. So I chose Node.js, Angular, and MongoDB as my stack, and I've stuck with it until today. So I would say: just choose whatever you are most comfortable with, and then try to become an expert in that field. Just learn as much as

(28:20):
possible from various blogs, YouTube tutorials, conferences, webinars, specifications, documentation, whatever you can find; just read it, and then you will be really competitive on the market in that field. Excellent, that's a superb answer: stick tightly to your stack. We got a third one in while you were talking, with Bet Simon asking about the slide deck. I don't know, Nenad, do you put

(28:43):
that anywhere? But obviously, as I mentioned at the beginning, the recording is available afterwards, up on YouTube and LinkedIn. As well as in the description of the YouTube video afterwards. Perfect. Excellent. We'll certainly take care of that for you, Simon, as well too. So look, sorry for the interruptions. As Linus says: talk is cheap, show me the code.

(29:03):
Let's get on and see this in action. Thanks, Nenad. Sounds good, Shane. So I will just go through the server code. I also prepared the front-end code, and I will publish both together in the GitHub repo, so you can all clone it and play with it if you would like. Perfect. And you gave me that link, Nenad. So whilst you're showing this, I'm going to grab that link and

(29:26):
I'll drop it back in the comments on YouTube and LinkedIn. I want to mention that I still didn't push the code, but I will push it later. OK, the link will be the same, the link will be the same. Perfect. Now, I would start by just showing the data set that I found on Kaggle; it covers e-commerce products. There are about 44,000 products, and you can just download it

(29:50):
from Kaggle. But for your convenience, I already downloaded it and parsed it as JSON, so it is already inside the server repo for you to just take. It is here, and it's already structured.

(30:11):
data set: we would, of course, want to model our database. So we need to create a products collection, and we would like to model it. As you can see here, I mostly modelled it one-to-one with the data set from Kaggle, except that I added an additional property, which is the image embedding that we would like to

(30:34):
implement our vector search around.
Once we have that and we have our database created and set up,
I would recommend everyone to use Atlas.
We can now start with the actuallogic.
I implemented 3 functions for you here. 1 is to import data to

(30:55):
the database, one is to actuallygo and create the bettings and
1:00 is later to search the products that will be used in
the fronted up. Importing the data is really
easy, only a few lines of code. So you will just parse the big
Jason file and then just insert all the all the items in the in

(31:16):
the database. It will take around only around
10 seconds probably for all 50,000 records to be inserted.
Second part is the actual part where we will create the
bendings. But before we create the
bendings we would like to createa vector index and I would like
to show you how to create a vector index.

(31:38):
I already created one before thedemo, but basically when you
open the Atlas, you need to create a project.
I've already created one. It's called Image similarity
e-commerce. And then when everything loads,
you would like to go here to browse your collections.

(32:00):
Of course you need to create a cluster first.
And here you can see your all your collections.
You can see the products are here and there is 42,000.
You would you would want to go to address search section and
that is where the vector search should be created.

(32:21):
You can see here that I already have my vector search index
created but if you want to create your own you need to go
here create search index and it will ask you three different
things. The 1st 2 are related to address
search. The third thing is related to
the vector search. So you would like to select the

(32:43):
third option and click next. And here you need to name your
index. You can just name it, for
example index one. The fields should be like this.
The type should always be the vector.
The path should be the name of the field in your collection
where you will store the actual embeddings.

(33:06):
So in my in my case the field was called image embedding.
OK. The dimension is the dimension
that has been used by the bending model, and in our case
it is 1408. That is the dimension being used

(33:26):
by Google's Vertex AI, which is multimodal.
OK. And that that's depending which
which models you're using and and MongoDB vector search
supports up to 4096 dimensions exactly.
OK. And the last thing is for you to
choose the similarity function. As we said, there can be four of
them, but in most you can use either Cosignon or or dot

(33:50):
product. And after you finish that, you
need to select also on the left hand side on which collection
you would like this search to becreated.
And then you click next and it will create the the index for
you. I'm not going going to do that
because I already created the Enix before and after we have
the tracker search created, thenwe can continue with our actual

(34:14):
code. This function create embeddings
so you can execute it by just calling.
This endpoint which is exposed to you will do the following.
So it will fetch all the products from our database and
remember that we already imported all the products in our
database that already doesn't have the embedding added to it

(34:37):
and it will project only the image back.
We are projecting only the imagebecause we want to create the
embedding of that image. So the semantic meaning is
captured of what that product actually is.
OK. That's a really good trick in so
far as you're, you're only looking for products that don't

(34:57):
already have the image embeddingfield.
So does that help, say if that data set got updated with more
products in the future and you could set a trigger, for
example, to go back through and find the ones that haven't had
embeddings added. Yep, OK, Yep, perfect.
So here I specify the batch sizebecause we have about 50,000

(35:18):
products it would take a long 9 to maybe about one hour for all
the products to be processed. Because for each product we need
to call Google API to get the pending storm and to update the
update the document inside the mongo DB.
So I I implemented here a batch processing which will basically
create the batches of 50 products and processes the 50 by

(35:42):
50. Depending how powerful your
machine is you can create you can increase this and reduce the
time necessary for everything tofinish.
Yeah, I was going to ask why youselected 50, but it's just from
experience yourself. Basically, yeah, yeah, yeah.
I hit OK. So now we are processing all the

(36:02):
batches here and first thing forus is to fetch the actual image
from the data source and we wantto fetch it as an array buffer
and and now we want to create base 64 representation of this
image because that is the formatthat Google will accept when we
would ask Google for the embeddings.

(36:25):
And now we just construct the prompt, we specify the publisher
and the model which is necessaryfor Google typings to work.
And here we are calling the Google the Google Google API to
actually get us back the empathies.
You can see here prediction service client.

(36:46):
Let me go up in this code, it isredefined above in the start of
the code that is basically up Google official Google package
that we can use to help us do things faster.
Ideal. Perfect.
Google will return us back the predictions and inside the
predictions we will get our actual embeddings.

(37:08):
Since Google is returning like adeeply nested structure, we
actually have to access nested fields to get the extra
appendings. But after we do that we have the
final embeddings for this particular image and we want to
update the product with it. So we are updating the image and
banding field which is the fieldthat we specified when we were

(37:32):
creating the vector index in advanced.
And after all the products has been processed then we are ready
to execute the actual and searchthe our actual products
semantically. The third the the third endpoint
is created for that to search the products.

(37:52):
The we have this here a middleware that will part the
files if file is uploaded. So we can as we said we can
query both by text and by imagesbecause this is a multi model
model and the search is kind of easy.
So we have a a text query potentially or the file.

(38:14):
One of these two has to be present and we will construct
the prompt. If the text is passed to the
back end we will construct the text prompt and if not we will
construct the image prompt. And we will just call Google to
create the vector embedding of our input prompt.

(38:37):
And depending of if our input was text or image, the Google
will return it in a different field.
So it will be either in the textembedding or in the image
embedding. But as we said this multi model
will in this in this image embending and text embending bot

(38:57):
will be in the same semantic space by using this multi model
model. So bot will work the same and
here we are constructing the actual query to for our mongo
DB. The first the first stage in our
aggregation pipeline will be thevector search.

(39:17):
Then we say that we don't want exact match As we said we want
to use the nearest neighbors approach because it is much
faster and the accuracy not is probably not 100% but is it is
near 100%. Sure, sure.
Index is the name of the index to being used.
We specify how many items we want to be returned back the

(39:42):
path to our field inside the collection.
Number of candidates is the field that will specify how many
nearest neighbors the the very we will consider when returning
on the of 15. It is it is a good practice to
for this number of candidates tobe about 10 to 13 times larger

(40:05):
than the actual actual limit. And then in the end we are
passing our search embedding. So input embedding of our input
as a query vector. And this first stage will return
us back 15 documents. We don't need to return all the
fields to the front end. For the demonstration purposes.

(40:27):
We want to project only the nameand the image because we don't
want additional networking from front end to back end.
That makes sense from back end to.
Front end and that is it. That is basically the whole
business logic. So it is really easy to
implement something this powerful.
So as we can see here in one 150lines of code, everything is

(40:47):
done. Brilliant.
Yeah, super clear. So we took our data set, we
created the embeddings by sending them to the embedding
model. This instance it's Google, but
it could be any. And obviously you're free to
choose. And, you know, Atlas Vector
Search doesn't care what embedding model.
You just define it, what it is. And then when you do that and

(41:08):
you created the embeddings, you obviously have to create your
queries and your queries need togo through the same model.
And then we're going to see thisall in action now, right?
Exactly, yes. So this is a server code.
I also have a fronted code whichis here, but I will not go
through it. You can, I will, I will.
It's a simple, really simple Angular app with one component.

(41:30):
But I will as I said, the gitcaprepo will contain both.
That's it. And I threw up the links
already. And obviously as you said
earlier, Nina, once you push it and that'll all appear up there
as well too. And I know you said you didn't
want to go through the front endstuff, but do appreciate the the
branding that you've put on thisour brand police here in Mongo
DB, we'd be super happy. Appreciate that.

(41:52):
Yes. So this is basically the simple
front end up to demonstrate whatwe have built now.
This is now built in my local board.
The front end and back end up live.
So this is like a live demo. So we have in our Atlas about
40,000 products. It's all e-commerce products, so
various T-shirts, shoes, belts, watches or whatever.

(42:15):
So let us start by like searching first by text and then
we can switch or search by imageso we can start to.
Atlas there for a second just tomaybe maybe it's loaded, but if
you could pick one of the collections or one of the
documents and just show the embeddings in there for that
product just you know for peopleto get familiar, we show the

(42:36):
numbers in your presentation etcetera, etcetera.
But when it loads I will use compass is either it's already
there. That's perfect so.
Compass is already loaded here so we can see that all the all
the documents. Now we have the image embedded
populated and it is like an array of 1408 numbers which are

(43:00):
afloat. So that is basically embedding
vector for us. Perfect.
Yes, and and don't worry, nobody's supposed to understand
them or be able to read them or make sense of them.
But the embedding models. Do so that and that is why you
usually want to like in my code.This is why I added the project
step because I don't want the whole the whole document to be

(43:23):
returned because it would also include this image abanding.
So there will be additional 1000numbers passed from back into
front end, which will increase the latency.
So that is why it's always a good approach to just filter
what you actually need for the for the actual front end up.
And let us start by doing something in our front end up.

(43:46):
So let us, for example, search by T-shirt and we can see all
the T-shirts here. So we can add T-shirt for women
and we will now see all the T-shirts for women because like
semantically it recognized what are the women in the images.
We can also add for example pink.

(44:10):
So it should return all the T-shirts that with pink colors
and you can even go further likewith cartoon character.
So it is smart enough to actually add that as well so we
can. Also, and it's super fast.
I mean I know it's running locally like but you still have

(44:30):
40,000 plus items right? Yeah, exactly.
It's really fun. Wow.
I mean I mean the server locally, but the mongo DB Atlas
is being used as a database thatis not locally so it is super
fast. Sorry.
Yes, Yes, yes, yes. Yep, Yep.
So. Even better then SO.
You can see that actually works really, really good and the

(44:50):
semantic meaning is really captured properly.
We can also go with like belt for man, we can add like colour
and we can see that it works really, really good.
If anyone wants to make sure that this is not a canned demo,
and Neenat hasn't tested all of these searches himself already

(45:13):
to know that they're good, pop asearch in a comment.
Let's have a look. Exactly.
We can also try image search. I I downloaded a few images from
the Internet. So for example, we can upload
the image of the watch, we can search and we can see the actual
watches are returned, but not only the watches itself, but it

(45:34):
recognized that the watch is kind of silver.
So it returns us all the silver watches.
So I have another one for example, like this AT shirt,
like it recognized the Mickey Mouse, Mickey Mouse and
everything. So basically it works really,
really good. And it took us only like how

(45:57):
much maybe half an hour to covereverything.
So anybody can build now really amazing applications really
quickly. And this is much more powerful
than what we could do before vector searches.
We could only search by like keyword, like similarity or
something or exact match or maybe something like that with
elastic search or this is much more powerful and much better

(46:20):
user experience: users can actually describe with text what they want. They don't have to go through various categories and filters and navigate the website to find what they want; they can just type, or upload an image of something similar that they managed to find, and they get the most relevant matches.

(46:42):
Yeah, and semantic search makes things so much simpler. In the past, to do something like this, you would be, you know, making sure certain words appear in the product description, for example, or you'd be adding extra keywords, or you'd be adding extra filters in there as well too. And the other side of this, I suppose, is that

(47:06):
we could do the reverse as well too. In other words, you could take an image, upload it, and ask: can I have a text description of this image, please? That could then be used to add text into the description if you weren't doing semantic image search as well too. So there are so many variations of

(47:28):
how you could go about this, but it's been a superbly clear, easy demo to follow, Nenad, so it's been great. And as I said, if anyone else wants to make sure that this is proper and true, please pop an appropriate search into the comments and we'll use it. If it's inappropriate, it ain't getting on the screen. So, perfect. Anything else to add, Nenad? I

(47:50):
know that you worked on this and pulled it together in really quick time, and you showed us, you know, a couple of files, 150 lines of code here and there, then the UI obviously as well too. You know, you're an experienced developer, but for somebody looking to try this out with your repo, for example, as a basis, they could easily choose a different

(48:11):
data set and spin this up themselves, right? Exactly. I will document everything in the GitHub repo, in the README file. Actually, getting the repo up and running is really simple: just navigate to it and execute two commands, npm install to install the packages and npm start to start the server. So that's it.

(48:32):
But Zhao took us up on our challenge: colourful socks for men, Nenad. Let's see. How do you spell this? Colourful, C-O-L-O, well, in American English

(48:52):
it's just an O (I wish there was a U in there), but don't worry about that: 'colorful socks for men', S-O-C-K-S. So there you go, and you got boxes of socks and everything. Brilliant, brilliant, excellent. And we got another one, where the

(49:13):
name didn't come through into StreamYard, the platform that we're using, so I do apologize: 'T-shirt with the green leaf'. I've got a sneaking suspicion this person works at MongoDB, perhaps? Yeah, I don't know if the data set contains this, but, oh, it does. Kinda. Yeah, maybe not exact, but look, it does have other green T-shirts,

(49:34):
there are other green logos; it's getting it. So, yeah, there you go. Nice one. See, so it's not a canned demo; it's a fully live demo, Nenad. It's stood up really, really well. And we did have one question, about the accuracy. This is a question from Gabrielle: how is the percentage of

(49:55):
accuracy? How would you address that? Because it's not always about being accurate; when we come to semantics, it's about the meaning. It's not like matching 100 percent, right? Exactly. So, the accuracy is hard to actually figure out here. I don't know if you mean

(50:15):
the accuracy of what the output should be, or how much accuracy is lost by using nearest neighbors instead of actually going through the whole data set, because that makes more sense than 'what is the percentage of accuracy': there is no real notion of accuracy in this approach. Like, nobody can say that this

(50:35):
T-shirt is more accurate than this one, right? True, true. I suppose, Gabrielle, if you have anything to add to that, do drop it in there. We still have a few more minutes left on our live stream. But yeah, I think a lot of these things are a little bit of trial and error. Build it; as you can see, building something quickly is easy.

(50:57):
And also, you know, and we know this, some embedding models are better at certain tasks than other embedding models. So it's really about choosing the appropriate embedding model for the application that you're trying to build as well too. One more challenge for your search, which was 'electric

(51:20):
guitar', Nenad. I don't think the data set contains that, but let us try. No, but it might be on a T-shirt, right? Maybe. Let's have a look. Yeah, there you go. No guitar, but it found it in the T-shirt. Found it on the T-shirt. Excellent. The others are a bit funny, all right, but yeah, yeah. So here, that goes back to the.

(51:42):
That is because I didn't, so guys, you can also add an additional stage after the actual search. I am always returning 15 results, but you can also filter out the results later, based on the vector score that is generated for the similarity.

(52:02):
So, for each returned document, you will get a similarity score that you can use as a filter in the next stage. So maybe, obviously, the guitar would pass that filter, but everything else would not, in this case. Yeah.
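In Atlas, that extra stage reads the score with $meta. A sketch of the stages you would append after $vectorSearch; the 0.8 threshold is an arbitrary example value.

    const pipeline = [
      { $vectorSearch: { /* ...same stage as in the search endpoint... */ } },
      // Expose each document's similarity score...
      { $addFields: { score: { $meta: 'vectorSearchScore' } } },
      // ...then keep only sufficiently close matches.
      { $match: { score: { $gte: 0.8 } } },
      { $project: { name: 1, image: 1, score: 1 } },
    ];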
You see, that's something you do see quite a lot in generative AI apps. They'll have a thumbs up, thumbs down, you know, because you're trying to get that reinforcement

(52:24):
learning going on too, in terms of results, right? Perfect, perfect. Nenad, this has been super clear. And, you know, to go through from the theory all the way into the code, and all the way into a demo that held up for this quite discerning audience that we have today: it's been brilliant. Anything else to add, as we wrap up and close out, in terms of

(52:47):
your own thoughts about, you know, how easy this was to build, and what you might add to it in the future, perhaps? Yeah, so, I mean, it took me one day to build the whole thing. The hardest part is just going through the documentation, especially Google's. Google has always had confusing documentation, for me, spread across multiple pages, where you need to go here and

(53:09):
there to just fit all the pieces together. But other than that, everything was really easy. Basically, the only thing you need is data, so finding a cool data set that you can play with, that is actually important. That is why I always suggest going to Kaggle; Kaggle is a really good website to find that.

(53:32):
Excellent. And Jad, thank you for your lovely comments as well too. We do appreciate everybody who joins us on these live streams. As I said in the introduction, we try to do them regularly; my show, this podcast live essentially, though there are other hosts as well too, is every Tuesday. So do keep an eye on that. And before we wrap up, a little plug back for the MUG in Serbia. You think it's going to be in

(53:55):
February, sometime in February, you're going to pull that together? Yes, probably in February. And I think my friend Marko, who is the author of a MongoDB book, will be the speaker. So I would like again to invite everybody who is close by to join. Perfect. And to find out about that and other MUGs, depending on what

(54:17):
area of the world you are in, just go to community.mongodb.com, our site. There you'll find our MUGs, and you'll find our forums. You know, that's where Nenad said he started answering questions, after he built up his experience on Stack Overflow and moved across to the forums to say: well, look, I'm going to concentrate on these MongoDB answers, and everything else that goes on there as well too.

(54:39):
And if you liked what Nenad showed you: as we mentioned in his introduction, he has some articles up on our Developer Center, as do the rest of MongoDB, our developer relations team, our product engineers, et cetera, as well too. So if you want to learn more, the simplest place to go to is developer.mongodb.com, and you'll find everything. You can use a filter in there to look for generative AI.

(55:01):
You can look for the tech stacks: if you're Node, if you're, you know, whatever you want to be, jump in there and filter by those as well too. So those are the two good links that we'd like you to take away from today's show: community.mongodb.com for everything to do with our MUGs and our meetups and everything we're doing out in our developer community, which is superb, and also developer.mongodb.com

(55:25):
to learn more. Nenad, any final words for our audience today? And I just have to say thank you so much; you made my job super easy with all the effort and work that you put into this stream for me. Yeah, I also just wanted to say thank you to everybody that was watching us, and I hope that you liked it. Perfect.

(55:47):
Well, look, as we posted earlier in the comments, and they will stay put: the repo that Nenad will push to later will have everything, so you can go in and play around with it yourself. I'm assuming you're looking for any comments, any PRs there as well; if people want to improve on what you have as well too, you'll happily take those on board, Nenad? Of

(56:07):
course, it will be public, people, so anybody can contribute. Excellent, excellent. Well, listen, that's been great. It's been a pleasure to have you on board the live stream. I know you've joined us in person before; I'm sure you'll be back again in the future, Nenad. So this has been a pleasure to have you on board. And to all our viewers who join us on LinkedIn and YouTube, thank you so much. And somebody has asked: is the

(56:30):
recording available as soon as this stream ends? Well, give or take a couple of minutes, you can go back into LinkedIn and go into the Events tab, and you'll see past shows and can watch this again. Or you can go up to MongoDB's YouTube; it'll appear there as well too. And I do believe, as of recently, we've got a MongoDB Podcast Live playlist up there as well too.

(56:52):
So this, and everything else we've done over the last nearly two years now, should be available up there as well too. So that's the end of the plugs. Nenad, thank you so much for all the effort and for how clear you made everything; I really appreciate it. I hope you enjoyed it. I enjoyed it. Thank you for the invitation.

(57:12):
Excellent. So, from me, Shane McAllister, in MongoDB's office here in Dublin, Ireland, to Nenad in Serbia, and to all our viewers all around: thank you so much for joining us. Keep an eye out for the next streams that are coming your way, for more massively entertaining and informative content such as this. Nenad, thank you so much; it's been a pleasure. Take care, everybody. Bye bye.