Episode Transcript
Available transcripts are automatically generated. Complete accuracy is not guaranteed.
(00:00):
My special guest is Ben Gutkovich, and hopefully I'm pronouncing your surname
reasonably accurately.
Co-founder of Superlinked, a company revolutionizing the way organizations build
machine learning powered software.
So Superlinked excels in transforming complex data,
(00:21):
ranging from text to video, into what are called vector embeddings,
which hopefully Ben will explain a little bit more about what that means later on.
But that's the core of AI data processing at an unprecedented scale.
With a $9.5 million seed round led by Index Ventures and Theory Ventures,
(00:45):
Superlinked is poised to meet rising market demands and expand its product capabilities.
So we'll explore Ben's journey, the evolution of Superlinked and insights into the future of AI.
So, Ben, welcome to today's show.
Thank you, Gary. Great to be here.
Great to have you. Can you elaborate on the primary value proposition of Superlinked?
(01:12):
What problems does your product solve for your users?
Absolutely.
So we are a platform for machine learning engineers, data scientists at the
moment with the vision to support software developers, even in building those
applications in the future.
At the moment, the goal of our platform is to help those data scientists and
(01:38):
machine learning engineers to launch GenAI-powered applications.
Not only those, although with all the hype the focus now is on generative AI,
and to launch those applications to production quickly while retaining control
over the data they send to those generative AI models and control over the quality of the results,
(01:59):
improving that quality based on their understanding of the business and the users.
Maybe just to elaborate a little bit on the vector embeddings.
So vector embeddings are basically the language of generative AI models, large language models.
Everyone is familiar with GPT from OpenAI and other models from Anthropic and
(02:20):
from Mistral and other companies which have been in the news.
The way they work, and the way they work fast as well, if you've experienced
ChatGPT, is that they turn all the data, all the text, into vector embeddings.
And what is a vector embedding?
It's basically a representation of data as a string of floating point numbers.
(02:41):
Now, computers work really well with floating point numbers,
much better than with keywords, which, you know, traditional search is based on.
And so those applications are able to retrieve the right answer very quickly
using algorithms such as approximate nearest neighbor search,
which is basically an algorithm to find the closest vectors in a large vector space,
(03:04):
multidimensional vector space.
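To make the idea concrete, here is a minimal Python sketch of nearest neighbor search over embeddings. It is an illustration only, not something from the conversation or from Superlinked's code: the three-dimensional vectors, the product names, and the helper functions `cosine_similarity` and `nearest` are all made up for the example. It also does an exact brute-force scan, whereas approximate nearest neighbor search uses an index to avoid comparing the query against every stored vector.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query, index, k=2):
    """Return the names of the k stored vectors most similar to the query.

    This scans every vector (exact search). Approximate methods trade a
    little accuracy for speed by searching only part of the space.
    """
    scored = [(cosine_similarity(query, vec), name) for name, vec in index.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]


# Hypothetical 3-dimensional embeddings; real models emit hundreds or
# thousands of floating point numbers per item.
index = {
    "red running shoes": [0.9, 0.1, 0.0],
    "blue trail sneakers": [0.8, 0.3, 0.1],
    "cast iron skillet": [0.0, 0.2, 0.9],
}

query = [1.0, 0.1, 0.0]  # stand-in for the embedding of a search phrase
top = nearest(query, index, k=2)
print(top)
```

The two shoe products come back ahead of the skillet because their vectors point in nearly the same direction as the query, which is the sense in which similar items sit close together in the vector space.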
So where we come in is that we help turn data which is unstructured, using those large language models,
but also structured data, which the enterprise really relies on in their use
cases and their business case, into the same vector embedding,
(03:24):
helping them actually control what goes into the vector embedding and retrieve
the information based on their use case. I can jump into an example.
So if we speak about an e-commerce website, every product has an image,
which could be considered unstructured data, a description, again,
paragraph of text, which is also unstructured data.
(03:46):
And for that, they can use large language models to turn this data into vector embeddings.
However, for each of those products, there is also a price, which is structured
data. There is a level of discount.
There is a brand, a category, a season, perhaps.
Some data which is not even visible to the user, like the margin they make on
(04:07):
the product, like the stock levels of that product.
And they might want to take all this data into account when recommending the
right products to the right users.
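One way to picture the e-commerce example is the Python sketch below, which folds a text embedding, a price, and a category into a single vector per product. This is a hedged illustration under stated assumptions and not Superlinked's actual method or API: the function names (`encode_price`, `encode_category`, `embed_product`), the category list, the price range, the weights, and the tiny text vectors are all hypothetical.

```python
import math

# Hypothetical category vocabulary for the example
CATEGORIES = ["shoes", "kitchen", "outdoor"]


def normalize(vec):
    """Scale a vector to unit length so no part dominates by magnitude alone."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def encode_price(price, lowest=0.0, highest=200.0):
    """Map a price onto a quarter circle: nearby prices get nearby unit vectors."""
    clipped = min(max(price, lowest), highest)
    angle = (clipped - lowest) / (highest - lowest) * math.pi / 2
    return [math.cos(angle), math.sin(angle)]


def encode_category(category):
    """One-hot encode the category against the fixed vocabulary."""
    return [1.0 if category == c else 0.0 for c in CATEGORIES]


def embed_product(text_vec, price, category, weights):
    """Concatenate the weighted parts into one vector per product."""
    parts = [
        (weights["text"], normalize(text_vec)),
        (weights["price"], encode_price(price)),
        (weights["category"], encode_category(category)),
    ]
    return [w * x for w, part in parts for x in part]


def dot(a, b):
    """Dot product, used here as the similarity score."""
    return sum(x * y for x, y in zip(a, b))


# Assumed weights: text matters most, price and category somewhat less
weights = {"text": 1.0, "price": 0.5, "category": 0.5}

# The text vectors are stand-ins for the output of a real embedding model
cheap_shoe = embed_product([0.9, 0.1, 0.0], 40.0, "shoes", weights)
pricey_shoe = embed_product([0.9, 0.1, 0.0], 180.0, "shoes", weights)
skillet = embed_product([0.0, 0.2, 0.9], 45.0, "kitchen", weights)

query = embed_product([0.9, 0.1, 0.0], 50.0, "shoes", weights)
ranked_as_expected = dot(query, cheap_shoe) > dot(query, pricey_shoe) > dot(query, skillet)
print(ranked_as_expected)
```

With these assumed weights, the shoe priced close to the query ranks above the expensive shoe, and both rank above the similarly priced skillet. Changing the weights shifts how much each kind of data influences the ranking, which is the kind of control described in the example above.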
Now, can you walk me through your product's evolution from a full-stack,
real-time recommendation or personalization solution to more of a horizontal platform?
(04:30):
And how did the launch of ChatGPT influence or assist in this transition?
Absolutely. So we started and, you know, a lot of that goes to my co-founder,
Daniel, who worked as a tech lead on YouTube, working on ad prediction models, right?
So he was helping advertisers on YouTube understand how the ads will perform, right?
(04:56):
How many users will see them, how many will click on them, etc.
And those models back in, you know, six,
seven, eight years ago, they've been using vector embeddings already at Google,
and they had the whole infrastructure and tooling to use those vector embeddings
in this and other context as well.
Google Search has been powered by vector embeddings since 2017 already, right?
(05:20):
So that's something that the big tech companies have been using for a while.
Now, coming out of Google, starting to work on Superlinked, building our recommender system,
Daniel brought this understanding or actually understood that those tools that
Google has been using internally are not really widely available to the public
and definitely not to enterprise clients with limited technical capabilities.
(05:45):
You know, the ones that don't employ a thousand machine learning engineers to
actually use in their products and make their experience for the users much more attractive.
And so we started from building a recommender system using those techniques
and started working with clients to implement it in production.
Now, that was still before ChatGPT actually came out. That was in 2022.
(06:09):
Once ChatGPT came out, the focus and the hype around large language models and
generative AI, of course, exploded, right?
To the level of me speaking with my hairdresser about, you know,
ChatGPT and how they use it in advertising, et cetera, right?
So everyone is pretty much familiar. It is the application that achieved 100
(06:33):
million signed-up users, the fastest in history, right? I think in just a couple of months.
And with that in mind, we actually learned that people are trying to build those
systems, but really struggling.
So trying to build recommender systems, semantic search, analytics systems based
(06:54):
on large language models, while taking into account the other data,
which is crucial to provide high quality results.
And we learned that actually a problem that we solved in our full stack recommender
system is a big problem that many enterprises actually struggle with.
And then talking to investors and talking to executives at large organizations
(07:17):
that have been struggling with this challenge,
we decided to go on unbundling that solution so that we don't provide a full
stack solution for the client where they might already have some pieces of the stack.
For example, our solution had its own database built in, while an enterprise
client might have another database they want to use because they already have
(07:41):
the contract, because they already have their data in that database.
So to do that, we unbundled the product,
expanded its capabilities in terms of serving not only the recommender system use
case, but also a bunch of others,
and made sure that we can ingest data from multiple sources and store data in
multiple types of databases based on the use case and the desires of the client.
(08:07):
Got it. Now, you recently raised
a significant funding round led by Index Ventures and Theory Ventures.
What did you and Daniel do to meet the right investors at the right time?
Absolutely. So the timing was right, as we discussed earlier,
because of the launch of ChatGPT and the hype around LLMs.
(08:29):
And us actually building a system based on those principles,
which others were trying to use to build their own solutions, right?
And then I think one aspect that helped us, we actually raised a pre-seed round
in 2022 from investors in the UK, MMC Ventures, Episode 1, and Concept Ventures.
(08:52):
And we still had quite a bit of runway, and we were not in a rush to raise.
But our investors, especially from MMC,
encouraged us to go and meet investors in the Bay Area, based on their experience
of those meetings and based also on the kind of funding that's been happening
around large language models and generative AI at the time.
(09:13):
And so we went there, actually Daniel went there.
I was in the middle of integration with a relatively large client of our system.
And so the first thing was to make some introductions, right? And those introductions mainly came from
the investors. And then the idea was, okay, you get
the first meeting, you impress the person you speak with,
(09:36):
you ask for the next introduction. So this was this kind
of snowball of introductions, if you like. As long as you make a good impression
and the vision that you outline makes sense to the investor, you
can get other introductions. And that was basically the way we got to Tomasz Tunguz from Theory Ventures.
(09:58):
Now, Tom had been a managing director at Redpoint, also a VC fund, for 14 years,
and then set up his own fund, which he called Theory Ventures in 2022.
He's also well respected in the technical circles in the Bay Area.
And when Daniel met him, he introduced him to a Stanford professor and several
CTOs of those large enterprise companies
(10:20):
which were struggling with a similar type of problem. Once
they said, yeah, this is a problem that we have and
this is a complex problem to solve, he was on board with the investment. And then
from there on it was very, very easy. Pretty much once he said that he was interested,
we started receiving calls from pretty much everyone who wanted to get into
(10:41):
the round.
And at that point, was Daniel located on the West Coast,
or was he just there for a short trip?
So he went just for a short trip. The goal was, let's chat to some investors,
build relationships, let's chat to some companies, potentially get them as clients,
not really to fundraise.
And I think that helped us as well in terms of mindset.
(11:04):
But he went there for two weeks and eventually as things started to snowball,
he extended by another two weeks.
And then after we actually raised a round, we decided to move the company to the US.
Okay. And you've scaled your team in Hungary and Israel. You're based in London.
Daniel's based in Silicon Valley.
(11:25):
Why have you chosen those disparate geos for scaling?
And perhaps just as importantly, how are you maintaining alignment and cohesion
within your team working on such advanced technologies over so many different time zones?
(11:45):
Absolutely. So look, we started during the pandemic, right?
And that was the era of let's get the best talent wherever they are, right?
And that's how we started working with developers in Hungary.
When we were looking for our first developers, we actually discovered Hungary
to be one of those undiscovered gems of Eastern Europe, right?
(12:07):
I mean, obviously, you know, Poland, Russia, Ukraine, Russia maybe used to be,
but Ukraine is still quite a strong hub for technical talent.
Hungary doesn't actually feature that much, but we were able to find great talent there.
Actually, our first developer is now our engineering manager in Hungary.
(12:28):
So that was a discovery. And once we understood that we are going to build a
deep tech company, and that actually happened with the recent round,
we decided that we want to go for a hub type of approach.
So people are co-located and can collaborate, can brainstorm,
can get together in front of a whiteboard
(12:49):
and can actually get to a solution, which, you know, is possible remotely, but
it's harder. So we decided to
go for the hub strategy, and then obviously strengthening the Hungarian hub,
but also opening the Israeli hub for a particular type of talent there. There is
a certain type of machine learning engineer who is also a software developer,
(13:12):
which we struggled to find a lot of in Hungary.
And so, me being originally from Israel, we decided to open Israel as a second hub.
Now, the way we manage it is that London is our business and operations hub.
So salespeople, marketing people, operations, we are hiring in London.
(13:34):
Obviously, London is a great place for talent as well.
Hungary became the core engineering team.
So they're building the infrastructure or the platform which is generalized for all clients.
And Israel is more around services and building kind of very complex, we call it spaces.
(13:57):
So for each of the different data types that I mentioned before that you might
want to vectorize, we actually build a space which is a model from small to
relatively complex that actually converts that type of data into vector embeddings.
And so Israel, given the level of talent we are hiring there,
is going to be that kind of research hub, but also client services,
(14:19):
supporting clients, also given that a lot of companies from Silicon Valley have
technical representations in Israel and R&D offices.
And in terms of how do we manage this whole remote team, it's not easy, I must say.
But one of the things we do is make sure that we write everything down.
So we write a lot of documents, we use Notion as our knowledge base,
(14:42):
and everything has to be documented so that anyone in any hub can go and check
and make sure they understand what has been discussed and what has been decided.
Now, the hubs make sense in terms of building out skills and culture in each location.
(15:02):
And I get, you know, Notion and Zoom can make a big difference.
But how are you building cultural alignment across the whole business?
You can't do that just with hubs and just with Notion and Zoom.
Absolutely. So one of the things we've been doing from the start is actually
getting everyone together.
And now that the team is bigger, it's obviously getting a bit harder,
(15:26):
but every four or five months, we gather together in one location,
we spend somewhere between four and six days together, working,
discussing, talking about culture.
We've done our latest get together in April in Hungary.
So, we get everyone together, we talk about culture, we kind of restart the
(15:48):
relationship, if you like, or re-engage, re-energize those relationships,
and then continue and work somewhat independently each in our own hub.
So that's been very important. And we make sure to talk about those things.
Yes, Zoom, but that's also something.
Sure. What are the key components of educating your audience?
(16:12):
It's a highly technical, highly sophisticated audience.
And how does VectorHub play into that strategy?
Absolutely. So from the start, we raised a round towards the end of 2023,
towards the end of last year.
And the first thing we did when we knew that we were going into that more kind
(16:33):
of hypey and maybe somewhat chaotic large language model and generative AI space,
we decided that we want to organize the knowledge around the usage of vector
embeddings, the usage of large language models,
how do you build AI apps or generative AI apps?
And we set up an educational resource and a community around this of basically
(16:58):
kind of shaping the way or communicating the way we think to the broader audience.
I mean, because of all the hype, everyone is now an expert on generative AI.
Everyone is an expert in LLMs. There is a lot of content. There are a lot of approaches.
And so we want to make sure that the way we are thinking about this challenge
(17:20):
and this problem is communicated in a structured and clear way,
first of all, to our audience or to our clients, but also more broadly.
And the way we decided to do it is instead of us writing all the content,
actually partner with influencers in the space globally.
First of all, as a test, can we convince them that our approach makes sense? And I think we did.
(17:43):
And then enabling and supporting them with actually creating content around
some of those main challenges that people in that space, technical,
sophisticated people deal with at the moment.
And are you doing anything else to
kind of build a wider community around Superlinked and around VectorHub?
(18:04):
Absolutely. So I think one thing that really helps is our open source.
So in March, we decided to open source our basic product.
And the goal was, again, to build that community, to make sure that people understand
that we're not selling them some kind of snake oil, which there is a lot of
(18:24):
these days, right? So people understand what we build, how we build it, and why it makes sense.
So we are building that community around the open source. We are building the
community around VectorHub.
And we are, through the influencers, engaging their audiences around the topic as well.
(18:45):
And so we're still not running a full community because we think this is something
that you want to do right. And that requires a lot of dedication and someone
who is actually dedicated to running and engaging that community.
Otherwise, those communities don't really survive and thrive.
And so we're still not running a fully open community for everyone,
(19:08):
but that's something that we definitely have as a plan.
But we run a community for our contributors to VectorHub, which they really
value as well, and which helps us also communicate our roadmap and our vision
and through them communicating to the broader audience.
And of course, engage with the people that try out and play or maybe have questions
(19:32):
around our open source solution.
So you have your influencers, you have your open source solution,
you have the community that you're beginning to build.
Another part of your strategy, as I understand it, is to build out strategic partnerships.
So can you discuss some of the partnerships you've formed and how that fits
(19:56):
into the large ecosystem of AI and machine learning tools?
Of course. And, you know, what we did between those two rounds,
the pre-seed and the seed round,
we actually unbundled our solution, which means that actually we became a relatively
narrow part of the modern data stack, right?
(20:16):
We are that vector compute layer that helps organizations turn complex data
into a vector embedding, whether it's structured or unstructured.
However, once you create the vector, you first of all need to store it somewhere, right?
And we've worked with Redis for the past three years. Redis is one of those enterprise-grade
databases that also has a vector search solution. So we've been working with
(20:39):
them for about three years already
on our recommender system, and so naturally
they became one of our first partners. We also partnered with
MongoDB, which is another well-known, well-respected
database with a vector search solution. So that's for storing the vectors. We
also work on the ingestion of the data, so how do we bring the data in from multiple
(21:02):
types of sources and multiple types of other databases. And so we work with Starburst
on the data ingestion side.
They're a kind of abstraction layer. So they can take data from all the different
sources and they can send it to
the Superlinked platform for vectorization and storing in the vector database.
And then Dataiku is another partner who is actually working with a lot of our clients, right?
(21:27):
So they are this kind of platform for data scientists. And so we want to make
sure that data scientists working with Dataiku are able to use Superlinked as
well as one of the models to help them turn data into vectors.
Of course, we also collaborate with many other vector databases,
but those are the four partners that we've prioritized to start with,
(21:49):
also based on the relationship and also based on the type of broad appeal they
have in the industry.
And let's go on a journey.
Let's imagine that it's May, June 2027.
So three years from now, what will Superlinked have become?
What will you and the team have achieved in those three years?
(22:12):
Absolutely. So what
we want to become, or what our goal is:
first of all, our vision is that vector embeddings,
not only for generative AI applications but rather for
pretty much, well, maybe not everything, but a lot of things you do with machine
learning and AI, vector embeddings are going to be the common language of that.
(22:37):
And the reason, again: there are a lot of vector-database-based solutions in
the space, there are a lot of models you might want to bring,
and computers work better with numbers, right?
We believe that vector embeddings will become that common language of machine learning.
And we want to become the standard for turning data into those vector embeddings, right?
(22:59):
So whatever your use case is, and we currently focus on four use cases,
but of course, there are dozens of other use cases for using vector embeddings.
Whatever the use case is,
we want it to be as simple as clicking a button or marking a checkbox to turn your data into vectors
through Superlinked, integrated with databases, integrated with data science platforms
(23:24):
like Dataiku, etc. So that's our goal: anyone who is turning data into vectors is using Superlinked.
And who are the three people, the three entrepreneurs or technology innovators
who've most inspired you to be who you are?
It's a great question. I think Sam Altman will probably be one.
(23:48):
I mean, with him, it's just the clarity of the vision.
I attended a session with him around 2014 or 2015 when he was talking about
those micro nuclear reactors.
Which got me really, yeah, really excited.
(24:09):
And I was like, wow, this is like really long term visionary thinking.
And, you know, we now see it with the things he does with OpenAI.
That's definitely
an interesting person that I follow, and I
listen to the podcasts, and it's always super, super interesting,
the kind of thinking he has, the long-term thinking.
(24:31):
I would say, you know, Elon Musk.
He has some side or something. They're not exactly best friends today, but
carry on.
Yeah, yeah, no, exactly,
exactly. But, you know, I read his biography,
and I mean, just, you know,
the vision and the passion and the resilience he showed
in his early career, and how he kind of went all in on SpaceX when they failed
(24:57):
and failed and failed, time after time, with their launches, and eventually made
it work and made it into the big and valuable company that it is, right? We'll
not talk about what he did to Twitter, which I think maybe is not so great,
but I think, you know, early on, definitely one of those figures that was inspiring for me.
(25:17):
And, you know, maybe a bit trivial, but, you know, Steve Jobs,
right? Again, this kind of story of
passion and execution and kind of going through obstacles, again more
in the early career, is something that's definitely inspiring. And I think this focus on design,
I think I see it a lot in my co-founder as well. I mean, some of those combinations.
(25:41):
One of the reasons that we started working together back in 2021, right, is that
he has this kind of vision and idea of what the future will look like,
and just paints the picture in such detail that you can really believe in it.
And the second is this whole attention to detail, attention to design,
(26:02):
making sure that things are really well positioned, both in the wording
and in a graphical way, which is visually appealing.
And talking of your co-founder, how
are you and Daniel dividing up your responsibilities?
Because you mentioned earlier
on the hubs in Israel and Hungary, what they're responsible for. You walked
(26:26):
me through what happens in London in terms of operations, sales and marketing.
So beyond investor relations, what is Daniel's area of focus, and
how do you and he split out your responsibilities?
Of course. So Daniel is definitely on
the product side of things. Him being a practitioner in
(26:48):
the space for many years, with a great understanding of
the space and the ability to also design
this vision for the space, makes him a
perfect, you know, CPO, or chief product officer. So
that's definitely an area where he
leads. The technology side of things, and so
kind of how do we take this product vision
(27:11):
and translate it to the technology, to the eventual development that we do, that's
something also that Daniel is taking the lead on. And I'm taking the lead more
on the business side of things, right? So go-to-market, marketing and sales, operations, of course finances.
(27:32):
I think we're not at the level where we need a CFO yet, but, you know,
I'm acting as one for sure.
So those things are things that I help manage.
And again, because we split the team this way, it actually works pretty well,
because my team is mostly in London, and we make sure to meet at least once
a week and work together.
(27:53):
Daniel, again, a bit of a challenge, given that he is now in San Francisco,
but he's mostly working with the Hungarian team.
And for the Israeli team, I actually brought in a friend of mine, still
from the army days in Israel.
Most of the people you know in Israel are from the army days,
because that's the kind of environment where lots of strong relationships are created.
(28:17):
And so I brought a friend of mine from the army days to be the lead of the Israeli
hub, and he's relatively senior, so he can be somewhat independent,
but of course, we all collaborate because he's working with clients,
I'm working with clients, Daniel is thinking about the product vision and the
roadmap. So there is a lot of that collaboration, which means somewhat
early hours for Daniel and somewhat longer hours for us to be able to collaborate
(28:41):
with him in the evenings, UK time.
Well, it's been great getting to know
you and to find out more about
Superlinked and the great team that you
and Daniel have been building. Hopefully in 10
years' time you'll be spoken about in the same revered tones as Elon Musk
(29:01):
and Sam Altman, and I wish you and the team a wonderful second half to 2024.
Thank
you very much, Gary. It was lovely speaking with you, and, you know, I hope what
you say will actually happen.