March 8, 2021 • 39 mins

Summary

There are a large and growing number of businesses built by and for data science and machine learning teams that rely on Python. Tony Liu is a venture investor who is following that market closely and betting on its continued success. In this episode he shares his own journey into the role of an investor and discusses what he is most excited about in the industry. He also explains what he looks at when investing in a business and gives advice on what potential founders and early employees of startups should be thinking about when starting on that journey.

Announcements

  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
  • Your host as usual is Tobias Macey and today I’m interviewing Tony Liu about his perspectives on the landscape of Python in the data ecosystem from his role as an investor

Interview

  • Introductions
  • How did you get introduced to Python?
  • Can you start by sharing your background in the data ecosystem?
  • What led you to your current role as a venture investor?
    • What is your current area of focus in your investments?
  • What do you see as the major strengths of Python in the current landscape for data and analytics?
    • What are the areas where the ecosystem is still lacking?
    • Where are you seeing growth in the space and what do you see as the motivating factors?
  • As an investor, what are the qualities that you look for in a startup that is trying to compete in the data ecosystem?
    • What is your process for learning about and identifying companies that demonstrate the potential to succeed?
    • Do you focus on a particular problem domain and research a grouping of companies that are focused on that problem, or do you start from a given company to determine where to place your bets?
    • How has COVID changed the competitive landscape?
  • Can you share some of the companies that you have invested in?
    • What was notable about their respective businesses that provided you with the confidence that they were worth investing in?
  • What are some of the most interesting, unexpected, or challenging lessons that you have learned from your experience as a venture investor?
  • What are some of the companies that you are keeping a close eye on, whether as potential investments or as competitors to your existing portfolio?
  • What are some of the problem spaces that you would like to see companies try to tackle?
  • What advice do you have for engineers who might be considering building a new business?
    • Do you have any advice for engineers who are working at a startup as to how best to compete in the current market?

Keep In Touch

Picks


Episode Transcript

Available transcripts are automatically generated. Complete accuracy is not guaranteed.
Unknown (00:13):
Hello, and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
When you're ready to launch your next app or want to try a project you hear about on the show, you'll need somewhere to deploy it. So take a look at our friends over at Linode.
With the launch of their managed Kubernetes platform, it's easy to get started with the next generation of deployment and scaling powered by the battle tested Linode platform,

(00:34):
including simple pricing, node balancers, 40 gigabit networking,
dedicated CPU and GPU instances, and worldwide data centers.
Go to pythonpodcast.com/linode,
that's L-I-N-O-D-E, today and get a $100 credit to try out a Kubernetes cluster of your own. And don't forget to thank them for their continued support of this show.

(00:56):
Your host as usual is Tobias Macey. And today, I'm interviewing Tony Liu about his perspectives on the landscape of Python in the data ecosystem from his role as an investor.
At Costanoa Ventures, which is an early stage seed and Series A fund investing across the stack, I primarily invest in data infrastructure and machine learning infrastructure companies.
And do you remember how you first got introduced to Python?
It was in my first ever computer science class in my undergraduate degree. It was a Java class,

(01:27):
and the last class was like a bonus section on Python. But the funny thing now is that the class, I think, is primarily taught in Python. Yeah. It's funny how a lot of the coursework has flipped from being Java heavy to being Python heavy because of its approachability and the fact that it has grown in terms of its overall use in the industry.
Yep. Absolutely. The increase in usage just over the last 5 to 10 years has been crazy. I don't think people 10 years ago would have seen this.
Yeah. But I think my first programming class when I was doing my degree was in C++, and then we went to Java.

(01:58):
Well, actually C, because I did computer engineering, and I actually had to find Python on my own sort of toward the end of my degree program.
It makes sense. Once I did the C intro class, I decided that software engineering was not for me, and I wanted to stick with statistics.
And so digging a bit more into your experience with Python and some of the backstory of how you ended up today, can you give a bit of your background in the data ecosystem

(02:23):
and why that's a particular area of interest for you now? I was always interested in data. Early on in college, the thing that really got me into data was sports analytics, to be honest, specifically
NBA analytics, which at the time was very immature.
And I spent, you know, summers working on this with, you know, professors in the statistics department,

(02:43):
and it really got me excited about, you know, the applications of data essentially.
And since then, my first industry experience was working on core search ranking at Amazon.
In hindsight,
I didn't know, like, how nice
people had it there. Everything is just easy to use. I've learned the hard way that actually doing data in the real world is really hard. After that, I joined a small startup at the time, Turo,

(03:07):
where I ended up building their
ID fraud model, which essentially stopped
more cars from getting stolen. So that was a cool, like, real world project to work on. But I found myself wanting to get, you know, closer to the business side of things. And
before joining Costanoa, I was a product manager at Databricks,
where I led the workspace, which is the interactive data science platform, and a few other data science initiatives. I was a product manager for Koalas

(03:32):
as well, which is the pandas-compatible
Spark API.
Given your interest in getting closer to the business, it makes sense that you would end up as a venture investor. But I'm wondering if you can maybe give a bit more context about how you ended up where you are and maybe some of the selection process that made you decide to end up at Costanoa versus one of the other funds?
I think that that's exactly right. Venture investing marries two of my, like,

(03:58):
main interests: data and, you know, understanding how businesses work.
I think in the venture world, you have to kind of grasp how the entire ecosystem works. I think there will be less depth in your knowledge. I probably was much deeper in one particular part when I was at Databricks than I am in any part now. But it's a really fun intellectual challenge to try to piece things together. Half the time, things still don't really make sense to me. There's just so much overlap and so much going on in the entire data infrastructure

(04:28):
ecosystem.
In terms of
how
I successfully landed in this role,
I'd say that my background is probably less traditional for a typical VC.
I think you'll probably see more people come from finance and, you know, consulting backgrounds.
Though for early stage
venture firms, I have noticed that there are more product managers entering the space.

(04:52):
And my
sense is that it's because
today, when you're investing in, you know, a seed stage company, series a company,
because of how the markets evolved, oftentimes, they don't really have that much
revenue or even product.
So what's really left is to understand
how the product works, to understand if customers really like it, to get a sense of the market.

(05:15):
And I think that,
you know, this background actually lends itself decently to that type of endeavor.
In terms of picking, you know, Costanoa specifically,
you know, there are, I'd say,
not that many opportunities in VC,
and each opportunity that comes up has a very specific person in mind and, you know, vice versa. And I think it was a great

(05:37):
situation where, you know, they were looking for someone with a heavy data infrastructure background who wanted to focus on it, versus other funds that were looking for more general investors.
So that's how I ended up landing in my specific role.
In terms of the actual day to day,
what is involved in actually working as a venture investor, particularly

(05:59):
somebody who is focused on the data ecosystem? And how do you keep up to date with the opportunities in the market? Yeah. It's a great question and, you know, something that I think we all try to continually figure out.
I think that
different people have very specific approaches to venture investing, and it takes a long time to figure out how you wanna approach it. Even now, you're always figuring it out. In this particular case,

(06:25):
my, you know, preference is to go really deep in a space,
so to be very intentional with how I spend time. And to keep up to date with things, I, you know, regularly chat
with companies. Obviously, new companies are forming, but also with, you know, people who are working at, you know, later stage companies or even public companies.
So in the data space,

(06:46):
DataRobot, Databricks,
these are all great people to talk to who have
deep insight into
their particular focus area. And the more you dig in, I think, the more you understand
the types of problems that exist, and you also might even run into different projects that are really interesting to pursue.
So in the latter case, talking to data scientists, data engineers

(07:08):
is actually also very helpful and a part of my, I wouldn't say day to day, but, like, I do a lot of these conversations on a weekly basis.
In terms of Python specifically, it's obviously a very strong contender in the data science ecosystem, and I've been seeing it leak into other aspects
of data. But from your perspective, as somebody who has

(07:29):
worked in the space and been a product manager for a company where Python was actually one of the core projects that you were involved with and somebody who's working as an investor
who is talking to data scientists and deciding whether to invest in these various companies.
What do you see as being the major strengths of Python in the current landscape for data and analytics?

(07:49):
Yeah. I think there are several things.
The first one is that it's very flexible,
especially when compared to something like SQL. The types of, you know, computational analysis you can do, machine learning, and even, like, data wrangling are just much harder to do in SQL.
The second part is that it's a very usable language. I know several people who, you know, came from a SQL background, and with that understanding of SQL were actually able to pick up Python without too much

(08:18):
difficulty. So those are, you know, two parts on the usable side. And then the third part, I guess, is just that it's exploding
in popularity, probably because of those two things as well as the ecosystem that has grown. So I think all this in aggregate makes the foundation a very usable
and, you know, rich environment to work in.
Are there any particular

(08:39):
problem domains or use cases that you see it being used for more often than others? And for the cases where
you don't see it used as often, what are some of the other dominant languages?
So in terms of use cases where, you know, it's more prevalent, I think that the obvious one is data science workloads. Like, whether it's, you know, computational analysis,

(09:00):
coming up with charts that are much more sophisticated than what you can do in some BI tool
and also in machine learning. I think that, you know, where it's lacking and expanding into are in the data engineering land and also in the BI space. I think that a lot of Python users actually also know SQL and view them as interchangeable when they're doing work. And there are more and more tools coming out

(09:26):
that kind of support both as almost first class, that push support for SQL
so that people can have this more blended experience. Now on the other hand, there's
more tooling coming out to make Python a more robust, like, data engineering platform.
An example is a project like Faust,
which enables, like, Python stream processing.

(09:47):
It's filling the gaps that, like, make it an ETL platform on par with Spark.
I've also been seeing it come up a lot in the data orchestration and workflow management space where,
in the initial stages of the big data revolution
with the Hadoop ecosystem, there were projects like Oozie and Azkaban that were written in Java that tied into the Hadoop platform.

(10:10):
And then Airflow was kind of the breakout success in the Python ecosystem for a while, and then that is being somewhat superseded by newcomers such as Prefect and Dagster as well.
Yep. That makes sense. That's an area that's
received a lot of attention. And I think along those lines of, like, maybe abstracting away DevOps from the data science experience,

(10:32):
where you have the tools, like, on the machine learning side, like Kubeflow,
MLflow, Metaflow that aim
to help data scientists focus on just the machine learning, the actual, like, data science part of their work, rather than having to care about the other parts of ensuring reproducibility,
ensuring scalability.

(10:53):
And in the broader data ecosystem, are there any areas where you see Python support still being fairly lacking and that you see potential for growth in the coming years? I think that there are a couple of themes that I look out for rather than specific use cases.
I'd say one is collaboration.
I think so much of Python tooling

(11:15):
has been built for, you know, single player mode. For example, if you think about Jupyter Notebooks, wildly popular, you know, very useful and powerful,
but it's designed to work with your laptop and with your local file system.
And when, you know, you take that tool and just, you know, host it in a cloud setting, there are all sorts of challenges that arise because of what it was initially designed for. So I think that there are a lot more tools coming out now that are

(11:44):
bringing
collaboration,
like making it a first class citizen in the Python experience, which has just been lacking, especially in the early days of Python.
And another
aspect, related
to Python tools being designed for, you know, your laptop,
is that a lot of tools are designed for single node usage.

(12:05):
So when it comes
to scale, when it comes to scaling your code to, you know, larger datasets,
that becomes very challenging. And in many of these cases, people might, you know, push the limits of pandas and decide that they need a new solution and have no choice but to go somewhere else and use something like Spark, which is, you know, a completely different architecture and, you know, has more limited support

(12:28):
for Python.
So I'd say that on one hand, there is the collaboration,
there's scalability,
and I guess the third challenge is just around
making it really usable. I mean, if you think about package management today, it's always a nightmare when people are migrating from
one environment to another,
and this greatly impacts collaboration. So I think this whole overall

(12:52):
experience needs to be reimagined, but the core is there. Like, Python is extremely usable. There are many great libraries
that people use every day to do their work, but now it's about tying those together in an enterprise setting.
In terms of the
growth in the space,
what do you see as being some of the motivating factors for Python finding its way into some of these other areas that have largely been dominated by other languages. With data in particular, it's been a lot of Java, but

(13:22):
in high performance computing or real time machine learning, there's been a lot of, like, Fortran or C++ code, but Python has been able to edge its way in because of a lot of the
integrations that it offers. But what do you see as being the driving factors that push people to trying to use it in these different use cases that have traditionally been a little more difficult?
Yeah. I think it's really

(13:47):
tied to
both a huge growth in
adoption, and as we mentioned, like, CS classes are now teaching Python first. There are just a lot more people who are learning Python at an earlier stage and, you know, an extensive ecosystem
around Python
that just makes it so flexible
that when you think about having an overall workflow that makes sense, you would like tools to be compatible and to be, you know, interoperable.

(14:12):
And, you know, given that Python is emerging as the, you know, de facto language for data science,
presumably, you'd want the tools, use cases
that, you know, touch on data science to also be compatible with Python.
And I think too, one of the
driving forces is kind of in a couple of directions where,
particularly in small companies, a lot of data scientists end up having to do their own data engineering. And so they don't want to have to jump over into a Java framework. They just wanna stick with the Python they're familiar with. And then in the other direction,

(14:43):
data engineers want to be able to collaborate more fluidly with the data scientists. And so having all of the tooling in Python makes that an easier bridge to cross because everybody's working in the same language without having to do translations back and forth. Absolutely. I think that makes a lot of sense. And, you know, in

(15:06):
business analysts and other, you know, potential
other less technical users who know SQL into the fold. In your role as an investor,
as you're looking at different companies who are operating in the data space,
what are some of the qualities that you look for in a given startup that's trying to compete within that ecosystem?
There are, I think, lots of things to look out for. I'd say that the ones that are top of mind are for the product that they're working on. Data scientists have to love that product. If it's a product that they are spending a lot of time in on a daily or, you know, weekly basis, I think that is a starting point. And then I think as from an investor perspective,

(15:45):
you also want to know that the problem that the company is solving at the early stage
is
a significant enough wedge into some platform opportunity.
So
if the product is
solving a critical problem and is also heavily adopted by individual data scientists, then that's a great sign that there are probably ways that this product can

(16:10):
expand into
more parts of the workflow.
On a similar note, like, if the product is core to the data infrastructure
and also, you know, touches the end user experience, that's also a good sign in terms of being able to
expand your platform opportunity and move up the stack and build more applications.
And then probably the most important point actually is the team. I think there are so many

(16:36):
companies
that are emerging in the space, and understandably so. You know, you see, like, pretty crazy valuations in
the data infrastructure space these days.
So, and it sounds cliched as well, but you have to believe that
the team has some kind of unfair advantage over others.
And this could come in many forms. It could be the case that, you know, this team has built this thing at some large organization before and where it's been battle tested, and now they're able

(17:04):
to bring that same experience
to the rest of the world.
It could be the case that, you know, it's the leader of an open source project, and that differentiates, you know, that founder from other people. So once you have
identified a potential company
and
decided that you want to dig a bit more into

(17:25):
what they're building and their overall potential within the market.
How do
you approach that company? How do you go about learning
more about their
opportunity that they're pursuing,
the product that they're trying to build, and just getting a better understanding
of the
area in which they're competing and some of the

(17:46):
either market factors or competing businesses
that they're trying to
operate within the constraints of?
Yeah. So I think there are multiple levels to this. I think there's, like, the initial kind of selection process of which spaces or which
projects or companies that you wanna focus on, and then there are
progressive layers of getting into more depth. So at the top level, as I said earlier, like, I try to meet people and hear about projects.

(18:12):
You know, I also keep up to date with, you know, many data
publications
and
communities.
Once I've identified some company or project that I'm really interested in, the most obvious thing to do is just to talk to the users, talk to the customers,
and see how that product
fits into, you know, a data scientist's workflow,

(18:34):
if it's a product that he or she loves using and uses every day and, you know, kinda replaces something else.
So that type of qualitative information is really important.
There's also the part that's around digging into the community, whether that's going to some, you know, in the case of open core companies, often they have
Slack communities, Discord communities,

(18:55):
really drilling into those and seeing how vibrant those communities are.
In many cases, you know, the most vibrant ones, you know, dbt is a good example, have users who are
solving each other's problems and, like, having conversations that are
outside of data too. I think that's when
you really have, you know, built a community that is really great. So that's one side, the user side. There's also, you know, you wanna understand

(19:20):
for open core companies
what kind of traction it's getting among developers,
and you wanna see
meaningful activity on GitHub, not just, like, stars or whatever, but, like, engagement and, you know, filing issues
and the number of, you know, developers contributing to the project.
And so once you've identified a company with a decent amount of potential, what is the actual process

(19:43):
of working with that company to determine whether you're going to invest and whether they would like to work with you versus
any of the other investment funds available for actually
securing a round of investment. And it's a very delicate balance of
buying and selling at the same time as you can imagine.
Now at the same time, you wanna

(20:04):
gain more confidence. At some point, you'll reach enough confidence where you're in full selling mode. But, you know, early on especially, you want to learn as much as possible
while presenting, you know, yourself in a positive light. So that's a very delicate skill to handle that I imagine many others are constantly working on.
In terms of differentiation

(20:26):
that's a great question because the market, especially for early stages,
is flooded with capital. That's the reality.
And
finding ways to differentiate, you know, becomes harder and harder. You know, that's partly why, you know, in some cases, it helps to specialize
just because
by spending so much time in this space
beyond my previous data experience,

(20:47):
I just know more people in data and am much more engaged with the community. So I can, you know, hopefully bring insights
into conversations that an investor who is not focused on the space would not be able to bring into the conversation.
As you're
working with the companies and exploring a given problem domain,
in terms of your overall approach, do you typically start with a list of companies that are interesting and then try to branch out into

(21:16):
the broader problem domain to determine what are the opportunities there, or do you go in the reverse where you start with a problem domain and then look to see what companies are operating in that space?
I think it goes both ways. Especially in the case
of data,
there are just so many
projects that are brewing inside some organization,

(21:37):
whether it's closed source or open source. And I can't claim that I just, like, you know, magically think of gaps that exist. More often than not, it's seeing
these use cases in companies
that might not have been commercialized
yet. So I'd say that, in terms of going very early in investing, that's the main approach. And then, you know, once, you know, I get excited about some space, I'll inevitably spend more time and, like, understand the market and, you know, learn who the competitors are and really educate myself on that opportunity.

(22:08):
And one of the interesting
balancing acts
too when
identifying
what to commercialize and what to offer as your given product is
how broad or narrow to go. You know, do you go narrow and deep on a particular
pain point, or do you try to go broad and orchestrate a platform play

(22:28):
for a given subset of the market? And I'm curious what you have seen as being the
most broadly applicable
strategy
and some of the challenges that founders run into
as they're trying to strike this balancing act? I think that particularly in the earliest stages, which is in terms of investments where I spend my time,

(22:49):
I think the focus is to solve a single problem
deeply.
You know, there's a class of companies that fall into that space.
There will be some companies or projects where it's harder to see how it
branches into a platform opportunity,
where it's able to move up the stack and build applications on top
or, you know, vice versa, move lower in the stack.

(23:12):
I've seen, you know, both
types of, you know, companies. I'd say that
my preferred
path, and it's obviously up for debate,
is that you wanna solve
an important problem
deeply.
And once you are in the space of companies or projects that are solving this type of problem, there are some that

(23:34):
are in the more strategic position to
become a platform
where they're able maybe to build applications
on top of the core project. You know, in some cases, it might be the reverse. Maybe they're able to build down the stack. Then there are also
projects that
solve a particular problem really well,
but, you know, where it's harder to see that platform opportunity.

(23:56):
So I think that the starting point is to solve a problem deeply, but then, you know, whether or not it's a venture backable business depends on the ability
to expand into other use cases.
Another interesting thing to dig into, given the current
state of the world, is how you have seen the pandemic play into the viability of different companies and different markets and how that changes

(24:22):
the business strategy for the companies that you're investing in and working with? The pandemic, I think, had many
effects on
the early stage venture investing broadly
and also on
the strategies that some, you know, early stage companies adopt.
In terms of

(24:43):
the impact on early stage venture investing, I think some, you know, sectors have actually become more
competitive
and valuable than ever, including data infrastructure.
You see that in the
public markets and in the very late growth stage
side of investing with Snowflake,
Databricks
reaching pretty astronomical valuations.

(25:05):
In terms of the impact on
strategies that companies adopt,
we sensed early on in the pandemic that the bottoms up adoption
plays make more sense in this climate.
And, you know, by that I mean
word-of-mouth, you know, viral, like, adoption of some product. And the reason was that there is so much uncertainty

(25:27):
across the budgets of many
potential customers
that, particularly early in the pandemic, were not sure
what the budget for data engineering, data science would be. So the idea
is that for early stage startups, that during this period of time,
given the potentially
more cautious buyers,
that they should, you know, work on a bottoms up go to market motion. So that's an example of an impact on

(25:55):
strategies that an early stage company might have adopted
in light of the pandemic.
Turning now to the businesses that you in particular have invested in and are working with, can you give a bit of an overview about the businesses that you have decided are worthy of backing and that you see the potential for success

(26:15):
and maybe discuss some of the aspects of their businesses that were particularly notable and gave you the confidence necessary to decide to invest capital and place a bet on those companies.
Absolutely. I think two that are very relevant
for this podcast
are
Coiled and Noteable.
So Coiled is a company that is commercializing

(26:38):
Dask
led by the lead maintainer,
Matt Rocklin. And it addresses one of the issues that I've seen, you know, and at the same time, opportunity in the Python ecosystem,
which is the scalability side of the equation.
So as I, you know, talked about earlier,
a lot of data scientists in organizations, once they push the limits of, you know, single node compute,

(27:01):
they're often faced with a tough decision of what do we do next? Do we get a beefier machine? Do we switch to Spark? And Dask was a project that, you know, many people
adopted to, you know, bridge that gap. And, you know, it seemed like there was a really interesting opportunity to make
Dask the centerpiece of a Python company
where

(27:22):
a user that hits that point of, like, I need to scale now can actually say, there's this Python native approach to doing this that is, you know, backed by some company. And on the team point,
Matt Rocklin, you know, fits perfectly in the mold of a founder with an unfair advantage of being
the lead maintainer and, like, almost a spiritual leader of the community.

(27:44):
So that was one. And then in the case of Noteable,
it's
really, I'd say, taking Jupyter Notebooks to the next level
with a focus on exactly the parts that Jupyter is not well suited for.
So the focus is to make, you know, collaboration a first class citizen
rather than to focus just on the single player mode.

(28:04):
This means that instead of, you know, collaborating by
sending a Jupyter Notebook file to someone else or doing slower collaboration through Git on Jupyter Notebooks,
you can work
and collaborate in a notebook together where you're, you know, viewing the same results. You're able to interact
with the charts that have been created. And

(28:27):
when you want to collaborate with external stakeholders,
it provides a really easy interface for you to work with even less technical users
that can engage with the different charting capabilities
rather than, you know, screenshotting something and slapping it on a slide, which is, you know, what takes place a lot of the time today. And another point there is that it's built with, you know, enterprise, like, security in mind. And, again, Jupyter Notebooks were designed for local use and just don't have that type of architecture built in, so Noteable is taking a very careful approach there. Again, on the team front, you know, we believe that the team had an unfair advantage. They were led by Michelle Ufford. They

(29:10):
were
the notebooks team at Netflix, where they built the internal notebooks platform there. And
Michelle is also a thought leader in this space as well, and that has its own advantages for building out a leads list now that she's starting this company.
Once you have invested in a company, what is your involvement from that point forward?
I think that particularly at Costanoa,

(29:34):
we like to be as hands on as the founder would like. In the case of, you know, Coiled,
at the time we invested, it was one person about to hire the second person.
Now they've scaled to 10, 15. So early on, you know, lots of conversations, you know, with Matt and investors
about who to hire, you know, where to go next.

(29:55):
And, you know, I think as much as you provide advice, you also
provide candidates for hiring, which is like the most practical use
for the company. So, you know, that's a way that investors also work, you know, pretty closely with portfolio companies.
In terms of your experience of working
as an investor in this space, what are some of the most interesting or unexpected or challenging lessons that you've learned in the process of transitioning into this role and

(30:25):
any interesting aspects of the learning curve of getting involved in venture investing based on your experience as an engineer and a product manager?
So something I heard before in the industry was that you'll have no idea what you're doing for at least your first 6 months, probably longer. That was absolutely true in my case.
I think that because it's such a

(30:46):
unique environment where you need to differentiate yourself
from other investors,
you really need to have your own personal approach to the role. And, you know, I'm
constantly still trying to evolve and see how I can improve
and can pretty definitively say that the comment that people made about the 6 months is very true as you're trying to figure things out. I think another challenge that has

(31:10):
come up even more now is just how brutally competitive
the market is. I don't know if that's surprising, but it's the reality
of things now.
And
as an early stage investor,
that means I want to, you know, go
even earlier if possible,
you know, as we're seeing pretty astronomical,

(31:30):
you know, growth stage rounds and even Series A rounds,
it's, you know, sharpened my typical focus
on investing in a company at the earliest stage possible. In terms of the broader market,
what are some of the problem areas that you see
as ripe for opportunity or that you would like to see companies try to tackle that aren't currently being addressed?

(31:51):
I think there is a lot of focus now on SQL,
which is really
interesting
because I think a few years ago, machine learning seemed like all the rage, but it turns out that most organizations
did not have mature enough
data infrastructure to handle that. And, you know, it seems like we're at the point where now there's this very robust environment around SQL, which is still, you know, developing.

(32:15):
Might be lame to say this, but I do think that Python does represent a really interesting opportunity
as, like, the next
set of users to really unlock in an enterprise
setting.
So that's why, you know, I've been very excited and, you know, focused on the different technologies that really enable, you know, data scientists to uplevel into doing stuff that they wouldn't have been able to do otherwise in a reliable

(32:41):
production grade setting.
For engineers who are considering
going out on their own and building a business around a particular problem space, what are some of the pieces of advice that you have for them that will help to make sure that they set off on the right foot rather than, you know, maybe hitting a stumbling block early on because they don't have the right product market fit or they're

(33:04):
not sure how to address the marketing or how to architect the interface
for being approachable and just
the, you know, infinite decision space that's available for them when they have this greenfield and they wanna decide, okay, I'm gonna go into business and solve this problem. Like, what are some of the useful pieces of advice that you might have for them?
I think that at the core,

(33:25):
you want to build something that an end user loves.
And
what that means is often, you know, getting out of your comfort zone and just talking to a lot of users and, you know, iterating closely with users.
I would say that, as you mentioned, there's, like, kind of an infinite, like, space of different
configurations
of companies, but I'd, you know, be very intentional about the type of problem you're solving and the type of opportunity that it represents.

(33:52):
I think
that given some problem you're solving,
it's worth thinking about, you know, is this, or do I wanna make this, a venture-backable business or not? That's, like, one dimension.
Then there's go-to-market:
what's more suited? Is it more of a typical
enterprise top-down sales motion or a bottoms-up motion?

(34:12):
And making that kind of decision also
has direct impacts
on, you know, the type of founding team that you'd like to have. You'd wanna find someone with complementary skills. In the case of a bottoms-up play, maybe, like, another technical leader makes sense to complement you because you're engaging with a very technical community. If you wanna sell to enterprises,
maybe, you know, that's not the best fit. It could still be, but maybe someone who has had that experience makes more sense. So I think it's really just about

(34:40):
I don't wanna scare people off, but, like, being very deliberate about each decision that's made and being, you know, pretty clear eyed about it. So there's, you know, one part that's talking to customers, but the other part is just talking to people who have, you know, done this before, whether it's founders, operators
and even investors, although, like, investors probably
offer the least useful advice at the early stages.

(35:02):
For any employees
or engineers who might be working at a startup or considering moving to an early stage venture,
what are some of the pieces of advice that you might have for them as to how best to contribute to competing in the current market and just advice for
considerations that they might have going into that type of opportunity?

(35:24):
I would, again, you know, highlight
being close to the user.
Especially at a small startup where things are moving very quickly, there's not much overhead, and lots of decisions have to be made all the time. I'd argue that an engineer who is very empathetic
towards the end user
will be able to be much more productive than someone who is not. So that would be my

(35:46):
number one
recommendation
to really engage with the user, not just in terms of, you know, let's say, the energy of the data centers, helping that person figure out some bug issue
or walking them through something, but actually
understanding, you know, who they are, you know, what they do for their jobs, and why they've chosen to do these jobs. So I think that getting that deep empathy

(36:07):
would really take that engineer to the next level.
Are there any other aspects of the work that you're doing as a venture investor or the overall
opportunities
and use cases for Python and the data ecosystem and for building businesses around them,
or anything tangential to what we've already discussed that you'd like to cover before we close out the show?
The one thing I'd reiterate is that

(36:31):
it's such an interesting problem space where you have huge adoption of this language
and also have a very extensive ecosystem and a very usable
tool
that works in
many situations, but not in others, including many enterprise settings. So I'd just reiterate the point that
it's a really exciting space.
We're seeing, like,

(36:52):
you know, data mature at, like, a
slower pace than some might have predicted in the past, but it seems like now things are changing. Like, SQL has, like, matured significantly more,
and Python is, you know, taking the next step as well.
Well, for anybody who wants to get in touch with you and follow along with the work that you're doing, I'll have you add your preferred contact information to the show notes. And with that, I will move us into the picks. And this week, I got a couple. So I watched the movie The Sleepover recently on Netflix.

(37:21):
Hilarious. Just a lot of good fun, really well delivered. And then I serendipitously
came across a song on Spotify and then found out that there was a music video to go with it called "What Do You Do with a Bernie Sanders" that's just playing off of the meme that has refused to die since the inauguration.
So just another few minutes of humor to spend during your day. So with that, I'll pass it to you, Tony. Do you have any picks this week?

(37:45):
Yeah. I will go in a very different direction from that movie. A film that left a lasting impression on me during, you know, this past year and a half where I've watched a lot of movies at home
is Uncut Gems, which is very fast paced, kind of gory, a bit of a crazy film,
totally the opposite of The Sleepover. In terms of a music video, following your lead of an entertaining music video,

(38:08):
I really enjoyed
Stu from Saturday Night Live, which is basically
Pete Davidson's
version of
Stan, the Eminem song, but Christmas themed. If you haven't seen it, highly recommend you check it out even though it's not Christmas anymore.
Well, thank you very much for taking the time today to join me and share your experience
working in the venture space with data oriented businesses and particularly those

(38:31):
who are leaning on Python for being able to solve some of the open problems in the ecosystem.
Definitely appreciate your perspective
and the energy that you spend on that space, and I hope you have a good rest of your day.
Thanks so much. It was a pleasure, and you too.
Thank you for listening. Don't forget to check out our other show, the Data Engineering Podcast, at dataengineeringpodcast.com

(38:55):
for the latest on modern data management.
And visit the site at pythonpodcast.com
to subscribe to the show, sign up for the mailing list, and read the show notes.
And if you've learned something or tried out a project from the show, then tell us about it. Email host at podcastinit.com
with your story.
To help other people find the show, please leave a review on iTunes and tell your friends and coworkers.