
April 12, 2024 42 mins

AI Autocomplete for QGIS

Brendan Ashworth, the CTO and co-founder of https://buntinglabs.com/, focuses on integrating AI with QGIS, and today on the podcast we are talking about autocomplete for vectorization.

Along the way, Brendan shares why Bunting Labs chose to build this on top of QGIS, the challenges in map digitization, what the development process was like, and how this is different from tools like Segment Anything (from Meta).

Here's what we discussed:

  1. Introduction to Bunting Labs: Get to know more about Brendan and Bunting Labs, whose mission revolves around enhancing QGIS with AI, especially focusing on automating vectorization processes.

  2. AI Autocomplete for Vectorization: We explored the AI autocomplete feature developed by Bunting Labs that simplifies the vectorization of maps in QGIS, streamlining the digitization process for better efficiency.

  3. Brendan’s Background and Motivation: Brendan shared his journey from a software engineer to a pivotal player in the geospatial sector, spurred by a project that showcased the potential of merging geospatial data with machine learning.

  4. Why Choose QGIS?: Discover why Bunting Labs opted for QGIS over other GIS platforms, with an emphasis on its open-source nature and vibrant community ecosystem.

  5. Challenges in Map Digitization: Our conversation covered the technical challenges involved in developing AI capable of accurately understanding and digitizing maps.

  6. Iterative Development and Learning: Brendan highlighted the evolutionary process of their AI model, which has significantly improved from its early versions.

  7. AI vs. Segment Anything: Brendan explained how their AI autocomplete tool differs from existing solutions like Segment Anything, particularly in handling specific digitizing challenges.

  8. The Future of AI in Geospatial Data Analysis: We discussed potential future applications of AI in geospatial data, including automatic georeferencing and metadata extraction.

  9. Privacy Considerations: We also touched on the importance of privacy in the development and deployment of AI technologies in geospatial data analysis.

  10. Changing the Geospatial Landscape: Brendan shared his vision for using geospatial data not just to map the current world but to plan and improve future landscapes.

Sponsored by https://www.scribblemaps.com/

Recommended Listening

https://mapscaping.com/podcast/the-business-of-web-maps/

https://mapscaping.com/podcast/the-business-of-qgis-development/

https://mapscaping.com/podcast/qgis-offline-and-in-the-field/

https://mapscaping.com/podcast/computer-vision-and-geoai/

 

https://quickmaptools.com/ - MapTools to save your time processing GIS data

 


Episode Transcript

Available transcripts are automatically generated. Complete accuracy is not guaranteed.
(00:32):
Sure. Thanks for having me on the podcast by the way.
So I'm Brendan. I'm the CTO at Bunting Labs and that means
that I have been building an AI autocomplete for QGIS for
the past year. So I wasn't originally a GIS geospatial person.
I actually got my start kind of in open source software.

(00:55):
So I originally got involved with the Node.js project which is
an open source platform for running JavaScript on servers
normally and that's kind of where I initially
learned to engineer software and that really gave me my
spark in computer science and it was only later that I went to

(01:17):
MIT and ended up studying physics and computer science there
where I got more into machine learning and I think you can
kind of see that Bunting Labs is kind of at the intersection of
many of these different interests of mine.
So kind of my background in machine learning and specifically
I studied Bayesian inference at MIT kind of led me to have this

(01:42):
geospatial moment towards the later end of my college career.
And that's actually when I was working on a consulting project
with a real estate developer and we were more or less
summarizing different neighborhoods in terms of their characteristics
with statistics and with neural networks and it's more or less

(02:04):
at that moment where I realized that geospatial data is
extremely powerful and I was kind of confused as to why other people
weren't using geospatial data more and that really set me on this path
to discover geospatial data and kind of the intersection
between geospatial data and machine learning and that's kind of
where Bunting Labs came from.
I guess the big unanswered question for me is why

(02:27):
QGIS then? Why did you decide to build something on top of
the platform that is QGIS?
My familiarity has always really been with open source software.
So when I think of software that I can contribute to and
software that I can build an ecosystem around or contribute
to the ecosystem in I think of GitHub and I think of repositories
that everyone can kind of see, learn from, contribute to and build

(02:50):
around and so QGIS is in many ways closer to my comfort zone than
many of the other GIS software that are available out there.
It's extremely easy to get started with so we basically built
the first version of this plugin in about two weeks and I could
do that by learning from all of the other open source plugins

(03:11):
that are available and so QGIS is not just this one app that
you download to your computer, it's really an ecosystem that everyone is
contributing software to and enjoying as a result and so I think
that's why I first got started with the QGIS plugin.
And so my next question is a two-part question.
Explain to me and to the listeners of this podcast what

(03:33):
auto-complete for digitizing vectors means for you and QGIS and why
that was a problem you wanted to solve.
So autocomplete for vectorizing in QGIS, or another way to kind
of describe it is an autocomplete for tracing maps, is an addition
to the GIS pen tool and so if you are presented with raster

(03:58):
imagery of some sort so if that's a satellite image of an area
if that's a rendered PDF of some sort of construction or architecture
asset somewhere in the world and you need to convert that into a
vector file with a projection and some sort of like accuracy

(04:19):
you would use the pen tool to digitize that into its vector
representation and in many conversations with GIS professionals
we found that they essentially participated in this workflow
extremely frequently and so they would be presented with oftentimes
a PDF that doesn't have any kind of projection with it.

(04:40):
They would need to georeference that PDF to oftentimes sub-meter
resolution and that's literally locating it in aerial imagery
or maybe on an OpenStreetMap base map and adding control points
until the raster layer perfectly lines up with its location in the real
world and then they would get out their pen tool and digitize

(05:01):
out lines, polygons, maybe even points that represent semantic
shapes from that map and then they use that and then add metadata
to those shapes and so it's not just lines in your local
state plane these are buried utility lines and so you would add

(05:22):
metadata of the diameter of that pipe or maybe these are
literally painted lines on a highway and you are adding metadata
that says these were last painted in August 2014
and they will need to be painted again in August 2024
and this kind of workflow where you would take rasters and then

(05:45):
you would take a meaningful digital file format equivalent of
them we saw all the time and so given my background we started
imagining what this workflow would look like with AI and this is
more or less the product of that.
You did a great job answering both questions with one answer
that was brilliant. So given your skill set, your experience

(06:09):
and your interests, if I were you I would look at this and think:
convert every line that I can see in that rendered image to a,
you know, a vector output instead of trying to, like, trace lines
individually. Is that a thought that crossed your mind when
you were thinking about solving this problem?
Yeah and I think it's really funny because my conception of how to

(06:30):
solve this map digitization problem is totally different now
than it was a year ago when I first started looking at this.
So when Michael and I decided that map digitization was a problem that we
would tackle we basically went to the whiteboard and I threw up my design, my
attempted solution for what digitization with AI would look like

(06:54):
and I said it's probably really easy to convert an entire map into its
vector representation. In fact there's an entire academic field dedicated to solving it.
If you go online and Google vectorization especially of images
there's a lot of published research as to how to actually accomplish that task
and I said this is a solved problem, surely, and I can build this in two weeks.

(07:19):
I said that probably 13 or 14 months ago so I totally take that back
and the first six months of me building this was actually very similar to
your suggested solution. I did a literature search
and found the best performing machine learning models for converting an
image into its vector representation and there is some really high quality

(07:44):
research available online for doing that. If you take pretty much any
image we're talking especially like logos images of the real world
and convert it into its vector format to your eye it looks fantastic.
It is very perfectly recreating the entire image as an SVG for example.

(08:05):
But when we went to kind of our original design partners and showed them this technology
we realized that it kind of missed the mark. It's like op-re-oriented when you see a map
it's kind of difficult to determine which of the assets in that map are
important to have the vector representation and so we realized that if we wanted

(08:26):
to productionize this we actually had to build a semantic understanding
of all of the features in these maps and that all of a sudden was a much more
difficult problem because that's akin to machine reading of maps and
understanding and when we were looking at this a year ago that seemed impossible.
And even worse when you have extremely important assets that you want a

(08:51):
high quality vector representation of you don't actually trust the
output of a machine learning model on its first go. If you were to hand me a
GeoTIFF and I handed you back a shapefile you would actually load it into your
GIS software of choice and review it such that it's perfect, such that it's to
your standards, and realizing that took about six months. After six or seven

(09:16):
months of working on that approach I actually went on a walk with Michael and
I essentially came to the conclusion that I was not on a trajectory of success.
If I interpolated my progress out I would not be successful in one or two
years and so I essentially had to go back to the drawing board and it was only

(09:37):
then with inspiration kind of from these more recent generative AI models like
ChatGPT did I consider an AI autocomplete. This was never originally going to be a
plug-in that you run in your desktop GIS software but we realized that that was
actually the modality that people wanted. They wanted to be able to have access
to this and immediately review its output. I think what you're getting at is

(10:01):
they wanted to be in charge, they wanted it to just make me more
efficient in the work that I'm already doing, and in that way I will trust the
output more. Could you talk to me a little bit about how this is different from
Segment Anything for example, because this is kind of like you give it an image
and it segments everything in the image and people seem to love that and it seems

(10:23):
to work really really well and this reminds me of your original approach, like
give me an image and I will vectorize the whole thing. Could you just explain
to me and the listeners please why these two things are different?
I think that would be really helpful. So Segment Anything is extremely powerful
but it's powerful in a few specific characteristics. The first one is that

(10:46):
it's grounded in text descriptions of what you're segmenting so if you take a
very large satellite image for example, I think the best use case for Segment
Anything is really satellite and aerial, and you can textually describe the
geometry that you are extracting, then I think Segment Anything and I believe
Segment Anything geospatial, the package that actually implements it specifically

(11:11):
for geospatial software is probably the best thing that you have and so if you
load drone imagery that you just took and ask it to segment out the lake boundary
that's pretty much the best thing you can get. I think where our AI
autocomplete is preferable to Segment Anything is especially when you are
digitizing geometries that you have a semantic understanding of but are difficult

(11:36):
to textually explain or you are digitizing lines that are difficult to
semantically extract from that map and so for example we often see a lot of
extremely low quality maybe there's a lot of JPEG artifacts on the map maybe
the resolution is pretty low it was originally rendered to be really low or even

(11:57):
those artifacts from the original scan. Those are kind of maps where Segment
Anything struggles. It's much more difficult to get Segment Anything to extract
a line of a pressure sewer such that the line is a dashed line that has
PS intermittently interrupting the line style whereas on the other hand our

(12:19):
AI auto complete because it looks at basically the line that you have already
started it excels at just completing it. Okay the line that you have already
started, in the same way autocomplete works if I'm using ChatGPT, for
example some people describe it as a fancy auto complete where it says okay
your question was this the next logical token is this in the same way we see

(12:40):
it autocomplete in Google Docs for example, it looks like the next
logical word is here. So when you say autocomplete you mean I take my pen
or I start tracing the line and it jumps ahead of me, as in, looks like you're
going on this line here, and follows it along. Am I correct? That's exactly
right so the way our AI auto complete works is it's a drop in replacement

(13:03):
for the pen tool and so if you activate our plug-in in QGIS and begin digitizing
some geometry let's say for our example it was a utility line on a
raster map once you begin digitizing that line by drawing two segments a
small bit of that map will be sent to our inference server and it will auto
complete 50 of the next vertices of that line and these vertices are output

(13:28):
by a neural network that we've specifically trained for this job and so
that neural network is literally looking at the pixels that you have
already drawn so as to predict which part of the map is actually semantically this
same feature and which of those pixels it should choose as continuing

(13:49):
vertices and I've realized that this problem is actually much more complex than
I originally imagined when you and I are talking about digitizing maps
it's extremely broad and big and it can be difficult to imagine what those
maps actually are but the magic in AI auto complete is not that it works on

(14:10):
certain maps it's that it works on your map and so when someone uploads
into QGIS a map that our AI has never seen and yet it is able to
generalize based on the hundreds of maps that we've trained it on that
I think is the really impressive thing.
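
Editor's sketch: the request flow described above (draw two segments, send a small crop of the raster to an inference server, receive up to 50 further vertices) could look roughly like the following. This is only an illustration under stated assumptions, not the plugin's actual code: the `CompletionRequest` payload, the `half_window` crop size, and both function names are hypothetical, and the network call is left out entirely.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]  # (column, row) in raster pixel coordinates


@dataclass
class CompletionRequest:
    """Hypothetical payload; the plugin's real wire format is not described."""
    window: Tuple[int, int, int, int]  # (col_min, row_min, col_max, row_max)
    drawn_vertices: List[Point]        # user's vertices, relative to the window
    max_vertices: int = 50             # "50 of the next vertices" in the interview


def build_request(drawn: List[Point], raster_size: Tuple[int, int],
                  half_window: int = 256) -> CompletionRequest:
    """Describe a small crop centred on the last vertex the user drew."""
    if len(drawn) < 3:
        # Two segments means at least three vertices.
        raise ValueError("draw at least two segments before requesting a completion")
    width, height = raster_size
    cx, cy = drawn[-1]
    # Clamp the crop so it never extends past the raster edges.
    col_min = max(0, int(cx) - half_window)
    row_min = max(0, int(cy) - half_window)
    col_max = min(width, int(cx) + half_window)
    row_max = min(height, int(cy) + half_window)
    # Express the drawn line in the crop's own coordinate frame.
    relative = [(x - col_min, y - row_min) for x, y in drawn]
    return CompletionRequest((col_min, row_min, col_max, row_max), relative)


def apply_completion(drawn: List[Point], request: CompletionRequest,
                     predicted: List[Point]) -> List[Point]:
    """Shift predicted window-relative vertices back to raster coordinates."""
    col_min, row_min, _, _ = request.window
    capped = predicted[:request.max_vertices]
    return drawn + [(x + col_min, y + row_min) for x, y in capped]
```

Sending only a crop around the pen position, rather than the whole map, matches the "small bit of that map" described in the interview; how large that crop really is, and how the server encodes its answer, are not stated.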
So I think the example you have on your website is amazing and one reason

(14:33):
why I think it's amazing is that you're tracing a dashed line, so just to
describe this for the listeners, there's a video, yeah, a screen video I guess,
of you tracing or someone tracing a dashed line in QGIS, which is
incredible and you can see the auto complete working in you know being
ahead of the pen but I think the really incredible thing is that adjacent
to that line there is an identical line crossing over that line is

(14:57):
another identical line and yet the auto complete knows that I do want
to follow this particular one of those identical lines but to me to
the layperson this seems like an incredibly hard problem to solve.
So I think early on I underestimated that difficulty and I'm actually
very fortunate that my background kind of let me push through all of the

(15:20):
complexities that came with solving this problem and so I think if you
kind of follow our journey on social media it might seem as though we
built this in a couple of weeks and all of a sudden there's AI in QGIS
but really that's a total simplification of kind of this journey I've had
in understanding what map digitization really meant both at a

(15:43):
semantic level for the people that do it professionally but also at a
technical level and how you could teach a computer to truly learn the
semantic features that these people are working with and can actually
auto complete ahead of them and so to kind of dive into what is
surprising about this I feel like I actually have a lot of you

(16:05):
know as someone who's literally looked at probably a thousand different
maps over the course of building this AI autocomplete and I've
actually run nearly 2000 machine learning experiments to get to
this current AI autocomplete and experiment one I promise you did
not perform well at all and if you are actually one of our users and you

(16:28):
download QGIS, download our plugin and try it out you are actually
running model 1,854 and the 1,853 models that came before that were
bad in many ways but the main important thing is that they weren't
able to generalize into the map that you're looking at I've realized

(16:50):
that lines have so much semantic meaning that allows you to
disambiguate between two lines say when they intersect and
encoding that into the model has been pretty much the most important
task that I've completed. So hats off to you for continuing. I
think if I got a no, what did you say, 800 or what, 1,800 times I would

(17:13):
be tempted to give up but you didn't. And finally you got a yes
but my guess is also that I guess you were like during that process you
could see that you were getting better and better and better and further
closer and closer I should say to the goal so my guess is you had
encouragement along the way it wasn't like hard no hard no hard no right
that many times and then yes so but anyway that is incredible

(17:35):
well done, well done, hats off to you. But when I think about this, so
digitizing lines is very impressive but it's a step in the in the
journey right right. Georeferencing is another part of that great
way. Now I've got my lines not to not to throw any cold water not
to rain on your parade or anything but now I've got my lines you've
done a great job of that I've got my lines now they need to land
somewhere specific on the real world in order to be useful to me and

(17:59):
then I need to extract some metadata about those as well. Is
there any part of this, the model that you built today, the experience
that you've had, the learnings along the way, that overlap with
it, with maybe the next step of georeferencing? That's a great
so I've been working simultaneously on both of these models so
the AI autocomplete for vectorization is something that has

(18:24):
kind of crested this threshold of usefulness for most of its
time it was not useful at all and I also have a machine
learning model on my computer that can georeference maps, as-built
documents from architecture and construction firms, you
name it, automatically, but it hasn't yet crested this threshold

(18:48):
where it can actually save someone time and it's interesting
because these two problems while they're both grounded in maps
they're actually totally different because the AI autocomplete is
really about understanding the semantic nature of maps and
the features that are depicted on them whereas georeferencing is
actually a search problem when you're presented with a map that

(19:13):
you don't know where it came from especially this is extremely
common for consultants if you are basically asked to
georeference this map to a sub-meter accuracy you often don't know
actually where it is at all and so you embark on this interesting
search whether you're using Google Maps you're using
aerial and satellite imagery, maybe OpenStreetMap, and all of these

(19:38):
other kind of data sources that can help you locate where a
particular map is and the possibilities for that are like
astronomical you know we're talking a chess game number of
possible game positions that's comparable to the number of
possible ways a map could be georeferenced to sub-meter accuracy in

(20:00):
the world right so it's actually extremely difficult especially
when the map only has a little bit of information so if the map is
just a picture of a building and two cross streets I can guarantee
you there are a thousand roads in the United States with those
street names and so what people actually do is they go through
and page through all of these possible permutations of where

(20:24):
this map could be and then they line up exactly where this building is
and so if you imagine an AI model for georeferencing it has to do
all of that plus more and so you're synthesizing all of these
different datasets as to you know all of the roads in the world
not just the US obviously all of the satellite imagery or even

(20:46):
high quality high resolution aerial imagery that you can use to
georeference an exact building outline or maybe even the curb on
a road. These are challenges that you know I've had while hand
digitizing, hand georeferencing maps, and building an AI model to
do that same thing is surprisingly difficult but I think

(21:08):
it's something that you know we'll be able to do in hopefully
three months it's interesting to hear you say surprisingly
difficult because no it doesn't surprise me at all especially
after what you just said which makes a lot of sense but the good
news is we have a bunch of filters. Normally on a map you have a name,
you might have street names you might have metadata you might

(21:30):
have some coordinates in there you might have something like
this and we can filter it down to okay I know I'm looking
in this particular town or around this address just as an example
and I guess too, not all datasets are great base maps
for digitizing against, so if you've got a, you know, PDF as-built
of some building, looking at really grainy satellite data

(21:55):
is not going to be super helpful, it's not going to get you where you
want to go, maybe OpenStreetMap would be better or another
building layer would be better where we're talking about discrete
objects as opposed to pixels and so we can do things to filter
that down at least that that's my sort of experience is that
also your experience or do you can you put some other words
around that as someone who's tried to make this model and

(22:18):
is also continuously improving it there's even more complexity
than you could imagine we've done a few projects on like
bulk georeferencing maps and one of the really interesting
parts of this challenge is things change over time and so
we've gotten basically the equivalent of a truck load of

(22:40):
PDF maps and I've been asked to georeference these
and the complexity of that is like highway names change
over time right road names change over time buildings disappear
and appear, and rivers, like, change
direction as they flow and so when you're being asked to
locate a map the map is actually a description of how it was

(23:02):
at a certain place in time and so that's another complexity
that you're also dealing with well yeah it's a really good point
to talk about that trust me I hadn't thought about it either
but you also said that you're hoping to have something working in
three months time which just seems insane, but what are your
expectations to have working in three months time, what do

(23:24):
you think it is that that will be working and how will it work
I think it's really about this threshold of human usability
for an AI autocomplete we realized that the bar was
that the autocompleted vertices are easy to delete in case
they're wrong but easy to continue from in case that they're
right so we find that a lot of our users are actually okay
with mistakes because they can instantly repair them designing

(23:49):
that into our QGIS plugin has actually been a great
boon for its usability and we're kind of looking for the same
thing with this georeferencing model. It's actually of
no cost to you to try and georeference something, acknowledge
even if you get an error message or acknowledge that the
georeference is not as accurate as you wanted it and then go in

(24:10):
there and georeference it yourself and so you can imagine a
georeferencing button in your desktop GIS software where
you load a raster and perhaps it doesn't have a CRS perhaps
the ground control points are wrong and it automatically
locates its location in the real world and aligns like
rotates that image aligns it and gives you ground control

(24:31):
points to ground it in you know in its actual location
and that experiences what we're essentially trying to
emulate with an AI georeferencer. Wow, so you would
that that's a really interesting insight that they're already
doing that which sounds incredible already doing that
would save a ton of time for people and be a massive
value add for them right and because they come into this

(24:55):
task with the expected cost of georeferencing because
georeferencing something accurately can take upwards of
20 minutes, I've had maps that go longer, but let's say
20 minutes as a good number because you know that it can take
that long the cost of actually waiting 10 seconds to have an
AI model try it first that's actually pretty good and

(25:18):
so that's kind of what I have in mind as I go about building
this model. Now if we jump back to your autocomplete
for vectorization, now that it's working,
you've crossed that threshold of usability that you keep
it does that mean that as the person's using it and creating
these results are they in a way creating labels and adding
to the model as they do it: yes this vertex was

(25:41):
right, yes this vertex was, no that's wrong over here,
are they improving the model over time so not by default
so our plugin doesn't collect any unnecessary telemetry
so we obviously know when you've requested a like
an auto complete but we don't track whether or not you
keep that result we don't track whether or not you delete

(26:02):
the resulting completion and we don't track whether
or not you cut that completion at a particular point all
these are kind of potential outcomes that could happen
in the plugin and so we don't track any telemetry as to those
outcomes and so we don't automatically add any
maps to our training data set we do occasionally have
conversations with our users where they request that we

(26:25):
add their maps to our training set and on occasion
we will go and have them hand digitized, we have a GIS
staff that helps us with that but by default that's not happening
is that in the interests of privacy or is it because
that data would not add any you know particular value
to to the auto complete model in the background it is

(26:46):
definitely in the interest of privacy yeah let's say
if, okay, so what you're saying here is that it
would be better for the model, we could make a better model if we
collected that data, but in the interest of privacy
you don't. I realize earlier on you said that these are
two different problems to solve the auto complete
is a you know a semantic problem and the the georeferencing is a
search problem assuming that you know we didn't have to think

(27:09):
about privacy issues here if you also collected data for
that the georeferencing that people were doing could you
also improve that over time I think you could I
think as with most ML models that are kind of deployed
on the internet seeing how it's actually being used in the real
world is valuable information but at least for us we found

(27:30):
that we can collect really high quality data without
having to go into our users data and I think that's a great
great that we've kind of found ourselves in in a great
position to be in yeah absolutely so you sound incredibly
talented and I'm an optimist so let's say in the next few
hours when you've solved both these problems perfectly

(27:53):
and you're ready to move on to one of the next steps in the
process which is adding metadata now that you've solved
the the auto complete problem you've got a model that
works 100% of the time you've got a georeferencing model that
works 100% of the time how will you start work on collecting
metadata what will that look like what what will be the steps
involved in the potential output the metadata is really

(28:14):
interesting because I think it brings the semantic
understanding of the map that someone is digitizing to a
totally new level because when you see people extracting all
the relevant metadata for a particular project really it's
not at the point where you're ingesting metadata it's

(28:35):
not about a particular map it's how the map fits into your
overall project and the goal that you're trying to accomplish
and so if you are doing construction in an area and you want to
have a complete understanding of all the buried utility lines
such that in the in the case that you need to actually dig
around these and what's called daylight them which is when

(28:56):
you dig exactly around them such that you can see them with
your own eyes all of that metadata that would help you
accomplish that day lighting process is actually your goal and
the metadata extraction is just how you get there and so I
think this is really where the application of multi-modal
models can really accelerate this process it's because

(29:19):
when we combine the intelligence that we can generate
about a map along with the visual and textual data
understanding that an LLM can provide it's only with
that larger context that metadata extraction begins to
make sense and so if you go in to GPT-4 with vision or some of

(29:42):
the other multi-modal models that are available right now
and ask it to describe a particular feature in the image
it is totally incapable of doing that and that's really not
surprising this data is not in these models training data sets
and so it's not surprising that by default you can't get
this kind of behavior from these models that being said

(30:04):
it's something that I know is possible and so I'm really
excited to be a part of that future of making it.
Yeah I guess what I'm trying to understand is like
what kinds of attributes would you be able to collect?
What kinds of metadata?
Because for me it would make sense that maybe you could
calculate the distance of that line just as an example.
Maybe you could look at the color of the line that you have

(30:26):
just auto completed and say well I know that that matches
to this thing over here in the legend of my map
if that's things and then there might be some data associated
with that it might say water pipe, sewage pipe, I don't know
gas pipe, that kind of thing that would be valuable data to have in there.
Maybe I've digitized other lines maybe it could say
well this pipe is so close to the other pipe

(30:48):
maybe you could start building up this database based on the metadata
that you're collecting about each object based on its physical characteristics
based on whatever else is in the map.
That is the way I would think of it but I'm sure you have other ideas.
So in terms of metadata that a user could probably extract
from these maps automatically a really simple example is data on a legend

(31:12):
but that's not really the exciting applications.
Data on a legend is like pretty easy to extract
and so it's not really the level where you're saving a GIS analyst
or a GIS technician a lot of time.
It's really about metadata that's hidden in these maps
and so I'm going to continue with this subsurface utility example
even though it's kind of misleading.

(31:33):
If you were to daylight buried utility lines you would do a more
it would be a more complicated process to actually do that
and you wouldn't do that with just old raster images, raster maps
but hypothetically really important metadata to know
if you were daylighting a particular buried utility line is actually the depth of that line
and also the trait of the soil that it's buried in.

(31:58):
These are kind of important things for the construction crews to know
if they were to be delicately digging around a gas line
such that it wouldn't have a dangerous utility strike
and so that's kind of a good example in the utility space.
We also have our users outside of architecture engineering construction.
Another good example is actually mining and geology

(32:20):
and so if you are a geologist and you are dealing with kind of an older geologic map
that describes the deposits in an area.
The metadata associated with those deposits can be pretty complex
especially when the legend associated with that.

(32:41):
It's not even about describing a solid color region of that map
but rather matching the exact stylistic look of that polygon
to a tiny little bit in the legend.
That kind of says this is a sediment deposit of some type
and so it's metadata like that that we would want to extract automatically

(33:04):
and actually embed into shapefiles or GeoPackages such that these projects can
be accomplished more quickly.
Well, it looks like you have got your hands full for the next little while.
More generally, are you surprised at the lack of AI being used in QGIS, for example,
or GIS tools and especially when we think of desktop tools?

(33:26):
For me, this is one of the few examples that I've seen out there in the world
where it's being embedded into the actual tool itself, into the GIS tool itself
but maybe you've seen more, maybe you've seen less.
I don't know but I'd really like to hear your thoughts about that.
As an ML oriented person, I think there's a big gap between what is possible

(33:47):
in desktop GIS software and what is already happening.
So when people see large language models like ChatGPT,
GPT-4, and a lot of the other major developments that have been happening
especially in terms of video generation, audio processing,
it's clear that pretty much all complex problems

(34:08):
will eventually be solved by artificial intelligence until it relates to scarcity.
I think unless you're talking about material scarcity that is like literally the amount of food
that's available on Earth, a lot of these problems can and will be solved by AI and ML.
And I think it's only because we're at the beginning of this renaissance in machine learning

(34:32):
that we see attention to it but everyone's talking about it and nobody is implementing it.
I think that's the best way to kind of imagine how this renaissance is happening right now.
Obviously OpenAI and these larger AI research companies are driving most of the innovation
in terms of where AI is going.
Meta being actually an extremely great example who created Segment Anything

(34:55):
and is releasing a lot of this groundbreaking research publicly.
But I think most of the opportunity that exists in machine learning today
is actually in building domain-specific models and embedding them into professionals' workflows.
So I think GIS is a really great example of this.
You can't take an LLM even if it's memorized the entirety of the QGIS documentation

(35:23):
plus all mentions of GIS online ever.
That LLM will often not immediately accelerate a professional's workflow.
But once you understand that these large models are doing much more than generating text,
they're actually generating semantic understanding of what you are asking it to do.
And then completing that task as a result.

(35:46):
Once you see that semantic understanding is actually what's being accomplished here,
you can embed semantic understanding into all programs.
And that program can understand your overall task just as much as you do.
And I think we will see that trend more generally within GIS, within CAD,
within all of these kind of desktop softwares.

(36:07):
And it's not just about text completion.
It's really about augmentation.
If you weren't working on this problem of autocomplete for digitization, georeferencing,
automatic extraction of metadata,
and thinking again about working within and on top of the QGIS platform
and solving problems for geospatial professionals
that were using this as the main tool, what problem would you work on?

(36:31):
That's a great question.
So pretty much all of my work so far has been around mapping the built world,
or even the natural world.
But an interesting opportunity for the geospatial world to move towards
is less about mapping its current state,
but more about imagining what's possible.

(36:54):
And there's a couple of examples of startups actually doing this,
where you take your conception of how the world is now
and pretty much synthesize a way that the world could look in the future
and evaluate whether or not that version is better.
There are some really great advancements in how people are doing this

(37:15):
to, for example, make the world a greener place,
because geospatial software can literally find the optimal place to put a wind farm
or a solar farm.
Once you consider all the complexity that's associated with that.
But I think it goes much deeper than just real estate development.

(37:36):
It's actually about evaluating ways on a map
that the world is a bad place and changing that in the future.
And that's really magical.
A fun example of that is actually when you map out the street trees in a neighborhood.
This is especially true in the United States,
where you have kind of large roads and not much shade.

(38:00):
They don't really encourage pedestrian-friendly neighborhoods.
But you can actually use machine learning and statistics on geospatial data
to find the best spot to plant large trees
and create more shade, create a better spot to walk around.
And basically improve the nearby residents' lives.
So I think if I wasn't building AI autocomplete for this georeferencing,

(38:25):
vectorizing and metadata extraction workflow,
I would be working on changing what's actually in the map.
That was a great answer.
I think I'm going to need a few minutes to walk around the house after this interview
and think about that.
I appreciate it. You've really given me food for thought.
This is also probably a great time to wrap up the conversation.
And thank you very much for your time.
It's much appreciated. And thank you very much for your work.

(38:47):
I think it's fascinating. I think it's really interesting.
And above all, I think it's going to be incredibly helpful to a lot of people.
So if people want to check out what you're doing,
where is the best place they can go?
Can they reach out to you? Is there a website?
What, where can we send them?
Yeah, thank you, Daniel.
If you're interested in seeing what we're up to,
you can go to our website, which is bunting labs, B-U-N-T,

(39:10):
I-N-G, labs.com. And you can also follow us on Twitter at the same handle.
Yeah, thank you so much for the opportunity to share what we're up to.
No problem. Any time. I appreciate your work.
And I wish you all the best in the future.
Cheers.
Thanks.