Episode Transcript
Available transcripts are automatically generated. Complete accuracy is not guaranteed.
(00:01):
Welcome to another episode of the Mapscaping Podcast. My name is Daniel and
this is a podcast for the geospatial community.
My guest today is Nate Dugan, founder and CTO of Flai, which specializes in processing LiDAR data.
And in this episode, we're going to cover the basics of LiDAR data and its applications,
differences between LiDAR and photogrammetry, the processing chain of LiDAR
(00:24):
data, challenges in classifying point clouds, and a bunch of other things.
But before we get started, it's worth mentioning that I ask all companies that
are featured on the podcast to cover the costs of producing their episodes.
Some say no, and others, like Flai, say yes. And companies like Flai that agree
to contribute are not just making their episode possible, they're making every episode possible.
(00:46):
So thank you very much, Flai. I really appreciate your support.
Hi Nate, welcome to the podcast. Today we're going to be talking a lot about LiDAR.
You have a company called Flai, I'll spell that for people,
F-L-A-I dot AI, and you do a ton of LiDAR processing, and you do something a
(01:07):
little bit different with LiDAR that I want to talk about later on.
But before we get into all that, could you please introduce yourself to the listeners?
Maybe tell us about what your title is at Flai, how you got involved in LiDAR
processing, and we'll go from there. Thank you, Daniel, for having me.
I'm Nate, I'm CTO and founder at Flai. We are a Slovenian-based startup company
(01:30):
that spun out from Flycom Technologies,
which is an aerial data acquisition company that does a lot of remote sensing
and LiDAR scanning jobs in the area.
And actually, we started with optimizing their internal processing flows for LiDAR processing.
And in doing so, we decided to basically spin out and create our own company
(01:54):
that focuses solely on classification and processing of point clouds using AI.
Okay, so I've been talking about LiDAR up until now. Is that a mistake?
You just said point clouds. Do you also process point clouds that might come from photogrammetry?
Yeah, so, okay. So our technology basically is able to process any kind of point
(02:14):
cloud data from photogrammetry to bathymetry and also other sources of point clouds.
Although we do focus mainly on LiDAR point clouds.
So in a sense, they're quite similar, but there are some differences between
different sources of point clouds.
Yes, I'm glad you mentioned that because that's a great lead-in.
(02:35):
I really want to start at the bottom and talk about LiDAR data.
What is it? This is going to seem a little bit mundane for some of the listeners,
but I think it's important to have a foundation we can build the conversation on.
So if I ask you the very simple question of what is LiDAR data,
perhaps we could start there.
We could get a definition of it. but you could explain a few things about it,
(02:56):
and then we'll move off and perhaps talk about some of these different sensors,
some of these different sources of point clouds data and how you're processing it.
Yeah, that sounds cool. The main difference, let's say,
between LiDAR and photogrammetry is that LiDAR is an active sensor that is emitting
its own source of light and then capturing how many photons are bouncing back to the sensor.
(03:19):
And this has a few quite important implications for the resulting point cloud.
And those are that we do get the penetration through the canopy as the light
that is emitted from the sensor has a footprint of a few centimeters depending
on the sensor and the elevation from which the sensor is capturing.
(03:43):
And that footprint gets basically scattered or penetrates through the tree canopy
and bounces back also from the ground points.
And in doing so, basically, we are getting points and data points also inside
the tree canopy, which makes it a really good sensor for mapping vegetation or terrain.
(04:03):
And the other thing is, as it's the active sensor, we are always getting the intensity.
With each individual point, which is basically a measure of how many photons
have bounced back from each individual echo.
Just quickly, how is this different from, like you mentioned active sensor a
couple of times there, so and I think a lot of the listeners will understand
(04:25):
that photogrammetry taking an image is not an active sensor, it's a passive sensor.
Could you explain some of the differences about a point cloud
that's derived from an active sensor as opposed
to a passive sensor? The difference between active
and passive sensors in general is that active
sensors do have their own source of energy,
(04:47):
in this case a laser beam, and in photogrammetry, when taking images, we are basically
relying on other sources of light, mainly this is our sun, and this has implications
in that the light conditions are changing during the day,
depending on how high the sun is up, what is the cloud coverage, etc.
(05:11):
And this then also translates to the images.
So that's one difference. So all that radiometric information, compared between
photogrammetry and LiDAR, is in LiDAR always, let's say, the same, as we know
what the source is and what the power of the source is.
This is true for the same sensor with the same setting captured at the same elevation.
(05:34):
And the other thing, which is quite an important difference between LiDAR and photogrammetry,
is also that in photogrammetry basically we are reconstructing 3D surfaces out
of multiple images and therefore some thin objects or let's say wires in power
line mapping, can be hard to detect,
and usually there are not a lot of points on such thin surfaces, whereas in LiDAR
(06:01):
datasets you are capturing wires quite well in power line applications, for example.
I heard you talk about footprints just before. Should I be thinking of LiDAR data as a beam of light,
like a cone of light that is being sent out from a sensor and imagining this
footprint becoming bigger the further I get away from the objects? Yeah, that's correct.
(06:25):
So although we imagine lasers being quite narrow and focused,
in fact, there is still some divergence in each laser beam, resulting in a larger
and larger laser footprint the farther away from the sensor we go.
So if we are looking at, let's say,
(06:45):
scanning at 1,000 meters above ground level, then, depending again a bit on
which sensor we are using, we can roughly get footprints of about 10 or 15 centimeters, something like that.
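To make the footprint arithmetic concrete, here is a minimal sketch using the small-angle approximation (footprint is roughly range times divergence). The divergence value is an assumed, sensor-dependent figure chosen to match the rough numbers mentioned here, not a spec for any particular sensor.

```python
# Minimal sketch: approximate laser footprint diameter from beam divergence.
# The divergence below is an assumed, sensor-dependent value.

def footprint_diameter_m(altitude_m: float, divergence_mrad: float = 0.15) -> float:
    """Small-angle approximation: the footprint grows linearly with range."""
    return altitude_m * divergence_mrad * 1e-3

for agl in (100, 500, 1000, 2000):
    print(f"{agl:>5} m AGL -> ~{footprint_diameter_m(agl) * 100:.0f} cm footprint")
```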
And maybe you could say a little bit about some of the
different wavelengths that are being used in LiDARs.
So I have a very limited understanding of this, so any help you could give me
(07:07):
would be great. But I know that there are different wavelengths.
Perhaps you could talk a little bit about what they are and the differences between them.
So most commonly, there is a near-infrared wavelength that is being used for topographic mapping.
But then there are also a lot of applications in bathymetric LiDAR,
(07:27):
so LiDAR that penetrates the sea surface or the water, and water, unfortunately,
absorbs quite a lot of red wavelengths, so red laser beams aren't suitable
for bathymetric LiDAR.
So therefore, green laser beams are usually used.
In terms of these different wavelengths, so if I have a point cloud created
(07:51):
by a red laser and a green laser.
I'm sure it's more nuanced than this, but if you just bear with me for a while,
does that have any sort of knock-on effects when I think about processing that
data later on, or is it all the same? It's all X, Y, Z plus intensity.
Yes, okay. In terms of X, Y, Z, it should look the same.
Where is the difference? It's the value of the intensity. So the value of intensity
(08:13):
changes depending on a few factors.
One is the surface that it's hitting. Where we are having different sources,
different wavelengths,
those different wavelengths are absorbed differently by different surfaces, so
we are getting back different intensities from different objects.
(08:34):
And if our, let's say, classification algorithm or other processing steps are
heavily relying on intensity to do the classification, this can have an effect
that has to be taken into account.
Okay, and just to clarify, intensity, we're talking about the amount or the
number of photons that are being reflected back into the sensor?
(08:57):
Yeah, that's correct. So firstly, thank you very much for that.
I think that's a good sort of general overview of LiDAR. I realize we could
go in a lot of different directions from here, but I want to sort of move on
now that we understand a little bit more about LiDAR itself in terms of how it works.
I thought it'd be interesting to talk a little bit about the different kinds
of sensors. So a wee while ago on the podcast, we had someone from the GEDI project.
(09:18):
So this is a LiDAR in space. In fact, it's attached to the bottom of the International Space Station.
I've talked to people before that were doing exactly what you were talking
about before, they were using bathymetric LiDAR.
So from aerial platforms, I've talked to people that have been running LiDAR
sensors from drones, handheld devices, and even phones.
In terms of processing LiDAR data and doing the kinds of classifications that
(09:41):
you do on LiDAR data, is there any difference between these different platforms?
Let's say if we go from the smaller scale to the larger scale,
which is the LiDAR in space and the program you mentioned, GEDI.
So the GEDI mission was set up to do vegetation or biomass mapping across the globe.
(10:01):
And as we were speaking before about the divergence of laser beam and laser
footprint, if you put a laser in space, the footprint of each individual laser
beam would be in tens of meters.
And that results in quite a different point cloud, if we can even call that a point cloud.
(10:22):
So therefore, the datasets from the GEDI program wouldn't be directly,
let's say, suitable for our processing algorithms.
But then if we come down closer to the earth, to airborne, to handheld sensors,
there are some differences between the different platforms that are carrying the
LiDAR sensor, and those are translated into the dataset, the point cloud itself.
(10:46):
But in terms of processing, we are basically processing all those kinds of datasets.
But what usually differs are the use cases that different
applications are focused on.
So if you have an airborne sensor on a large plane,
then usually you are doing large nationwide topographic mapping and you're interested
(11:10):
in different categories than if you are having a handheld sensor moving through
the forest or through an indoor environment and trying to classify out trees or furniture.
So the main difference is then in the end use case.
Okay, that makes a lot of sense. One thing we didn't mention was phones.
(11:32):
I realize they're probably not producing the volumes of data that you're used to working with.
But just out of curiosity, are you doing any sort of work for people that are
wanting to process LiDAR data coming from phones?
Actually, we did some experimental work. So more and more phones and tablets
are having LiDAR sensors.
(11:53):
All of those sensors are quite short-ranged and a bit noisy.
They're mainly used for mapping or 3D modeling of some small-scale objects.
We did some classification and filtering with those, but that's not the majority of work we do.
So let's talk about some of the work that you do. Maybe you could start by describing
(12:14):
what does the general sort of processing chain look like?
So we capture this data, and then what do you do with it? And the reason why
I guess I asked this question is curiosity.
And also, I was talking to Howard Butler.
I'll put a link to that in the show notes later on. And he was saying 80% of
all work with LiDAR data is filtering data.
(12:34):
So my guess is there's a fair bit of filtering that happens.
But apart from that, I know very little about the processing chain.
But perhaps you could help me with that. Yeah, sure.
Basically, the LiDAR projects actually start with the specification from a client
that they want to capture a certain area with a certain point density.
And then what data acquisition companies have to do is basically to plan out
(12:57):
the flying mission, either with plane or with UAV or helicopter,
to meet the client's demand.
And during the capturing, they're doing laser scanning, capturing position with
a GNSS system, and also having an inertial measurement unit inside the airplane or on
the platform to measure all the offsets.
(13:18):
And then when the data is captured, the first thing is to calculate trajectories of the plane.
So basically where the plane or the platform was moving, or more precisely where
the LiDAR sensor on the platform was at each individual point in time.
After that, the point cloud is generated from the raw measurements of angles and time
(13:42):
of flight of the beam, and the point cloud is georeferenced.
Then the next step: we usually cannot cover the whole area with a single
strip, so multiple strips are flown.
So you're basically flying up and down over the region, and because the calibration
(14:03):
of the sensor cannot be perfect, there are always some misalignments.
Two flight strips that have a bit of overlap do not match perfectly one on the other.
So therefore the next step in the processing chain is
usually matching of flight strips, so
they are basically put together so that we
(14:26):
do not have little offsets between flight strips. And then, when we have a georeferenced
and matched point cloud, the fun part begins, as far as I'm concerned, where
we start with filtering, classification, and extraction of meaningful information out of the point cloud,
as usually the point cloud is just a lot of points representing the surface of the earth.
(14:53):
And also, as the laser is going through the atmosphere, it can hit
some particles in the air, resulting in noisy points in the atmosphere.
So in terms of the classification and filtering, we are basically removing or
labeling all those points into some
meaningful semantic categories such as ground, vegetation, buildings,
(15:18):
high noise points, low noise points.
And then when we have each individual point in the point cloud labeled with
the corresponding category, we can, let's say, come to the last step of processing,
which is then generation of final products, vectors or rasters.
These may be digital terrain models, footprints of buildings, and similar.
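For listeners who want to try the later steps of that chain themselves, here is a minimal sketch using the open-source PDAL library and its Python bindings: statistical noise filtering, ground classification with SMRF, and rasterizing the ground points into a DTM. The stage choices, parameters, and file names are illustrative assumptions, not the pipeline described by the guest.

```python
# Hedged sketch of noise filtering, ground classification, and DTM generation
# with PDAL (file names are placeholders).
import pdal

pipeline_json = """
[
    "strips_matched.laz",
    { "type": "filters.outlier", "method": "statistical",
      "mean_k": 8, "multiplier": 2.5 },
    { "type": "filters.smrf" },
    { "type": "filters.range", "limits": "Classification[2:2]" },
    { "type": "writers.gdal", "filename": "dtm.tif",
      "resolution": 1.0, "output_type": "idw" }
]
"""

pipeline = pdal.Pipeline(pipeline_json)
n_points = pipeline.execute()   # runs the stages and returns the point count
print(f"Wrote dtm.tif from {n_points} ground points")
```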
(15:42):
When you're talking about the labeling each point in the point cloud,
it might be a silly question, but are you also grouping them?
Every point has a label. So if I was looking for cars, for example,
so you would label every point as a car, but would you also group those points
together in space and say, all of these points belong to this object, which is a car?
(16:04):
Yeah, so we are actually doing both things. In the geospatial community,
we usually use the classification terminology for doing this labeling part.
But if I'm speaking more from, let's say, computer vision domain,
those two problems that you're referring to are named semantic segmentation
(16:25):
and instance segmentation.
So in semantic segmentation, we are associating semantic labels with each individual
point in the point cloud.
And in instance segmentation, we are saying all those points belong to individual
instance of a car, tree, or house for that matter.
Also, if we are doing both those things in parallel in the same processing chain,
(16:49):
we also can call that panoptic segmentation.
And it actually turns out that if we are doing both things simultaneously to
decide what are the instances and what are the classifications,
we can improve the quality of results of both of those problems.
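As a minimal sketch of that distinction, assuming per-point semantic labels already exist, the instance step can be illustrated by spatially clustering the points of one class, here with DBSCAN from scikit-learn on made-up data. Real panoptic approaches solve both tasks jointly rather than sequentially like this.

```python
# Semantic labels per point, then instances by clustering points that share a label.
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(0)
# Synthetic scene: two "cars" a few metres apart plus scattered ground points.
car_a = rng.normal([0.0, 0.0, 1.0], 0.5, size=(100, 3))
car_b = rng.normal([10.0, 0.0, 1.0], 0.5, size=(100, 3))
ground = rng.uniform([-5, -5, 0], [15, 5, 0.1], size=(300, 3))
xyz = np.vstack([car_a, car_b, ground])
semantic = np.array([1] * 200 + [0] * 300)   # 1 = car, 0 = ground (semantic step)

# Instance step: cluster only the points labelled "car".
instances = DBSCAN(eps=1.5, min_samples=10).fit_predict(xyz[semantic == 1])
n_cars = len(set(instances)) - (1 if -1 in instances else 0)
print("car instances found:", n_cars)   # expect 2
```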
You talked about computer vision before there. Is there any comparison to be
(17:09):
made here in a lot of different ways?
So in terms of labeling, so when I think about people labeling,
making a training data set for a computer vision processing chain,
I think about people drawing lines around,
if we stick with a car, for example, this is a car, this is a car,
showing the car from a lot of different angles and then using this as a training
(17:31):
set to put into some perhaps deep learning model.
Is this the same thing that you're doing when you're classifying a 3D point cloud?
Yeah, so actually there has to be a lot of work put into preparing training datasets.
And to prepare a training dataset is basically to manually go over the dataset
and label each individual point in the corresponding category.
(17:55):
Usually this is done in some semi-automatic fashion so that you have some automatic
procedure that is generating the first version of classifications.
And then someone has to go manually over it and edit each individual point to
be correctly classified.
All of this can be quite difficult as one would want to have,
(18:18):
let's say, 100% accuracy in annotations, even for training datasets.
But in fact, this is practically unobtainable or prohibitively expensive.
And also, it's quite difficult to decide what is the correct label of points
on the borders between different categories.
So when the ground morphs into the bridge, for example, where is the cut-off point?
(18:44):
It's quite hard to distinguish. Yeah, yeah.
I hadn't thought of it like that, but yeah, that's a really good point.
It would be hard to find that discrete line, wouldn't it? Yeah.
So again, staying with computer vision just for a second here.
So I know that in Earth observation, in the geospatial world,
they can't just port a computer vision library that's been used,
(19:06):
you know, on images from cell phones, directly into the geospatial world.
Our data looks different. The frequency of it looks different.
The number of bands that we use, the layers that we have, you know, it looks different.
It's not, my understanding is anyway, it's not just this one-to-one port,
but they can really build off this, you know, mature library
(19:27):
of code that's out there for computer vision.
Is this the same when we think about processing 3D point clouds?
Is there the same sort of mature library of code out there that you can also
use for LiDAR data, for processing LiDAR data?
Yeah. So there are some quite significant differences between how you go about
(19:47):
processing point cloud data sets compared to imagery data sets.
And in fact, there has been quite a lot more research, focus,
and effort done in the image domain.
So therefore, the libraries in that domain, for processing images,
are more mature compared to the LiDAR point cloud segment,
(20:12):
because there is a larger domain, a larger community, doing work on it.
Also, there are a ton more training datasets available for doing
research on imagery datasets.
But nonetheless, there is also quite significant and a lot of good work and
(20:32):
research also done on the point clouds.
And as I mentioned at the beginning of this answer,
there is significant difference between processing LiDAR data compared to imagery
data in terms of latest computer vision algorithms that are based on convolutional neural networks.
(20:52):
And that is that in imagery data sets, we have structured data,
meaning that the data is organized in a grid
and we inherit the spatial relationships
basically for free from the data structure itself. In
other words, neighboring pixels in the image
are also neighboring pixels, or locations,
(21:15):
in the spatial domain. But point clouds
are basically unordered sets of points, meaning that if we look at how a point cloud
is stored on a computer hard drive, each individual point is listed
in an array, and if the records are one behind the other,
(21:36):
that doesn't guarantee us any spatial relationship between those points.
So therefore we also always have to calculate the spatial relationships for
each individual point, and this results in those algorithms being, not prohibitively expensive,
but quite a bit more computationally expensive compared to imagery datasets.
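A minimal sketch of that extra cost, assuming SciPy: in an image, a pixel's neighbours are an index offset away, while for a point cloud you first pay to build a spatial structure (here a k-d tree) before you can even ask what lies near a point.

```python
# Neighbourhood queries on an unordered point cloud require an explicit spatial index.
import numpy as np
from scipy.spatial import cKDTree

xyz = np.random.rand(100_000, 3) * 100.0   # stand-in for an unordered point cloud
tree = cKDTree(xyz)                        # the explicit cost: building the index

# Points within 1 m of the first point; in an image this would be a grid offset.
neighbour_ids = tree.query_ball_point(xyz[0], r=1.0)
print(len(neighbour_ids), "points within 1 m")
```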
(21:59):
When you put it like that, firstly, I want to say I understand what you're saying.
But you make it sound like there's no spatial indexing happening in the point cloud.
Is this the case? My understanding is that it would be stored in a,
once you process it, you store it in a data format and there will be some sort
of spatial indexing in that.
(22:20):
So, okay. Actually, if you have a raw point cloud, there is no guarantee
that spatial indexing has happened.
Although when you are doing processing, you would usually calculate something,
let's say a k-d tree, which is basically a spatial index. And also,
the new format for storing LiDAR point clouds, which is Cloud Optimized Point
(22:43):
Cloud, actually also stores the spatial indexing in the file itself,
to speed up the lookups.
And this can be quite helpful for visualization as well as for processing.
But either way, you have to do it at some point to calculate those indexes.
(23:04):
You don't get them, let's say, for free as you get them in images.
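As a hedged illustration of why an index stored in the file helps, here is a sketch that asks PDAL's COPC reader for only one spatial window of a remote file, so the rest never has to be downloaded. The URL and bounds are placeholders, and the availability of the bounds option depends on your PDAL version.

```python
# Hedged sketch: read only a window of a (hypothetical) remote COPC file.
import pdal

pipeline_json = """
[
    { "type": "readers.copc",
      "filename": "https://example.com/survey.copc.laz",
      "bounds": "([500000, 500500], [4600000, 4600500])" }
]
"""

n = pdal.Pipeline(pipeline_json).execute()
print(f"Fetched {n} points from the requested window only")
```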
Yeah, honestly, it does make a lot of sense. I just wanted to clarify that for
my own understanding. So you make it sound really hard.
This is what I'm getting from this. This is really hard.
But when I look on your website, I can see that you've obviously solved the
problem for a lot of different use cases.
Can you talk me through some of the point cloud classifiers that you've developed?
(23:28):
Yeah, so we are offering, let's say, three different pre-trained models,
which means that we have generated quite large training data sets and manually
annotated and spent quite a lot of time doing so.
And those pre-trained models are then being used for those use cases.
They cover the geospatial domain,
(23:54):
the mobile mapping domain, and also the forestry domain.
And within the scope of each of those domains, each pre-trained model can classify into
a variety of different categories.
In aggregate, currently, I think we are doing like 44 different
categories, and each client or use case demands a different set or combination of those categories.
(24:18):
So then when the real use cases are happening, we are then fine tuning or creating
or selecting the right categories for this individual use case.
And the second thing is what we can also do as we are basing all our algorithms
on machine learning, is that if a client has their own training data set,
we can also modify or retrain our pre-trained models to be suitable for that
(24:43):
specific use case that wasn't covered beforehand by us.
Okay, yeah, great. So this was one of my questions because I'm looking through
the list here and let's say, let's use tree trunks as an example.
Because right at the start of this episode, we're talking about how there's
different LiDAR wavelengths that are used.
We talked about the different kinds of sensors that are being used and how that
(25:05):
affects the point cloud.
Does this mean that I can just show up with, I've been through my forest and
I've been scanning trees and I can just give you any kind of data from any kind
of LiDAR sensor and you'll be able to detect tree trunks for me?
Yeah, so you can do scanning of your forest from a handheld device,
(25:26):
you can do it from a UAV, you can do it from an airplane.
The only thing you have to consider when doing the acquisition is that the tree trunks are visible.
So meaning, especially for airborne systems, you usually want to do those scans in the leaf-off season.
So, therefore, a larger percentage of the laser beams penetrate through
(25:49):
the canopy and bounce back from the tree trunks.
And once you have the data set and the trunks are clearly visible,
we then have a pre-trained model for forestry that will pick out each individual
trunk, do semantic segmentation on that.
And after we have those points, we are doing clustering or instance segmentation,
(26:10):
where we group all points from the same tree into individual clusters and then do vectorization
of those to get the length of each individual tree trunk.
And also, if the point density is high enough that you can see points all
over the trunk, we can also calculate the radius of a tree trunk,
(26:32):
which can then be used to calculate biomass or volume of the tree.
And the export or end delivery here is basically a shapefile with each individual
tree and associated attributes such as heights, diameters, volumes, etc.
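As a minimal sketch of that forestry post-processing, on purely synthetic data: take points already labelled as trunk, cluster them into individual trees with DBSCAN, then estimate a rough radius and height per trunk. Real deliverables would use proper circle or cylinder fitting and vectorize the results; this is only illustrative.

```python
# Trunk points -> individual trees -> rough radius and height per tree.
import numpy as np
from sklearn.cluster import DBSCAN

def fake_trunk(cx, cy, radius, n=400):
    """Synthetic points on the surface of a vertical cylinder (a stand-in trunk)."""
    ang = np.random.uniform(0, 2 * np.pi, n)
    z = np.random.uniform(0, 15, n)
    return np.c_[cx + radius * np.cos(ang), cy + radius * np.sin(ang), z]

trunk_points = np.vstack([fake_trunk(0, 0, 0.25), fake_trunk(6, 2, 0.40)])

# Instance segmentation: cluster in plan view so each trunk becomes one group.
labels = DBSCAN(eps=1.0, min_samples=20).fit_predict(trunk_points[:, :2])
for tree_id in sorted(set(labels) - {-1}):
    pts = trunk_points[labels == tree_id]
    centre = pts[:, :2].mean(axis=0)
    radius = np.mean(np.linalg.norm(pts[:, :2] - centre, axis=1))   # crude radius estimate
    height = pts[:, 2].max() - pts[:, 2].min()
    print(f"tree {tree_id}: radius ~{radius:.2f} m, height ~{height:.1f} m")
```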
Wow, that's really amazing.
(26:54):
Again, this might be a stupid question. Is this over a limited geographic area?
Does it matter how much data I show up with?
So in a sense, we can process as much as you can capture. Obviously,
there are some limitations in terms of how much storage is available in public
cloud providers, but I think we are not hitting those limits anytime soon as
(27:18):
those data centers are massive.
So therefore, yeah, we can scale up the processing and parallelize it.
We can process large data sets. We are currently processing quite a few nationwide
LiDAR scans in the region.
You mentioned cloud-optimized point clouds before. If I had a large data set,
(27:42):
would it make sense to give a company like you just access to that URL
and process it from there? Or do I have to send you a hard drive?
How would I... I guess my question is with these cloud-optimized formats,
do you actually have to have access to the data?
Does it have to sit on top of the compute or can you stream it into your processing chain?
(28:02):
That's a good question. So actually, to ingest data on our side,
we are using different options, from uploading the data directly through the web application,
but this may be impractical for larger data sets or projects.
So in that case, we can basically read or stream data that is stored
(28:23):
in some blob storage in, let's say, Cloud Optimized Point Cloud format.
But one consideration here is that for large projects, the data sizes can get quite high.
And if you are using cloud providers for processing of your data sets,
the ingress and egress fees just for moving data around can get quite high.
(28:48):
So therefore, if you already have your data sets in a particular cloud provider and a particular
data center or region, it makes sense to move the compute to that region or
center, so you don't have to move data in and out of the data center, as this
increases the cost of the processing and also affects speed, because the data has to be moved around.
(29:10):
Yeah, that makes a lot of sense. Now, I realize we're jumping around a little
bit in the conversation.
Can you tell me a little bit more about these different categories that you
can identify, that you can segment?
So again, I'm looking at lists, I'm seeing ground, vegetation,
buildings, low isolated noise, high isolated noise, water, bridges,
(29:33):
power lines, power towers.
There's a lot of different things here. A lot of it seems to be infrastructure,
which makes a lot of sense.
Can you give me an example of something that's really hard to classify using
LiDAR data? Okay, let me think a bit about that question.
So in a sense, for let's say our AI classification algorithms,
(29:54):
the rule of thumb is that if a human annotator can see and distinguish an individual object or category,
then we can also train our classifier to be able to classify it.
So in other words, if you cannot manually or visually see
(30:15):
the object, it's practically impossible also to create a training data set and
then also to teach an AI or classifier to distinguish the categories.
And also there can be some categories that can be classified differently in different contexts.
So what I'm thinking about now is, okay, we can have a parked lorry in a parking
(30:39):
lot, and in that sense we can classify it as the category other.
But then if this lorry is moving on the road, then actually it's a moving vehicle.
So the context in which an individual object is found can change its definition.
And this part can also be quite tricky.
(31:01):
And it also has to have some additional context associated to it so that we
know how to classify it correctly.
Yeah, thank you very much for clarifying that. I just want to make sure I've
understood something that you said before. So you said the rule of thumb is
if a human can see it and can label it manually, then we can train a model to do it.
(31:21):
I just want to be clear. We're talking about seeing it and labeling it in a
3D point cloud. Is that correct?
Yeah, that's correct. So one thing is that if you cannot label it,
you cannot create a training data set on which to train the AI classifier.
And the second thing is that if it's so hard to distinguish that
even a human annotator cannot distinguish it, then there is not
(31:44):
enough information in the data set that we could hope the classifier would
be able to distinguish it.
When you're classifying objects
in point clouds, do you ever use any other data set to help you out?
If I was classifying aerial imagery or satellite imagery, I might use another data set.
A really basic example might just be using OpenStreetMap, putting a layer of
(32:06):
OpenStreetMap over top of my aerial imagery and say, okay, this is a pool because
in OpenStreetMap I can see there's a pool there, yeah, this is a road,
and using that to help inform the decision.
Do you do the same thing with LiDAR data, with 3D point clouds?
So what we are doing is that we are using OpenStreetMap in a sense to filter out outliers.
(32:28):
So for example, we can do classification of power lines and maybe we have then
some false positives that are quite far away from the actual power line.
And in that sense, we can help ourselves out with OpenStreetMap data to filter out
false positives that are quite far away from actual objects in the area.
But we do not like to rely too much on those external, let's say, data sources,
(32:54):
as in some parts of the world they are not readily available or they are not
up to date, or even if we are building new infrastructure.
So a lot of time when we are processing the data, the data has been captured
because of some new development in that particular area.
And there isn't up-to-date OpenStreetMap
(33:15):
version because there is just a construction site at the time being.
So therefore, a short answer here is that we could make and we do make use of
OpenStreetMap, for example, sometimes, but we prefer to have a classifier that
is independent of those external data sources.
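A minimal sketch of that sanity check, assuming Shapely and coordinates in a projected system in metres: keep predicted power-line points only if they lie within some distance of a known OSM line. The geometry and threshold are made up, and as the guest notes, this is an optional filter rather than something the classifier depends on.

```python
# Drop predicted power-line points that are implausibly far from a mapped line.
import numpy as np
from shapely.geometry import LineString, Point

osm_power_line = LineString([(0, 0), (100, 0), (200, 50)])   # hypothetical OSM geometry
predicted_xy = np.array([[50.0, 1.5], [120.0, 12.0], [150.0, 300.0]])  # last one is far off

max_dist_m = 30.0
keep = [osm_power_line.distance(Point(x, y)) <= max_dist_m for x, y in predicted_xy]
print(keep)   # [True, True, False] -> the distant false positive is dropped
```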
Yeah, I completely understand that. I've heard of people capturing imagery at the same time as well.
(33:41):
So, you know, an extra data source that's captured at the same time as the LiDAR.
Would this help in any way when you think about classification,
having that extra amount of data, or is it just noise?
Actually, it would help as you can also then, let's say, colorize point clouds,
so basically to project all those RGB values or even near-infrared values,
(34:05):
if we have a near-infrared camera,
onto each individual point, and then use that in the classifier algorithm.
Although we are not doing that very often, as we are building,
let's say, one general pre-trained model that would work on most data sources,
or on most of the point clouds that we can get our hands on.
(34:28):
And most point cloud datasets do not have imagery associated with them.
So therefore, in building our pre-trained models, we are opting out of all other
additional information.
But for some particular use cases, it may help.
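For the cases where it does help, here is a hedged sketch of that colourisation step using PDAL's generic colorization filter, which samples bands from a co-registered raster onto each point. The file names are placeholders, and this is a generic open-source example rather than the guest's workflow.

```python
# Project RGB from an orthophoto onto each point of a LiDAR file.
import pdal

pipeline_json = """
[
    "survey.laz",
    { "type": "filters.colorization",
      "raster": "orthophoto.tif",
      "dimensions": "Red:1:256.0, Green:2:256.0, Blue:3:256.0" },
    { "type": "writers.las", "filename": "survey_rgb.las" }
]
"""

pdal.Pipeline(pipeline_json).execute()
```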
Would it make sense to create training data based on synthetic data?
(34:50):
When we think about point clouds?
Yeah, definitely. It would make sense, especially if you are trying to capture some,
let's say, new category and you have really good CAD models for that particular
object, you could basically populate a scene and generate synthetic LiDAR point clouds.
(35:10):
Also, there are quite a few projects that I'm aware of that are out there,
I think some are also open source, to generate synthetic point cloud datasets
that could be also used for training.
But one consideration here is that although this can be quite a good starting
point to generate lots and lots of data, still you are not capturing everything
(35:33):
that is happening in the real acquisition.
So in the end, you will still need some actual data sets, but it can speed up the development.
So during the conversation, I've mentioned a few of these categories that you
can label with your different models that you're using.
We've talked about vehicles, we've talked about buildings, talked about water,
(35:56):
ground, vegetation, those kinds of things.
And my guess is when I look at these different categories that you can classify,
my guess is someone showed up with a use case and said, I really want
to be able to capture power lines or power towers or vehicles,
moving vehicles, linear walls, that kind of thing.
(36:17):
Who is using this? Who is using your services?
Who are big users of LiDAR data and what are they doing with it? Yeah.
So the main use case, the one we are also coming from, is large nationwide
topographic mapping, where usually the mapping agencies of countries are
(36:39):
ordering nationwide scans, LiDAR scans, mainly for generation of digital terrain
models where the main thing is to classify the ground points out of the point cloud.
So this is one major use case, large nationwide topographic mapping.
Then the second quite extensive use case is power line mapping.
(37:03):
So in terms of power lines, there are actually two,
let's say, jobs that are being done. One is to do
the inspection of the infrastructure itself, so to map
each conductor and see if that conductor
is in an okay state. And the second thing is vegetation management, which
is getting more and more important, as there have been quite a lot of wildfires
(37:27):
caused by sparks jumping from the power line to vegetation
which was too close to the conductor. So therefore, this is also one application.
And then we also come down, okay, forestry, we already covered a bit.
And then there is also the whole virtual reality and metaverse stuff where you
(37:47):
want to basically create digital twins of the built environment for various VR applications.
And there you also want to have semantically labeled point clouds to start off
to generate digital twins. I just want to be clear.
So you're talking about power line mapping before and you're saying one of the
goals of using LiDAR for this was to figure out how close is the vegetation
(38:10):
and to manage the vegetation next to power lines, which makes a lot of sense.
But if I heard you correctly, you said another goal is to actually look at the
shape of the conductor and see if it's damaged.
Is that correct? Yeah, so one of the applications is actually that if you know
under what load the power line is, what is the ambient temperature,
(38:33):
you can calculate the sag of the conductor.
And as power line companies also know what material is used for that conductor
and how old it is, they can basically model the health status of the conductor.
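As a minimal sketch of the sag idea, using the standard parabolic approximation sag = w * S^2 / (8 * H): the conductor weight, span, and tension below are illustrative numbers, and real assessments also model how temperature and electrical load change the tension, which is exactly the extra information operators bring in.

```python
# Parabolic sag approximation for a conductor span (illustrative numbers only).
def conductor_sag_m(span_m: float, weight_n_per_m: float, horizontal_tension_n: float) -> float:
    return weight_n_per_m * span_m ** 2 / (8.0 * horizontal_tension_n)

# Example: 300 m span, ~15.5 N/m conductor, 28 kN horizontal tension.
print(f"predicted sag ~{conductor_sag_m(300, 15.5, 28_000):.1f} m")
# Comparing a prediction like this with the sag actually measured in the LiDAR
# point cloud is one way to flag conductors that behave unexpectedly.
```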
Wow, that's pretty
(38:55):
amazing. So you mentioned a lot of things there. You said nationwide scanning,
so I've recently moved from Denmark, now I'm
living in New Zealand, but this is true in Denmark and I
know it's also true in the US, where they have the, I think it's called the
3DEP program, you know, they basically scan the whole country with LiDAR.
And so digital twins, I've heard a lot of people talk about the need for accurate,
(39:16):
you know, realistic models of the real world to create these digital twins.
LiDAR makes a lot of sense there.
And one thing we talked about right at the start of the episode was space,
space-borne platforms.
I know of at least one company that's going to be launching a space-borne LiDAR soon.
Do you see this? Are you processing any sort of space-borne LiDAR data sets?
(39:37):
So for the time being, we are not currently processing any space-borne LiDAR,
but we would be more than interested to tackle that problem or challenge.
As I mentioned a bit earlier, there is a bit of a difference, or some considerations
or limitations, in space-borne LiDAR in terms of the footprint and what is then
(39:59):
the resulting point cloud because of large footprints.
But what would be really great in the LiDAR community is to get a systematic
source of LiDAR point clouds.
So, for example, what I mean by that is that currently different countries
are planning their data capturing programs, and those are, let's say, every year.
(40:23):
But what is really great in the Earth observation community is that you are
basically getting Sentinel-2 data every five days, or whatever the
revisit time is, and you're getting a systematic source of data continuously. And in the
LiDAR world this is not yet the case,
but it would be really great to have a systematic data source and then also
(40:48):
to have time series of LiDAR point clouds that could be analyzed and also that
would open a whole new potential for new use cases that can be built up on time
series of point cloud datasets.
That is, yeah, that's a really interesting way of looking at it.
I've never thought of it like that for LiDAR, but I know that the Landsat and
(41:12):
the Sentinel programs have had a humongous impact on Earth observation.
And a lot of it is because it's been a systematic source. You know, the orbit
is well known, the sensor is well known, it captures on a regular basis.
And it's open and freely available. Yeah, and it's meant that people have been
able to use it basically as a platform to build other applications and products.
(41:36):
And you're right, we don't have that for LiDAR.
Yeah, that's quite a, let's say, large opportunity for someone to fill that
gap, to start generating systematic LiDAR.
And we're really keen to see the development in this area.
There is more and more LiDAR data captured every day.
(41:57):
But with having a systematic source, this would be quite awesome.
Yeah, it really would be.
So now we're on to that section of the podcast where we're looking out towards the future.
When you think about the future of LiDAR, I think you probably just mentioned
one of the things that you would like to see, the systematic
capture of LiDAR, preferably from a space-borne platform,
(42:17):
but maybe from another platform, who knows? So that's one thing.
But when you think beyond that, do you think of just LiDAR capture as being
just more data at a higher resolution and a higher frequency?
Or do you see something else happening on the horizon?
So I think the two largest trends that are now playing out in the domain of
(42:38):
LiDAR, the first one is that the sensors are getting more and more affordable,
resulting in more data being captured, which is really great because a few years
ago it was quite expensive to do data acquisition and therefore only
large acquisition missions were carried out.
(42:58):
But now with affordable sensors and also lightweight sensors that could be mounted on UAVs,
almost everyone with mapping capabilities can afford to have a UAV
equipped with a LiDAR sensor, which is fantastic.
And also, as we mentioned, there are more and more handheld devices that are
(43:20):
also capturing the LiDAR data sets.
So the first thing is that data acquisition is getting cheaper because of more
affordable sensors and lightweight sensors. And the second thing is the automation
of point cloud processing.
So meaning AI, machine learning tools that are able to automatically classify large volumes of data.
(43:43):
And those two drivers are basically
driving the prices of point cloud projects down, improving affordability,
and in doing that, making space for new opportunities and use cases.
Yeah, yeah, no, you're the
expert, but in my uninformed opinion I
(44:04):
would definitely agree with that. It's gonna
be pretty interesting to see where it ends up, and I can see, too,
this push for these realistic versions of
the world when we think about AR, VR, digital twins. I have a hard time imagining
doing it without LiDAR, but you know, I'm not an expert in this field, this is
just what I see from where I'm sitting. I think probably now is a
(44:27):
really good time to round off the conversation and I want to do that by saying thank you very much.
This has been great. I've learned a lot about LiDAR. If people want to reach
out to you and continue this conversation or learn about what you're doing at Flai, where can they go?
Can I put a link to your website in the show notes?
Is it okay if I link to your LinkedIn profile?
Yeah, sure. They can go to our webpage, which is flai.ai.
(44:52):
They can also find me or reach out at LinkedIn.
I'm less active on Twitter, but also there. Great. Thank you again for your time.
Really, really appreciate it. Daniel, thank you for having me and it was great
discussing LiDAR with you.
(45:30):
Music.