Episode Transcript
Available transcripts are automatically generated. Complete accuracy is not guaranteed.
(00:02):
We are taking you behind the scenes of the National Institute
for Data Science and Artificial Intelligence,
with unprecedented access
to the scientists pushing boundaries and shaping our future.
This show will take you to the cutting edge and beyond.
And whether you're an expert yourself or just science
(00:22):
curious, this is the show for you.
Welcome to the Turing Podcast.
Hello, everyone, and welcome
back to the Turing Podcast, where today we'll be talking about Project Bluebird,
which was actually first featured
on the very first episode of the podcast back in 2020.
I'm your host, Ed Chalstrey, and I'm here with Dr.
Nick Pepper, Dr.
(00:43):
George De Ath, and Dr.
Mark Thomas, who respectively work across the Alan Turing Institute,
the University of Exeter, and NATS, who are the UK's
leading provider of air traffic control services.
So Nick, George and Mark, welcome to the podcast.
Hello.
Hi. So before we
go into talking about Project Bluebird, I'd like to know more about all of you.
(01:07):
So who are you, and how did you come to be doing what you're doing today?
Let's start with George.
Hi, I'm George.
I'm currently a research fellow in computer science at the University
of Exeter.
And how I got here was: I did a PhD in computer vision,
and then decided that wasn't for me and moved over to the world of
(01:29):
optimization, specifically expensive optimization, and some of the hardest
and most complicated optimization problems are the calibration of digital twins.
And so my last few postdocs have been about trying to figure out how
to learn the parameters of digital twins based on real-world data,
so that the performance of the twin accurately reflects the data.
So I've been doing that largely
based at Exeter, and sort of outside of the project,
(01:51):
my general research area is fairly closely aligned.
Well, I do a lot of work in Bayesian optimization, which is a method
of surrogate modeling for doing optimization.
Fantastic. Nick, what about you?
What's your background and how did you come to be
doing what you're doing? Sure.
So my background is in probabilistic methods and uncertainty quantification,
(02:13):
mainly for applications in aerospace.
I took my PhD
at the Department of Aeronautics at Imperial College.
After that, I
did an internship for a year at the NASA Langley Research Center,
and then off the back of that joined Project Bluebird at the Turing.
Fantastic.
And Mark, how did you come to be working at NATS?
(02:35):
So I guess my journey is a fairly long one.
I started off doing a medical degree and working as a doctor,
and then decided to transfer over.
I did a PhD in theoretical physics, right?
I worked as a postdoc in particle physics. Wow.
You don't like to let yourself off easy.
Then, I try and
jump around, yeah.
(02:58):
I then joined a startup
looking at real-time rail monitoring,
so predictive maintenance for the rail industry.
Then I joined NATS about five years ago.
I joined the sort of precursor project to the Bluebird project
they were working on then, helped a little bit with the bid,
and I've been the industry digital twin lead on Project Bluebird since it started.
(03:21):
Fantastic.
Yes, and maybe, just for the benefit of the audience listening,
maybe the first thing to do before we jump into Project Bluebird
is to think about, well, two things.
Maybe first, what is NATS? And like, how did this sort of collaboration
come about?
But then also, I guess, digital twins.
We've done a few episodes on digital twins in the podcast before, but
(03:43):
remind the audience: what's a digital twin, and why are
NATS interested in such a thing?
Okay, so what is NATS?
NATS is the UK's air traffic controller.
If an aircraft takes off and flies over the skies in the UK,
then in order to keep them safe, and make sure that they don't crash,
(04:05):
they have to follow what they're told to do by the air traffic controllers at NATS.
The company was formerly, formally, National Air Traffic Services.
Now it's just NATS,
which
employs and trains the controllers, and provides the services.
So NATS has been
looking at this for a while.
(04:28):
I work in the R&D department, the research department.
We've looked at
new tools, new developments
that we can bring in, in order to improve safety and improve efficiency.
That's been an ongoing thing over the last 20-plus years.
And what we found was that we were getting
(04:49):
to the limit of that. We've been working on incremental steps,
you know, what's the next step,
plus the little steps, little steps,
which is a sensible thing to do in a safety-critical industry.
You know, it's hard to make big changes in the safety industry,
so you have to make sure you're safe all the time.
So we've always worked on little steps, little steps, little extra tooling.
(05:10):
That's worked quite well,
but you kind of get a little bit stuck, because you need to see the whole big
picture before you can do these little steps.
So the idea with Project Bluebird was to
kind of ignore the need to get it
into operations for the next five years and just say:
what's the art of the possible in automation in air traffic control?
(05:31):
So we, we said, let's see
if we can automate as much as we can of the job of the air traffic controller.
Let's see how far we can get.
That opens up a lot of avenues, because to do that we need to get the data ready.
So a lot of the work that we've done has been taking operational logging data
(05:52):
that's really meant for operational purposes, transforming it into a form
that's useful to do research on, and building the digital twin.
So the digital twin: digital twin is a bit of a buzzword that's used quite
a lot these days, and a lot of people have slightly different definitions,
but really it's, it's a high-fidelity simulator.
(06:13):
Simulators have been around for a long time,
but what's different now is that we have a lot of data coming in.
So instead of simulating a specific, you know, simulating a rocket,
you can simulate the rocket
and inform it with data being collected from the rocket.
So you can have much more realistic simulations, and you can simulate it
in different situations.
So in terms of air traffic control, we can build a simulator of the skies
(06:36):
over the UK, but instead of just having a generic simulator,
you can tune it, you know, with given aircraft performance
on given days,
and you could run the simulator as, as what happened on Christmas Day,
you know, with the weather on Christmas Day. Right.
The storms that happened
just this weekend, where a lot of airports and things closed:
(06:56):
you know, we'll be looking at simulating that in a few weeks' time, so that we can
look at, and I'm jumping ahead to the agents here,
but we can look at performance and things in certain situations.
Nice.
So, yeah, it's interesting what you said at the start, which is that, of course,
air traffic control being a very sort of safety-critical industry,
we don't want planes falling out of the sky
(07:18):
or hitting each other or anything like that.
It's probably understandable that the methods and processes in place
currently are sort of, you know, rightly sort of resistant to any fast change.
And what you've done here is you're doing a sort of research project
that might someday actually, and I suppose you will tell me at some point
whether we've got to this, inform policy and, you know,
(07:42):
bring some of these methods that come from the research project into the real world.
But you started off with a, with a research project.
So yeah, I mean, I guess we haven't explained that yet,
if you don't mind me explaining Project Bluebird? Sure, let's go for it.
Yeah. And then I can step back and answer that question.
Yeah.
So Project Bluebird: the sort of overarching goal of Project
(08:04):
Bluebird is to try and build an AI agent, sort of a digital air traffic controller,
to control a sector of airspace in real-world shadow trials.
So really it's to build an agent that can
essentially be a digital air traffic controller.
Within that, we've got three main threads:
(08:25):
the digital twin, which we need to train the agent and to test it;
the agents themselves;
and then we've got a thread on
safe, safe, trustworthy, explainable AI,
which takes a small step back and asks those questions.
We've got kind of a fourth, unofficial thread, which is data,
(08:46):
because that turned out to be quite a big part of the work in the first year:
getting the data ready to feed into the digital twin,
to make it as realistic as possible.
Yeah, I mean, the data actually comes as sort of semi-structured log
files and things, with a lot of extra bits that have been added in by, by the system.
(09:07):
So filtering those out, and getting them into a form that can be used,
right, was a lot of work.
I think
any good research project comes across data issues at some point.
So it doesn't surprise me, especially as it's very much a real-world
problem that we're trying to solve here.
So yes, to summarize what we've got: we've got the, the simulator,
(09:28):
with this digital twin of, of the airspace, trying to be as realistic as possible.
And then, of course, yeah,
if we want the AI to be able to ever do the job of the air traffic controller,
that's the prerequisite to, you know, giving it a go at this simulator.
Yeah.
So, so then I guess I would then step back to the original question.
We're looking at building an
(09:49):
AI agent that can do the job of an air traffic controller.
Getting that, you know, to actually replace air traffic controllers
is going to be decades away, because the, the regulation doesn't exist.
You know, how do you validate it?
How do we make sure it's safe in all situations?
You'd kind of de-skill air traffic controllers, who need to take over
(10:10):
if something goes wrong. So there's a lot of issues there.
So the actual main agent
is not likely to be let out into the wild for a long time.
But what we can do is use it
as sort of a universal engine within tooling. The sort of things
that we can do in the sort of five or ten year time frame are advisory
(10:31):
tools, tools that appear on the screen, that assist the air traffic controller.
Essentially those are the things that we can actually validate.
We know we've, we've got more of the processes in place to validate those.
We already have certain types of tools, which,
it's a bit of a stretch to call them AI,
but we do have advisory tools and things.
We can extend those.
And if you've got something that can do the whole job,
(10:53):
you can have a suite of tools that comes out of that
that are interconnected, rather than individual tools.
Very interesting. Yes.
So, so this is obviously the challenge that NATS had.
So where did the Turing then come in, and how did this project
come about in the first place?
Or maybe the better question is, what are your, your two involvements,
(11:14):
Nick and George?
So I guess when the project came about,
there was no one group of people that was really qualified
to tackle this problem.
Yeah, very multifaceted. Yeah, exactly.
And the good thing about the Turing is it brings together
different groups of academics
that have different skill sets in order to tackle big problems,
which is particularly useful for companies like NATS, who want to work with a broad
(11:36):
range of skilled academics without trying to just go to one university.
So the combination of the Turing and Exeter, a few people from Cambridge,
and NATS has come together to be this giant team.
I think there's about 45, 46 of us now across everything working on this project.
So it's a huge number.
Yeah, we have quite a lot of expertise at Exeter in digital twins,
(11:57):
and so the calibration of them, to make the twin good, makes sense,
and indeed the development of
some of the methodologies
that are used within some of the agents that we're developing.
And so people at the Turing are working on different agents to Exeter,
but they're also complementary.
I see. So maybe we can talk about that and ask,
(12:18):
is there some element of competition involved in, like, who can develop
the best agent? Well, George is winning.
So yeah, like, I, I was trying to be modest,
yeah, Nick really beat me to it.
So yeah, at the moment I am developing the agent that is performing the best.
I think that's due to the approach that I'm following.
(12:40):
So, say, so, just, just to walk back for a second.
So when we say the agent that's performing best: this is the agent that's been built
by you to do this task of interacting with, and doing well in,
the air traffic control simulation. Is this somehow sort of gamified?
Is there a way of
(13:01):
getting
points, so that you can evaluate who's done best?
Yeah.
So if I step back slightly and take over from Mark
and explain the role of an air traffic controller. Go ahead.
I think that will help in talking about the automation aspects.
So the air traffic controller's job, as Mark rightly
says, is to safely get an aircraft through the air sector,
(13:22):
and air traffic controllers typically work in pairs.
So there is one air traffic controller
who issues the commands to the pilot to navigate them through the sky,
and then there's a second one who has more of a planning role,
and their role is to basically design,
decide, the boundary conditions of the sector.
So this is precisely what location the aircraft is going to exit,
and at what flight level it's going to exit,
(13:42):
to try and deconflict aircraft crossing between one sector and another.
And the role that we're primarily looking at automating,
in at least the first phase of the project, is the tactical role.
So this is more: they are designing the route
the aircraft is going to take through the sector.
And so we've been initially focusing on sort of three types of agents.
(14:03):
So there's an optimization-based agent, there's a reinforcement
learning based agent, and then there's a rules-based agent.
And the one I'm working on in particular is an optimization agent.
I see. So these are all different approaches
to tackling the same problem, which is, you know, choosing the best route.
Yes, that's right. Yeah.
So it's building, it's building a plan
(14:25):
for the aircraft.
And so you command the pilot how to fly through the sky. Right, yeah.
And so these would be,
when you say, so the virtual pilot, or the real pilot eventually?
And this is presumably all of, all of the planes that the air traffic controller
is interacting with, or is it? Yes. Yes.
(14:46):
So the idea is to have complete, complete control of the sector.
Yeah. Just to stress, this is definitely in the simulator at the moment.
And that's probably where it's going to stay,
just to take the pressure off me slightly.
But yes, I think so.
Actually.
I think another bit of terminology there, which I've heard
come up from talking to people in this project, is 'sector'.
Sector has a specific meaning, right?
(15:07):
And yes, yes.
So in the, in the UK there's the UK airspace, which is the flight information
region, the flight information region, I say, that's pretty much what I mean,
which basically is the UK region of airspace that NATS controls,
and this is divided into, basically, small
groups of polygons, and these polygons come together to form a sector.
(15:29):
There are multiple sectors
in different geographic parts of the UK that all connect together.
And so an air traffic controller sits on a specific radio frequency
and only interacts with the pilots that fly through that sector.
So that's really
what the sectors are.
That's our sector.
I tend to say they're about 100, 100 miles across.
I mean, it varies heavily,
really depending on the workload.
(15:51):
So you get much bigger ones over the North Sea, and you get quite small ones,
yeah, over London. But if you think of it as a patch maybe 100 miles across,
that's roughly right.
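To picture that structure, here is a minimal sketch of sectors as lateral polygons with altitude bands, using the shapely library. The coordinates, names, and flight-level bands are invented for illustration, not real NATS sector geometry.

```python
# Minimal sketch: sectors as lateral polygons plus altitude bands.
# All geometry, names, and flight levels here are invented illustrations,
# not real NATS sector definitions.
from typing import Optional
from shapely.geometry import Point, Polygon

SECTORS = {
    # name: (lateral polygon in lon/lat, min flight level, max flight level)
    "SECTOR_A": (Polygon([(-1.0, 51.0), (1.0, 51.0), (1.0, 52.0), (-1.0, 52.0)]), 200, 300),
    "SECTOR_B": (Polygon([(-1.0, 51.0), (1.0, 51.0), (1.0, 52.0), (-1.0, 52.0)]), 300, 600),
}

def sector_of(lon: float, lat: float, flight_level: int) -> Optional[str]:
    """Return the sector an aircraft occupies, or None if outside all of them."""
    position = Point(lon, lat)
    for name, (polygon, fl_min, fl_max) in SECTORS.items():
        if polygon.contains(position) and fl_min <= flight_level < fl_max:
            return name
    return None

print(sector_of(0.0, 51.5, 250))  # -> SECTOR_A (flight level 250 = 25,000 ft)
```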
So I'd ask: is this simulator effectively simulating
one sector at, you know, a time?
So what we have in the
(16:12):
main branch that we have at the moment: it has two sectors in it.
We wanted to have two sectors in early, because it allows us to start looking
early on at human and agent interfacing and at agent-agent interfacing.
So we've put in two sectors.
One is called London Middle sector, which is about 20
to 30,000 feet over London, and then London Upper sector,
(16:35):
which is 30,000 to 60,000 feet over London.
So we've got those two, and we've, we've looked
at that with the agent controlling one of those, or both of those.
We've also run it
with a human controlling one and the agent controlling the other one.
So two sectors is what we have at the moment.
We'd like to start looking at more multi-sector problems,
(16:56):
so that you have, you know, it's an interesting question:
if you've got ten agents for ten sectors,
or one agent controlling ten sectors, there are more things you can look at there.
So we've got development
branches at the moment that have 31 sectors in them.
We're hoping to get that merged into main by the summer.
So the idea is, you know, we start off, you start off
(17:16):
small, try and do a reasonably good job, and then expand out from there.
And in terms of doing a reasonably good job, how are the agents doing?
Are there big differences in the performance of the different methods,
and how are they doing compared to humans so far?
That's a very good question.
So the metric that we've been focusing on first,
(17:38):
and I guess this is because it's both the easiest to measure
and the one everyone really cares about the most, is safety.
So in terms of actually controlling air traffic safely, to put it simply,
it's basically not pointing aircraft at each other,
as well as not letting them get too close in the sky.
So they've got to stay more than five miles apart laterally, or 1,000
(17:59):
feet apart vertically.
So there's basically these hockey pucks through the sky that we have to navigate
so they never touch.
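To make those minima concrete, here is a minimal sketch of the hockey-puck check. Positions are simplified to flat coordinates in nautical miles; a real system would use proper geodesic distances.

```python
# Minimal sketch of the separation rule: a pair of aircraft is safe as long
# as at least one of the two minima holds (more than 5 nm apart laterally,
# or more than 1,000 ft apart vertically). Flat x/y coordinates are a
# simplification for illustration.
from dataclasses import dataclass
import math

@dataclass
class Aircraft:
    x_nm: float    # east-west position, nautical miles
    y_nm: float    # north-south position, nautical miles
    alt_ft: float  # altitude, feet

def separated(a: Aircraft, b: Aircraft,
              lateral_nm: float = 5.0, vertical_ft: float = 1000.0) -> bool:
    """True if the two 'hockey pucks' do not overlap."""
    lateral = math.hypot(a.x_nm - b.x_nm, a.y_nm - b.y_nm)
    vertical = abs(a.alt_ft - b.alt_ft)
    return lateral > lateral_nm or vertical > vertical_ft

# 3 nm apart laterally, but 2,000 ft apart vertically: still separated.
print(separated(Aircraft(0, 0, 30000), Aircraft(3, 0, 32000)))  # True
```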
So the agent on the whole is doing a very good job
in terms of safety: it's mostly, mostly safe,
which is not actually good enough,
but at this stage in the project we're quite pleased with 'mostly'. Yes.
(18:20):
Yeah. Yeah.
Makes sense. Yeah. 'Mostly', as well,
while it sounds disappointing, is actually quite, I'm quite proud
of that rating of safety. It took a lot of work to get to this point.
Yeah. Yeah.
So yeah, the, yeah, 'mostly', yeah,
we're very, we're very happy with that. Yeah.
So I mean, it might be a good chance to mention the summer trials.
(18:40):
We did, say, summer 2023.
We, we, well, it started in
2023: we finally got to a stage where we'd sorted out the data, we'd built
data pipelines, we had a big dataset, we'd built a working digital
twin, with a UI built to visualize what was happening in the digital twin.
We'd got the first sort of working agents that weren't an embarrassment,
(19:04):
and so
we had a number of weeks of trials over the summer, where we then
had these two sectors, the London Middle and London Upper sectors,
controlled by different combinations of humans and agents.
And we assessed the agents using the same
marking schemes that we assess trainee air
traffic controllers with in the training college.
(19:26):
We train people from the UK as well as abroad,
and we used the same marking scheme for the agents.
And so, you know, our main focus is safety.
If you can't do it safely,
there's no point doing anything else. On the safety,
we were very pleased: it would have failed,
yeah, but it actually didn't fail by that much.
And we did well on other things that we hadn't concentrated on yet.
(19:49):
You know, it was,
this summer was really sort of a proof of concept, full stack, you know.
It's a full-stack
look: make sure everything's working as we expected.
Yeah.
And it did very well, and George has done a lot of work already
improving on, and improving on, the
only weaknesses that were highlighted.
Yeah. Then yes.
So one interesting aspect of this, I can imagine, is that,
(20:12):
well, firstly, I think it only makes sense to focus on safety first, and then I guess,
I guess optimization would be, it would come next.
But in terms of, you know, you mentioned that there are different
sorts of machine learning approaches, and maybe a rules-based approach.
So with the rules-based approach, you know, you can imagine having full
(20:33):
understanding of how such a system makes the decisions that it's going to make.
And when it's a human air traffic controller, you can just ask them
why they did those things.
So as you get to the point where
the machine learning agents start to perform pretty well,
how, how, or how would you go about deciding
(20:53):
whether the decisions they're making are going to stay good?
You know, how can you explain how they're working, if at all?
Yeah, so this is a thread of work that we're currently engaging with,
which is, I think, very difficult, as no one else has really done it properly.
And so the three, the three main agents have different,
they have different levels of ability to explain what they're doing.
(21:17):
So like you highlight, the rules-based agent is built by just
composing human-written rules, which basically look like a load of if-then statements.
Is it safe to climb to this level?
I need to exit at this level. Yes?
Then climb to it.
So that can be interrogated: ask it, why did you climb to this level?
Because the rule told me to.
That makes sense. Yeah.
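As an illustration of that if-then style, here is a minimal, runnable sketch of a rule that can be interrogated after the fact. The specific rule, thresholds, and safety check are hypothetical stand-ins, not Project Bluebird's actual rule set.

```python
# Minimal sketch of a human-written, interrogable rule of the kind described
# above. The rule and its 10-flight-level safety margin are hypothetical.
from dataclasses import dataclass

@dataclass
class Aircraft:
    flight_level: int  # current level, e.g. 250 = 25,000 ft
    exit_level: int    # level it must leave the sector at

def safe_to_move(target: int, others: list) -> bool:
    # Stand-in safety check: no other aircraft within 10 flight levels of the
    # target. A real check would roll the trajectories forward in time.
    return all(abs(other.flight_level - target) >= 10 for other in others)

def choose_action(ac: Aircraft, others: list) -> tuple:
    """Returns (action, target level, the rule that fired)."""
    if ac.flight_level < ac.exit_level and safe_to_move(ac.exit_level, others):
        return ("CLIMB", ac.exit_level, "below exit level and climb is safe")
    if ac.flight_level > ac.exit_level and safe_to_move(ac.exit_level, others):
        return ("DESCEND", ac.exit_level, "above exit level and descent is safe")
    return ("MAINTAIN", ac.flight_level, "no safe change available")

# The third element answers "why did you climb?" by construction.
print(choose_action(Aircraft(250, 330), [Aircraft(300, 300)]))
```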
Whereas an optimization-based agent is sort of in the middle
(21:39):
on the scale of explainability, where we have a set of objectives.
We sort of measure, in some way,
the desirable characteristics of the trajectory the aircraft flies,
so we can say this is a good solution. But that isn't really sufficient
to explain why this particular solution was chosen, was chosen.
So what we did in the summer was: we plotted the plans of the aircraft
(22:00):
onto the UI, onto the visualization, and the first thing we got was, why?
The first question we asked was, why?
The first question we were asked is, why this plan?
So we thought we were being very clever,
we're showing them what the agent's going to do,
and the first thing that they said is why, which in hindsight was very obvious.
And this is actually quite a hard question to answer.
(22:21):
Yes. Yeah.
So other, other research projects in the past, in other fields, have dealt
with explainability by just saying: this is the most likely prediction,
and people learn to trust it.
But that's not enough for, say, safety-critical systems.
We have to be able to explain why the system made this decision,
and that's quite, that's a really hard question.
So in optimization, we're currently investigating ways, and in fact
(22:44):
developing ways, to sort of move towards being able to explain the decisions.
So there's some ideas involving counterfactual explanations:
showing solutions which are slightly worse
than the one we've gone for, and saying, look, we didn't choose this one or this one;
the one we've shown you is the best we can do.
And I think, we hope, that will give us some kind of explainability.
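A minimal sketch of how that counterfactual presentation can work, assuming the agent scores candidate plans with a single scalar objective. The toy plans and the weighting below are invented illustrations, not the project's real objectives.

```python
# Minimal sketch of counterfactual explanation: score candidate plans, choose
# the best, and keep the runners-up as "we considered this, but it scored
# worse by this much" evidence. The objective below is an invented stand-in.
def explain_choice(candidate_plans, objective):
    """Return the chosen plan plus scored counterfactual alternatives."""
    ranked = sorted(candidate_plans, key=objective)  # lower score = better
    best, runners_up = ranked[0], ranked[1:4]
    return {
        "chosen": best,
        "counterfactuals": [
            {"plan": plan, "worse_by": objective(plan) - objective(best)}
            for plan in runners_up
        ],
    }

# Toy plans: (route length in nm, number of instructions issued).
plans = [(102, 3), (98, 5), (110, 2), (99, 4)]
objective = lambda plan: plan[0] + 2.0 * plan[1]  # invented weighting
print(explain_choice(plans, objective))
```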
(23:04):
Yeah.
And the third kind of agent we have, the reinforcement
learning based agent, is largely a black box, unfortunately.
Right.
And this is obviously not good for explainability.
There is some initial research out there
to better explain the behavior of reinforcement learning agents,
which corresponds to what people have been doing in computer vision.
So in computer vision, people may have seen some pictures
(23:27):
where essentially they highlight the pixels that led to the decision
that the computer vision system made.
So if you're trying to classify a dog or a cat, they may highlight
different features of the dog or cat that led to that decision.
So this has just been developed in the last year for reinforcement
learning agents, which will at least give us a slight idea of saying:
our agent made this decision because of this part of the input space.
(23:51):
So it looked at another aircraft that might come into conflict with
it, and we can highlight that aircraft.
So there is a slight level of explainability,
but this is currently the state of the art.
So there's a long way to go.
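For a sense of how that input highlighting works, here is a minimal gradient-saliency sketch in PyTorch. The tiny policy network and the state layout (a few features per aircraft) are invented stand-ins for the project's agent, and gradient saliency is only one of several attribution techniques.

```python
# Minimal sketch of gradient-based saliency for an RL policy: which parts of
# the input state most influenced the chosen action. Network and state layout
# are invented for illustration.
import torch
import torch.nn as nn

n_aircraft, n_features = 5, 4                  # invented state layout
policy = nn.Sequential(                         # stand-in policy network
    nn.Linear(n_aircraft * n_features, 64), nn.ReLU(), nn.Linear(64, 3)
)

state = torch.randn(n_aircraft * n_features, requires_grad=True)
logits = policy(state)
action = logits.argmax()                        # the decision we want to explain
logits[action].backward()                       # gradient of that choice w.r.t. input

# Large gradient magnitude = that part of the input drove the decision,
# e.g. the features of one nearby aircraft.
saliency = state.grad.abs().reshape(n_aircraft, n_features)
print("most influential aircraft:", saliency.sum(dim=1).argmax().item())
```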
Yeah, it sounds like, across all these methods,
so, like, explainability in, in machine learning methods
is pretty new, is in its infancy.
(24:11):
Yes. Yes.
I'm imagining there probably aren't
any other precedents for you to look at, in
terms of, like, if you were to take, try to take an example from another safety-
critical industry where ML approaches had been applied.
I'm guessing it just doesn't exist at the moment.
Or if it does, it's not talked about. There are, but like I said,
(24:32):
it's mostly,
there's not that much explainability; they build the trust of the system.
Yeah.
So you just learn to rely on and believe in the system.
Yes, rather than the system actually being provably good,
yeah,
and, frankly, telling you why it came to a decision, which is sort of a big difference.
Explainability, I feel like people haven't tackled that much
(24:53):
in the literature, or to the extent that we need, to be able to demonstrate
to people, at the very least.
And I want to bring Nick in, just because we haven't had much of a chance yet.
So. Yeah, yeah, yeah. Sorry.
What, what is, what have you been working on on this project,
and what's been your focus, as the Turing person in the room?
So I work more on the digital twin side,
(25:16):
and I suppose my connection with what George works on
is that all of these agents that he's discussed involve
some aspect of looking ahead in time.
So they make a plan.
They want to roll out the plan and check: is it going to be safe?
Will aircraft conflict with each other?
Where would aircraft be in 10, 15 minutes?
(25:37):
So in the field, this is called trajectory prediction, or TP for short.
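As an illustration of that roll-out-and-check loop, here is a minimal sketch. The straight-line, constant-rate motion model and the plan format are invented simplifications of real trajectory prediction.

```python
# Minimal sketch of "roll out the plan and check": step each aircraft's
# predicted position forward in time and test the separation minima at every
# step. A conflict needs both minima infringed at once.
import math

def roll_out(plan, minutes=15, step_min=1.0):
    """plan: list of (x_nm, y_nm, alt_ft, vx_nm_min, vy_nm_min, climb_ft_min)."""
    conflicts = []
    for k in range(1, int(minutes / step_min) + 1):
        t = k * step_min
        positions = [(x + vx * t, y + vy * t, alt + roc * t)
                     for x, y, alt, vx, vy, roc in plan]
        for i in range(len(positions)):
            for j in range(i + 1, len(positions)):
                lateral = math.hypot(positions[i][0] - positions[j][0],
                                     positions[i][1] - positions[j][1])
                vertical = abs(positions[i][2] - positions[j][2])
                if lateral <= 5.0 and vertical <= 1000.0:
                    conflicts.append((t, i, j))  # time and pair in conflict
    return conflicts

# Two aircraft converging head-on at the same level (8 nm/min, about 480 kt).
plan = [(0, 0, 30000, 8, 0, 0), (60, 0, 30000, -8, 0, 0)]
print(roll_out(plan))  # [(4.0, 0, 1)]: a conflict four minutes ahead
```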
I'd say my research has mostly been looking at
existing TP efforts, which are based on
essentially physical principles,
equations of motion that come from flight mechanics,
and assume pilot behavior,
(25:59):
and trying to take these models, which are deterministic,
that is, for a given
set of inputs, you will always get the same output.
The trouble is that these models unfortunately aren't capable of fully describing
the kinds of variations that we see in the data that we've got.
(26:19):
So I'm trying to find a way to kind of marry
the two together, the data and the existing models,
to get something that's more accurate, which will help George's agent
and the other agents, because their predictions will be more accurate.
And that also has usefulness
when we come to simulations to try and test the agent,
(26:41):
because, with this background of probabilistic methods,
we're looking at trying to generate, essentially,
synthetic profiles of aircraft for the simulator that look plausible
and could conceivably be flown by aircraft in real life.
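To make the 'marry the two together' idea concrete, here is a minimal sketch of a deterministic, physics-style climb model augmented with a residual distribution fitted to data, so that predictions become probabilistic. The climb-rate formula, the fabricated observations, and all numbers are invented stand-ins, not BADA or the project's actual method.

```python
# Minimal sketch: deterministic physics-style model + residual noise fitted
# from data = probabilistic trajectory prediction. Everything here is an
# invented stand-in for illustration.
import numpy as np

def physics_climb_rate(flight_level):
    """Deterministic stand-in model: climb performance decays with altitude."""
    return 2500.0 * np.exp(-flight_level / 400.0)  # ft/min

# Fit a lognormal residual between "observed" climb rates and the model.
# Real work would use recorded operational data; we fabricate it here.
rng = np.random.default_rng(0)
observed_fl = rng.uniform(100, 350, size=500)
observed_rate = physics_climb_rate(observed_fl) * rng.lognormal(0.0, 0.15, 500)
log_residual = np.log(observed_rate / physics_climb_rate(observed_fl))
mu, sigma = log_residual.mean(), log_residual.std()

def sample_climb_rate(flight_level, n=1):
    """Probabilistic prediction: physics model times a fitted noise term."""
    return physics_climb_rate(flight_level) * rng.lognormal(mu, sigma, n)

print(sample_climb_rate(250, n=5))  # five plausible climb rates at FL250
```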
And is the reason for that,
you mentioned the data earlier,
(27:02):
is it because getting hold of this real, real-world data is just a challenge,
or there's not enough of it? So I'm not sure we're
quite answering your question, but let me have a go at it.
And so, you know, what we built initially was a digital twin.
What we need is a probabilistic digital twin.
And Nick's work has been building the engine,
sort of the hard bit, of building a probabilistic digital twin.
(27:25):
And the reason we need that, Nick's sort of already said it, but
an agent,
especially, as Nick can relate, a reinforcement learning agent:
if the aircraft does the same thing every single time,
then it will learn its actions based on the fact that the aircraft
always does the same thing.
But if you tell an aircraft it has to climb from 20,000 to 30,000 feet,
(27:47):
and you chose 100 aircraft, the same aircraft type, all of them Ryanair, whatever,
you'd get a distribution, because they depend on the weather
on the day, the amount of fuel, the pilot, the procedures on the day.
So you actually don't know how quickly they'll climb or descend, or various
other behaviors.
You don't know exactly what it's going to do.
(28:07):
So we can't have a digital twin that's, that's deterministic and
does the same every time, which is what the BADA, the Base of Aircraft
Data model, is.
I think that model is the industry standard.
It's a physics-based one.
If you, if we use that, and that is what we used initially,
it means that
the agents may do well, but it won't generalize.
(28:29):
What we need to do is be able to say, right,
we'll run the scenario, and every time it'll have different variations,
but we want those to be from a distribution of real days.
And is it, is it too,
too much to simulate weather conditions as well, or is that,
no, we're also bringing in the weather conditions.
So yeah, yeah.
(28:50):
So we have, we have access to Met Office data,
yes, relevant to that,
for the project, so we can
use the Met Office data to add that kind of weather fidelity to the digital twin.
Yeah.
As researchers, we're going,
(29:13):
we've
got, I mean, what we're trying to do is, the whole thing is kind of a stack.
You've got the data, yeah, you've got a digital twin, you've got the agent.
If you wait until you've done the digital twin
perfectly, the agents can't start working till year five.
So, you know, first of all, we built a non-probabilistic digital twin.
Whilst Nick is working on the theoretical underpinnings
(29:35):
to bring in the probabilistic aspects, we've added in simplified
probabilistic aspects. It's quite easy to make things probabilistic with a
Poisson distribution and things, but it's not very realistic.
So we've kind of stepped up the level of complexity and the level of realism.
So every six months we increase the complexity.
Does that answer the question? No, yeah, definitely.
(29:57):
Well, I guess, as that level of complexity has increased,
the sort of broad categories of methods that you've been looking at,
have they stayed the same? Always?
Have some gone out of the window as you've increased the realism?
I think they've stayed the same.
Really?
Yeah.
We, we, we knew where we wanted to be in year five, roughly.
So when we started working on a marginally
(30:20):
simpler problem, we still took that, took that into account.
So we didn't try things we knew would absolutely fail by year two.
Although you'd say that the rules-based system,
that will have sort of an upper limit on its performance.
Yeah.
So, well, like I said, it's very explainable, but it's really hard to write
sufficiently complex rules to describe the behavior,
(30:43):
the actions, that the air traffic controllers ask pilots to perform.
So there will probably be an upper ceiling,
but at the same time, it's going to be a very nice baseline,
because this is really what people have done in the past:
they've written really nice rules-based systems.
Right? Right.
Yeah.
And it's back to: we're not rolling this thing out as a,
you know, a full replacement for air traffic controllers,
which, again, is not what we're aiming to do in any short term.
(31:06):
But a rules-based system,
if there were limited subsets of things that could be automated, a rules-based
system we can probably, you know, get in in the 5 to 10 year period.
So it's worth looking at the limits of a rules-based system,
to understand whether or not it's always going to be too limited,
or whether it actually can do quite a lot of what we need it to.
Yeah, yeah.
(31:26):
It makes sense that you would only go for the, the potentially very useful
machine learning options, where you lose some of the explainability, only past
the point where the rules-based approach just can't get any better.
Yeah. You know, kind of theoretically, I'm, I'm, I'm
talking about something here where it should be George talking.
(31:46):
I might say things that are wrong, but,
you know, theoretically the reinforcement learning agents should be able to cope
with more complicated situations than an optimization-based agent,
and so that's why we're looking at that.
It's more black box, but it theoretically should,
should cope with very complicated situations.
(32:07):
So if people are doing things like agents for StarCraft,
you know, multiplayer games, as far as I'm aware, no one does that with
rules-based agents; they use reinforcement learning agents for exactly that reason.
It's not clear
whether or not air traffic control gets to that level of complication or not.
So we, you know, we don't know, but the best guess
(32:30):
most of us have is that it won't, because in StarCraft
you've got somebody actively working against you and trying to get around
your tactics, which, that's interesting, doesn't happen here.
Because, as I understand it, sort of the,
the DeepMind AIs that have beaten, like, Go and the other games like StarCraft,
which are based on reinforcement
(32:51):
learning, like, it's always, there it's an adversarial game.
Whereas what you're doing here is, I hesitate to say easier,
but as you said, it's not someone trying to make you lose.
Say so.
Maybe it is, it is easier. Well, again, I'm not the expert here, but
(33:11):
my understanding
is that we looked at a lot of the different,
the sort of areas that RL,
that a lot of these things have been
applied to, and
this is probably more difficult
than most of them, but not StarCraft;
those things are the examples which are more complicated.
Yeah, I mean, it depends how you define complicated, but in things like chess,
(33:33):
you know, it's discrete, it's quite well defined. In air traffic control,
the large,
one of the main challenges on the project has been the fact
that the domain is very complicated.
You know, it's something that has been built up for eighty years
in a sort of piecemeal engineering fashion: a new system here, a new rule there.
The rules are different all over the place.
There's lots of sort of seemingly arbitrary rules. So
(33:57):
there's a lot of difficulty.
It's not a nice, clean mathematical problem,
which you kind of get with, you know, chess or with Go.
It's very complex, very real world,I think.
Yeah, But George might want to corrector expand on that comment on it.
No, I think I agree
with everything you saidand I think one of the real challenges
in this project has been knowingwhat good our traffic controllers.
(34:20):
So in Starcraft you get a score that thesome feedback feedback
you're getting all the time.
Yeah.
Where reinforcement learning excels issystems where you continue to make moves
and you get a reward backor the rewards delayed,
but you can fudge itsomehow to approximate own reward.
That's how that the later methods work.
But the problem in air traffic controlis we know what safe is.
(34:42):
We can measure safety,
but there are other other formsof more hand-waving metrics
which which are usedor such as having being orderly in the way
you present your trafficto the next sector.
That's not really measurable. It's easy.
And the statementwe've got from lots of different air
traffic controllerswhen we're talking to them is
(35:02):
if you get to air traffic controllersin a room
and ask them for an answer,you'll get three answers back
because there's no best wayto solve a situation.
There are good ways,all of which are perfectly fine,
and therefore, being able to measure bestand even quantify
best is really challenging,which is where the summer sim
that we did was really helpful
because we spent a year talking to airtraffic controllers,
(35:24):
talking to other people,and that's trying to learn
how air traffic control worksand how we can turn them
more written down rules into somethingthat we can quantify and measure.
And so in the summer we had, in my case,
a set of objectives that I developedbased on talking to people
and that increasedsome behavior in the agent
(35:45):
in the way it controls traffic and talking
talking to the air traffic controllers,they go, That's good, that's not good.
And from that we can actuallywe've got a feedback system
which sort of lets us reinterpretwhat we believe they are doing
because it's really a sort of a jointcollaboration, more of an iterative, it's
a little bit more sort of ad hoc and basedon their professional experience,
(36:08):
that's it's a metric.So you can just count.
Yes, I think the problemwould be quite a bit easier
if someone could just write outa mathematical formulation of this.
Is the objectivefor good at traffic control?
Yeah, I think then we'd find itwould become something more like Starcraft
where we got a feedbackand I would be in my head.
Well, my thoughts were oncewe passed the safety threshold
(36:29):
and you were satisfied by that metricthat the next metric would be,
well, have as many planes in the skyas possible to.
But I guess maybe,maybe where you come into problems
there is if, if these agentscan be working alongside human
air traffic controllers and it'sgoing to suggest strategies that they find
confusing or counterintuitivebased on their professional experience,
(36:51):
maybe that's not good. Yep.
So there's an aspect of airtraffic control
which is essentially be politeto your neighbors.
Yeah. So he's got to leave it alone.
If you've got a human working in the nextsector, you to you expect them to
to be polite to you in the waythey present air traffic to you
in the same wayyou've got to be polite to them in the way
you try and not make their lives difficultin the way you present traffic to them.
(37:12):
Technically,you could do it however you want,
but that's obviously backfired on them.
Will get you showed off in the break room.
But yes,so that's really some of these more less
tangible aspects of controlthat we have to try and implement.
I wonder if you could tell us a bit.
We've got a train coming up, a UK,which is this year 2024, 19 20th of March.
(37:35):
You talked a bit about some ofthe successes you've had so far with the
with the digital twinsimulator and the agents themselves.
What can people expect from them?
You may have a UKor any other upcoming demos,
so alongside the
digital twin development,we've also developed a gamified version
(37:58):
of air traffic controlto demonstrate to the public what some
or to let people experiencea very stressful situation
in that gamified versionof air traffic control.
So a sector that's just shaped like an Xand they have to try and flat fly aircraft
through it, which although it looks quitesimple, could be surprisingly stressful
and say there's a scoreboardand so people can try and play fly
(38:20):
as an air traffic controller for the dayand try and appreciate what
the guys and gals do at Nats in termsof the brilliant control work they do.
So we should have,
I think, one or two positionsthat people could come and play the game
as well as a few videos of agents
controlling traffic in a simulator.
So I think it should bewe run the same sort of thing,
(38:40):
the British Science Festival last year
and the public really,really enjoyed getting involved
and asking us lots of questionsabout air traffic control.
Brilliant, brilliant.
But I point out that none of the publicmanaged to beat
George's agent on that.
So we've got betterthan the general untrained public.
Yeah. Yeah.
But we haven't yetgot back to control lots.
(39:03):
Yeah, that's, that's something, that's, I think.
Yeah.
Yeah.
How is it?
Is it an intuitive game to play?
Is it, do you really feel like it?
Yeah.
I suppose the air traffic controllers themselves see it as very similar
to what they see in a real system.
We've tried to simplify it as much as possible,
so it captures the essence of air traffic control.
So you have a sector.
(39:24):
That's what they have.
And really,
what air traffic controllers do is: they get an aircraft into their sector,
and they need to fly it back out of the sector at a certain location,
or rough location, and a certain altitude.
And so we've simplified, simplified it somewhat, and we've made it so
you can keep the inputs straight: up, down, left, right.
But you can,
(39:46):
that is the game, really.
So we've made it as simple as we can, and we've added some visualizations
that you don't get in real life, where the aircraft,
you can, you click on an aircraft,
you can see where it's heading to, and things like that.
So it's, it is,
it is, it is pretty much air traffic control
with a degree of oversimplification, but it captures the essence.
And, you know, at the British Science Festival
(40:08):
and in other situations, people have picked it up very easily.
And it's quite nice,
because it shows the idea of the digital twin: that it's modular.
And so in the digital twin, we can have it fed by the real-world data.
You can have it fed by
real weather data,
but you can also then replace certain components.
So if you want to say, well, what would happen
(40:29):
if it was twice as windy, you can double the wind.
Or, you know,
if you want to change the sector shape, you can change the sector shape.
So essentially what we've done there is taken the engine
that we've got, changed the sector, oversimplified some of the rules.
But yeah, it's nice; the feedback
we've got has been very positive on it. And can any of you beat the agent?
So not me not to,not to say okay, okay. Yes.
You're, you know, so I imagine well,are you better than
the general public maybe haven't spentas much time playing game.
I surprisingly it's very it'sgoals from the public.
Yeah.
Some people are quite like me
and can't keep track of ten aircraftat the same time but to ten them.
(41:11):
And then there was a couple of people thatyou develop any prior experience with.
Incredible.
But they theyyeah I obviously have a predisposition
to being able to track lots of objectsat the same time.
I tried to get one of them to applyto become an air traffic controller.
I'm not sure if they took me up,but I hope they did.
Well, hopefully they weren't too concernedthat their jobs are going to be
automated or that
(41:32):
that's a long way off, as you said.
Yeah, it's going to be an assistance.
Yeah, well, thinking of that.
So let's talk a bit now about whatthe next steps of the project might be.
Say yeah, what, what,what are the next steps going forward?
What's the timehorizon of, of what you've got left
with this collaboration and, and so on.
(41:52):
So yeah,
so I mean got two and a halfto three years left.
Right. So we're kind ofnearing the halfway point.
So far we've looked at two sectorsthat's allowed to start.
Yeah. The human sectors are increasing.
The sectors we want to really lookat a number of things.
(42:13):
We want to first of all improvethe performance of the agent.
So the agent mostly was safe,
which, which would fail the courseand get thrown off, of course.
So it's good,but you know, not good enough.
We'd like to get that always safe.
Always safe means that it would passthat element off the air
traffic control training course.
(42:33):
But then in terms of other things,we'd like to expand what we're looking at.
So right now
we're looking really at controlone sector and performance in one sector.
We want to continue.
That's sort of the maincore of the project.
We want to continue looking at that
improves the fidelity of the sector,improved performance of the of the agents.
But we also want to start looking at
(42:55):
multiple sectors
so you can say, well, what happensif we put 12 agents in 12 sectors or
one agent in all 12 sectors or, you know,
six agents, six humans
In terms of this simulation,
what about things like incorporating
unexpected emergencies, extreme weatherin this kind of stuff?
(43:17):
So so we want to do whathave you talked to achieve control of?
They'll say, yeah, you could probablyor might be able to do it,
but you'll never copewith extreme situations.
Storms in particularwhen there's a storm front
and essentially hailstones in the clouds,aircraft can't fly through it.
And so they will just phone up and say,I'm turning right.
They are trying toI deal with and it's a big match
(43:38):
because the skies were alreadypretty much as full as they can be.
Suddenly everyone's flying offfrom whichever direction they
I heard a news story and I don't know whatthe reason for this was about some plane
that was flying from Manchester to Dublinand got diverted to Paris.
So yeah, so that's a big weather front.
Yeah.
Essentially that the aircraft couldn'tget round. Yeah.
(43:59):
And that or there was no capacity for itto land
any into the animalthat it could get round the weather.
So I'm guessingif you're going to try and simulate
something of that levelthat's, that's many sectors across.
But yeah. I mean we were
we've
now we should have by the summerall the sectors in it.
We still might be looking first offerat the performance in a single sector
(44:21):
in that we haven't definedexactly what we want to look at first, but
the digital twin unithas all of the UK in it, all the sectors,
so you can put a big storm front
across the whole thingand, then see how it starts working
with that.
We also want to look, you know,there's things like decompression.
So if an aircraft has a decompression
(44:41):
that there was the one fairly recently in I think it was in America,
where the aircraft certainlyhas to descend very quickly
and you have to deal with that.
So these are the ones where you've got,
you know, broken radios or nonresponding aircraft.
So there's a lot of differentunexpected situations.
So we just like to explorethe performance of the agent.
(45:03):
They might be easy to handle.They might be very, very difficult.
We want toyou know, it's a basic exploratory thing.
But by, you know, over the next year,I would like to explore all these
unexpected events, stormfronts, multiple sectors.
And the final thing is Georgementioned before, we've got
air traffic controllers normallywork in pairs where you've got one who's
(45:24):
actually the tactical controllerissuing the instructions to the aircraft
and you've got to coordinate
who is looking at the best way of passingaircraft from one sector to the next.
So by having multiple sectors,it allows us to start
looking at that functionalityin terms of agents as well.
So in terms of human people, and that'sthis like a two different roles there.
(45:49):
So so you can simulate thatwith two different agents.
I'll pass over to Joel.
I'm not I wouldn't like to speculatewhat the best way to do it is.
Yeah, it might be.
But is is that that's how he isdone at the moment with the way Yeah.
Syndromes. Yeah.
But it could be that we have oneUber planner who plans the entire UK
won't go
well I was thinking there's
(46:10):
going to joining a competitionbetween sectors so we've got a
it's quite interestingif you've got one agent and one sector
who wants the best for itselfand you want Asia for another sector,
he wants the best for it.
So right then there's going to be
have to some sort of compromise,which is how it works in real life.
And modeling and optimizing that I thinkis quite tricky, both interesting problem
(46:30):
and we don't just want themto be individual entities.
They all need to be able to interactlike in the real world
with their tactical person.
So do you envisage mean Motors
looking beside the future now,perhaps beyond the end of this projects?
But if in the real world,if such an agent ever was deployed
(46:51):
as like an assistantfor air traffic controllers,
do you think it's more likelythat there would be like sort of one
master agent that all of the human airtraffic controllers are
in collaboration with,or would they all be having separate ones
for their sector, which is a bitmore sort of like the simulation
you're doing at the moment, Butor is it just too difficult to know that?
(47:11):
I think I think it's too early to tell,right?
Yeah. I mean, I think the real answer iswhatever works best.
Yeah, it's kind of a right. All right.
But yeah,I think we'll know in about two years
or we'll have some ideaabout what's possible.
Yeah, I think.
And if any of them are possible,
then we'll try and explorethose as much as we can,
because I think I mean, I thinkwhen we've discussed, we've sort of
(47:33):
we all sort of feel like one agent to doeverything is likely to perform better.
But on the other hand,
if you're doing the optimization agentthat's optimizing 700
aircraft simultaneously
as it's currently
written, doesn't scale up to 700 aircraft.
So, you know, given the technologywe've currently built,
the only possible real way to doit would be
(47:55):
one per sectoror maybe one or two sectors.
But, you know, we're very early days.
George hasn't startedlooking at multiple sectors.
No one else.
We haven't really startedlooking at multiple sectors yet.
So too early to see where we can go on it.
CO Well then, yeah, before we wrap up,maybe I'll go to each one of you and say
and what, what are your hopes and dreamsin this field?
(48:17):
You know, where do you,where do you see like in decades to come,
Is this going to be somethingthat you hope
everyone's going to be using in the airtraffic control space?
Well, it's like if everything worksperfectly, hands on.
Okay, I'll go to Nexus.
So great.
Yeah, good question.
(48:38):
I think
possibly it's the sort of chipthat if traffic levels continue
to increase at the rate that the projectssay, it will come to a point
where airspaces are just so congestedthat we can't conceive the humans forever.
Right. Right. And with this
that's looking,like you said, way off in the future.
(49:02):
So there's some truly what are.
All right.
So James,I guess from from my point of view,
first of all, just in termsof congestion,
20, 30 years ago,EDGE, I'll make up some numbers
that'll be incorrect for checks.
But, you know, 30 yearsago, advocates were controlling,
you know, six aircraft each. Yeah.
(49:24):
As more and more toolshave come available, you know, the radar
now automotive labels aircraftthey can handle
you can handle moreso now you can handle roughly 24 number.
I tend to sayand I suspect that the tooling
will allow us to handle more per human
in terms of what I hope we do here.
(49:44):
But I think that to the digitalI took control of and work
as a universal engine in the backto give you advice and tools.
And so that would bemy hope in the medium term
is that some kind of, you know, spin offor continuation from this project
will give us that the toolsthat we need to build those things on.
(50:05):
It also helps to push validationregulation.
We haven't mentioned it too much,but those things don't exist,
so we need to stop pushing those thingsforward.
Right? It's hard.
There's a lot of chat about howto regulate, how to validate these things.
Really how to do it before anythingexists.
So, you know, that is build somethingand say, how do we regulate that?
Sure.
(50:25):
So I'd like to see the regulationfurther fall within ten years time. And
Some offshoots of
this with toolssort of being powered by it, I'd like.
And I guess that that fits in a bit withwhy you're thinking about Explainability
from the start, almost. Yeah.
Yeah. So
yeah, I mean, Explainability is largely
(50:46):
so that we can make sure that
the what we need
to understand need to trust itwhen it's great actually safe.
Yeah. Yeah.
And, and yeah.
George And the other thoughtsI think my colleagues
have pretty much said everything that wasI was hoping to go first.
Before you go, I was going to sayI think so.
(51:06):
There's lots of thingsthat they've both covered in the feature
that we'd like to think about.
And I think the other thingis just as sort of a legacy of the project
in terms of we're sort the first peopleto really try and this on this scale
and this big regulatory
regulatory impact,if I could say the word,
as well as just being the first peopleto really attempt this.
(51:29):
And it's reallybecause no one else has had the data
and we sort of skirted around the factthere's been data, but not to have
an extensive, extensive amount of data.That's right.
From giving us accessto which will allow us to do this.
And that's reallywhy we're at the forefront
of what's possible at the moment.
And I mean, part of the legacyof the project might be provoking
(51:51):
the questions that future regulatorswill have to ask people
when they try and serviceautomated systems.
That's interesting because, yeah,if some someone from the private sector
just comes on and says, yeah, we've got anair traffic control system, it works us.
And they still know frominvestigative outputs, from your research
about how you can perhaps validatewhether that's true or not.
(52:15):
yeah, I think that's one of the benefitsthe project gives from an academic point
of view to that which is not so regularlyhave big companies come up
and go, We've got this latest shiny ideayou can solve all your problems.
And for Nats,they might not know whether it's possible
or whether these peoplecan actually do it by both of these are a
separate and difficult issues,
(52:36):
whereas when they're workingin an academic collaboration,
we're sort of at the forefront of whatpeople are trying to do at the moment.
So I'm not going to be armed with,look, we've spoken to the experts.
This is what they've been able to do.This is what's possible.
This is what we think we could doin the future.
It very much allows them to sanitycheck everything
they all their contractorsthat come to them with amazing products.
(52:57):
Supposedly amazing find sorry contractors
did. Just to reiterate what George said,I should have really set out
as one of the motivations for Natsat the start is that I think is probably
a problem as such in or in all industriesand I know it is in Nats.
We have a lot of startups that comeand tell us they can do things.
We have a lot of consultanciesthat tell us they can do things.
(53:19):
I personally mostly don't trust them
and so if we are at the forefront,we understand what's possible.
We can build.
People come saying they've got tools,they can do things well.
If we've got a digital twin that we candrop the tool into and say, Well,
let's see how it gets on rightand gives us it
gives us as wellas sort of more regularity with things.
(53:39):
It gives us actual concrete toolsand things that we can use to assess what
what people bring to us.
I think the key thing, though, as well isif they are the people
generating these methodsthat might be cynical,
if they hadn't had access to the datathat you have, like how could they be?
I guess that's my to say, yeah, yeah,
how do they know what they've got isand and it's really interesting point.
(54:03):
Yeah.
Is there anything else
that any of you want to mentionbefore we wrap up another project
where any of the thoughts,
where can people find outmore about it online?
We are in the process of building
a website that is is the true answer.
The best placeis probably the channel website and
there's the touring on the Nats websitehave have a bit of information on it
(54:27):
actually said that yeah a project do
and that's all wouldn'tquite set out to date
but it's rich in a while back we're trying
to we're going through the processof updating everything now
we're making an actual Bluebird website,so that will be the right place.
But at the moment it's the tuning websiteand the Nats website.
Perfect.
(54:47):
All right, well, thanks Nick, George and Mark for coming on the podcast.
Thank you very much for inviting us. Cheers.
Thanks for listening to this week's episode.
The show is hosted by me, with the podcast team: Karl Street,
Joe Dunn Gates, Christine, and Clue York.
(55:08):
Music for this podcast is produced by Jazmine Sullivan.
You can listen
and follow via the link in the description, or by searching for the Turing Institute on Instagram.