Episode Transcript
Available transcripts are automatically generated. Complete accuracy is not guaranteed.
Dr. Moira Gunn (00:11):
The frontiers of
science can only be driven by
data, while the continual emergence of new technology
expands what data can be collected and therefore
analyzed. When it comes to understanding the human
condition, both anticipating how disease develops and diagnosing
and treating medical conditions, what is essential is collecting
(00:34):
data from a broad spectrum of humanity over time and
continuing to tend that mission as technology and science
develop. Also key is inviting analysis, not for just a few,
but for the great global body of scientists pushing out the
frontiers of science for the betterment of all. That, in just
(00:58):
a few words, is the concept and mission of the UK Biobank.
Today, I speak with Sir Rory Collins.
Dr. Collins is the UK Biobank's principal investigator
and a professor of medicine and epidemiology at the University
of Oxford. Well, Dr. Collins, welcome to the program.
Sir Rory Collins (01:18):
Thank you very
much for having me, Moira.
Dr. Moira Gunn (01:21):
We'll just start
with some first principles here.
Many people don't know about the UK Biobank. In fact, many people
don't know exactly what a biobank is. What is a biobank?
And what's the UK Biobank?
Sir Rory Collins (01:35):
Well, UK
Biobank is an act of altruism by
half a million people across Britain, and an act of vision
really by the UK funders of research. Back at the beginning
of the century, the Medical Research Council, our equivalent
of the National Institutes of Health, and the Wellcome Trust
(01:57):
charity decided that they wanted to set up a very large, what we
call a prospective cohort. That is, they were going to ask half
a million people from around the UK to answer lots of questions
about their lifestyle and the way in which they live, the
environments in which they live. To have measurements done on
(02:17):
them, things like height and weight and blood pressure and
their lung function. And biological samples collected, so
samples of blood and urine.
And to agree to allow them to be followed through their health
records in our National Health Service. So everybody in
Britain has free point of care through the National Health
(02:38):
Service. And all of the data about their health is recorded
in their systems. And these half million men and women, aged 40
to 69 at baseline, agreed to us following their health through
those systems. So that we could look to see what is it in their
questions,
what is it in those physical measurements, and what is it in
(02:58):
their biological samples that leads to one person developing a
particular disease and another person not.
Dr. Moira Gunn (03:05):
Well, Sir Rory,
your English is excellent, as one
would expect. I'm not sure everyone knows what the word
altruism means.
Sir Rory Collins (03:14):
Altruism is a
generosity of spirit. So these
half million people have agreed to answer lots of questions,
allow us to make lots of measurements and collect their
biological samples, their blood and their urine. And to let us have
access to all of their health records, going back in time
before they joined the study through the National Health
(03:36):
Service records. And forward in time as they develop disease in
the long term. They're not getting any benefit from that.
They're not getting feedback of any of their individual results.
What they're doing is creating a resource that researchers from
around the world can use to study many different diseases
and understand the causes of those diseases and ways to
(03:59):
prevent them and treat them.
Dr. Moira Gunn (04:01):
Now it's a
biobank. Does that mean that you
keep the biological samples? I mean, at what point does it
become digital? Because that's what we work with today.
Sir Rory Collins (04:12):
Yes. So,
obviously, the questions we ask
and the physical measurements we make are various data
points. What we're doing now with the biological samples, the
blood and the urine that we stored in a big automated freezer
facility, is pulling out those samples and turning the samples
(04:33):
into data. So people will be familiar with the idea of
measuring your blood cholesterol level or blood glucose level. So
we've measured lots of biochemical measures, like
cholesterol and glucose.
But we've gone on. We've now done genotyping of all half
million people. Genotyping means measuring genetic variation
(04:57):
across the whole of the genome. So you have about 3 billion
markers across your genome. And in genotyping, you probably measure
about a million of them.
But because you inherit your genetic material in kind of
lumps, if you measure a variant at one point, you can actually
estimate what the variants are near to it. And that can allow
(05:20):
you to look at associations of parts of the genome with
diseases, but also with risk factors such as obesity or blood
pressure or blood cholesterol.
Dr. Moira Gunn (05:34):
Now is everyone
in the UK served by
the National Health Service? And what are the demographics
of the people that are part of this study?
Sir Rory Collins (05:46):
Everybody in
the UK has access to the
National Health Service. And almost all of the health care that's
provided in the UK is through this public service. I mean,
private health care is available. But the vast majority
of health outcomes, and particularly serious health outcomes, are
picked up through the National Health Service and through the
(06:06):
National Health Service record systems. So that allows us to
follow all of the participants in the UK through these record
systems to find out whether they die and what they die of, to
find out about cancers and what kind of cancer they have, and
also all of their hospitalizations for various
(06:27):
conditions.
And increasingly, we're now able to link into their primary care
records, which go into even greater detail about what
investigations they've had, what drugs they get. And then we can
go one step further because there are then many disease-specific
datasets. So we can look at imaging for someone
(06:47):
who's had a stroke, sequence data for someone who's had
cancer. So we can get very, very precise information about
many different health outcomes and combine that with the very
detailed information that we have about the participants and
their lifestyle, their genetics, their environment to really hone
down on what it is that causes a particular disease and how we
(07:09):
might be able to prevent it.
Dr. Moira Gunn (07:11):
And what are the
demographics?
Sir Rory Collins (07:14):
So the
participants were aged 40 to 69
when they joined the study from across England, Scotland and
Wales. Men and women, about 50% each. And then we located our
recruitment centers in places that helped us to increase the
diversity of the participants who joined the study. So people
(07:34):
from urban and rural areas, people from different
socioeconomic strata, and also to recruit people from ethnic
minority groups in accordance with the proportion in the UK
population. So they are minorities.
They're smaller numbers, but still really quite large numbers
(07:56):
of people from very different backgrounds that we're able to
study within the UK Biobank cohort.
Dr. Moira Gunn (08:03):
I don't want
people to get the idea that,
well, that's it. That's what you do. You just keep collecting
this data. There are many different biobank initiatives.
And I see one that has a goal of 100,000 participants.
They have their brains, hearts, and abdomens, which I'm
(08:24):
guessing are many of their vital organs, imaged with MRI scans.
Tell us about that.
Sir Rory Collins (08:30):
So among the
half million participants in UK
Biobank, what we're doing is trying to get as much
information about as many of them as possible. So we've done
biochemical analyses looking at things like cholesterol and
blood glucose. We've done hematological assays, so looking
at people's hemoglobin levels, looking at their white cells and
(08:53):
their red cells and things like that. We've genotyped everybody,
which means measuring about a million of the 3 billion
markers across the genome. But we've then gone on to analyze
the exomes within the genome.
So this is the 2% or so of the genome that actually is directly
(09:14):
producing proteins that then are relevant to the development of
particular health conditions. And finally, we've done
sequencing of the whole cohort. So all 3 billion markers
across all half million people in UK Biobank have been
measured, with funding from the UK government, charity, and
(09:35):
industry. Why are we doing that? Because the more data we have,
the more researchers can understand the causes of many
different conditions because all of this information from the
participants is then linked to their health outcomes.
And these data in UK Biobank are being made available to
researchers around the world for any kind of research that's in
(09:58):
the public interest. And there are now some 30,000 researchers
using the data, some 10,000 of whom are in the US.
Dr. Moira Gunn (10:05):
Now you are in
the midst of recruiting the
100,000 participants to have their brains, hearts, and
abdomens scanned. And I understand you were the 60,000th
participant. But tell us about what that was like. What did you
go through?
Sir Rory Collins (10:23):
Well, we were
very much aware that we had very
detailed information from the questions to participants. And,
of course, having blood and urine samples, we can do very,
very detailed analyses of those samples, such as genetic
analyses, analyses of proteins and things like that. But the
physical measurements we made were relatively crude. So, how
(10:48):
heavy are you? What's your height?
So what we wanted to do for at least a large percentage of the
UK Biobank cohort was to get very much more detailed
information about people's measurements, if you like. So,
100,000 of the participants are undergoing a special imaging
(11:10):
protocol. They come back and spend 5 hours in one of our 4
UK Biobank imaging centers.
They do the whole of the baseline assessment again, the
questions, the physical measurements, the blood and
urine collection. But then they spend time in magnetic resonance
imaging of their brain for about half an hour, of their heart and
(11:32):
their body for about half an hour. We use low-power X-ray to
look at their bones and joints. And then carotid ultrasound,
which allows us to look at the arteries in their neck to see
whether there's evidence of furring up of the arteries. So
it's incredibly detailed physical characterization of the
participants.
(11:53):
And I went through that a couple of weeks ago, as a participant
in UK Biobank. And I was incredibly impressed by the team
doing it. It was so well managed, and actually very
interesting. The 5 hours just flashed by during this process.
But these data, these imaging data, like the biological
(12:15):
samples, are also being turned into data points.
So the images are being provided to kind of engineers, digital
scientists, data scientists. And they turn the images into data
points in the same way that, say, a laboratory scientist turns the
(12:37):
blood into data points. And then that data, thousands of
measurements about the brain or about the heart, are made
available to researchers to understand, are there things
that we measured back in 2006 to 2010 when the participants
joined that predict differences in the brain or heart or body
(12:58):
images? And also then, are those measurements in the brain and
heart and body associated with subsequent disease? The body
image is really interesting.
So you can have somebody with the same body mass index,
essentially the shape based on their height and weight, but
have very, very different magnetic resonance images of
(13:21):
them, showing very different fat distribution. It might be
superficial under their skin or it may be associated with their
body organs. And you can only pick that up with the magnetic
resonance. And that's likely to be very differently associated
with disease.
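The point about body mass index can be made concrete with a short sketch. BMI is weight in kilograms divided by the square of height in metres, so it carries no information about where fat sits in the body. The two participants below are hypothetical values chosen for illustration, not UK Biobank data.

```python
def body_mass_index(weight_kg: float, height_m: float) -> float:
    """Return BMI: weight in kilograms divided by height in metres squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


# Two hypothetical participants with different builds (illustrative values only).
participant_a = body_mass_index(weight_kg=81.0, height_m=1.80)
participant_b = body_mass_index(weight_kg=64.0, height_m=1.60)

# Both round to the same BMI, yet the number cannot say whether fat is
# stored under the skin or around the organs.
print(round(participant_a, 1), round(participant_b, 1))
```

Both participants share a single summary number, which is why the interview describes imaging as the only way to see the difference in fat distribution.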
Dr. Moira Gunn (13:40):
It seems to me
that this is unprecedented.
Sir Rory Collins (13:43):
Oh,
absolutely. I mean, the biggest
imaging studies in the past were a few thousand people. The idea of
100,000 people being imaged is, yeah, absolutely
unprecedented. But so is half a million people being sequenced.
You have to remember that 20 years ago, the first person was
sequenced.
Now we've sequenced half a million people. And all these
(14:06):
data are being made available to researchers around the world.
And they're making extraordinary discoveries because they have a
combination of large scale and exquisite detail. And now, of
course, nearly 15 years of follow-up of people's health
outcomes. So there are large numbers of individuals who've
developed various conditions, so that one can really understand
(14:29):
what's the cause of those conditions and therefore find
ways in which to prevent and treat them.
Dr. Moira Gunn (14:35):
Now I see that
30,000 researchers have actually
accessed this data bank, this biobank, and that they've done
work with it. Who are these scientists? And how do they get
access?
Sir Rory Collins (14:50):
So they
access de-identified data. So
they don't know who anybody is in the study. And they
have to show that they are bona fide researchers and that their
institution agrees to comply with the rules of engagement,
to look after the data and to use it for health-related
(15:11):
research in the public interest. The researchers are then provided
with access to these data to do their research. And really, the
only requirement in that respect is that they put their
findings into the public domain.
They publish them in medical journals or they make them
public in some other way, so that they benefit medical
science. And there have been some really interesting findings that
(15:34):
have come out. I think increasingly what we'll see is
new targets for treatments, and the work with exome sequencing,
which focuses on the part of the genome that produces
proteins, has already found some new targets for conditions, for
example, anti-obesity treatments, which are now starting to be
(15:58):
developed. But I think perhaps the most striking observation in
my mind is the finding that if you combine dozens or
hundreds of genetic variants across the whole genome, each of
which has a small effect on disease, if you combine many
(16:21):
of them, then you can identify a few percent, maybe 3 to 5
percent, of the population that have a risk of that particular
condition equivalent to somebody with a single gene disorder, the
kind of disorders like BRCA1 and BRCA2 that many listeners
will be used to hearing about with respect to breast cancer.
(16:42):
Now, that's about 10 times less common. So the idea is that this
genotyping, where you measure variants across the genome, can
identify 10 times as many people with equivalent risk of breast
cancer. But also identify 10 times as many people with
equivalent risk of heart disease as someone with so-called
(17:05):
familial hypercholesterolemia. And it's been shown that you
can identify people in the top few percent of about 16
conditions in about a quarter of the population. So that means
that this very simple test, this genotyping test, which probably
costs about $10, can identify a lot of people who are at high risk
(17:28):
of different conditions.
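The idea of combining many small-effect variants is often implemented as a weighted sum, commonly called a polygenic risk score. The sketch below is only an illustration under stated assumptions: the effect sizes and genotypes are randomly simulated, the function names are hypothetical, and nothing here reflects UK Biobank's actual data or analysis pipeline.

```python
import random


def polygenic_score(allele_counts, effect_sizes):
    """Weighted sum of risk-allele counts (0, 1 or 2 per variant)."""
    if len(allele_counts) != len(effect_sizes):
        raise ValueError("need one effect size per variant")
    return sum(count * weight for count, weight in zip(allele_counts, effect_sizes))


def top_fraction(scores, fraction):
    """Return the indices of the highest-scoring people (at least one)."""
    n_flagged = max(1, int(len(scores) * fraction))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:n_flagged])


# Simulated inputs: assumptions for illustration, not real genetic data.
rng = random.Random(0)
n_variants = 200
effect_sizes = [rng.gauss(0.0, 0.05) for _ in range(n_variants)]
population = [
    [rng.choice((0, 1, 2)) for _ in range(n_variants)]
    for _ in range(1000)
]

scores = [polygenic_score(person, effect_sizes) for person in population]

# Flag the top 5% of the simulated population, mirroring the
# "3 to 5 percent" figure mentioned in the interview.
high_risk = top_fraction(scores, 0.05)
print(len(high_risk))
```

Each variant contributes little on its own; only the sum separates the tail of the distribution, which is the group that earlier screening could be offered to.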
And you can imagine how that could get rolled out into
healthcare systems. For example, breast cancer screening is
typically offered to women when they get to a certain age. Well,
if we could find 5% who were at particularly high risk, one
could offer breast cancer screening to them at a younger
age. And the same would be true of, say, colorectal cancer or
(17:51):
prostate cancer screening. And then from the cardiovascular
perspective, I do cardiovascular epidemiology and clinical
trials.
So I'm particularly interested in this. The idea that we give
cholesterol-lowering therapy to people largely when they get to
a particular age. If you could identify 5% of people who at
(18:15):
age 30 were at very high risk of heart disease, then again, you
could offer them protection against getting heart disease
when they get older. The data are available to researchers
around the world on the basis of just the cost of processing
their application to access the data and providing the data to
(18:35):
them. We've now put all of the data onto a cloud-based research
analysis platform that's hosted in Britain by Amazon Web
Services.
And that platform, which has been developed for us by an American
company called DNAnexus, allows researchers to go to the data
(18:57):
rather than us having to provide data to them. And that really
democratizes access further because the data are very big
now. And therefore, to have the data come to you means you
have to have a big computer and lots of analytic power on your
computer. Whereas you can now come to our research analysis
(19:17):
platform and have all the compute power you need for your
project without having to invest in large computers. So it makes
it even more accessible to researchers around the world.
And indeed, we have free compute from Amazon for
researchers from low- and middle-income countries and for all
(19:41):
early career researchers, again, to facilitate access to the
data.
Dr. Moira Gunn (19:46):
Well, I see that
in 2021 alone, there were 1,700
publications in such journals as Nature, Science, and many
others. So obviously, there's a tremendous amount of output that
goes right into the scientific community.
Sir Rory Collins (20:03):
Absolutely.
The success of UK Biobank is
down to the research community around the world using the data
in extraordinarily imaginative ways. And what we're seeing is
an increasing range of researchers from around the
world using the data. So one of the problems, of course, with
health data is getting access to it. Well, for the first time,
(20:26):
really, a very large, very detailed health research
database is available to researchers.
And what we're trying to do is encourage you really clever
scientists out there, perhaps not the ones who are most used
to using health data, to come and use it. We really want to
see the people who do fantastically interesting things
(20:49):
with astronomical data or physics data, other data
scientists who may not be used to working in the biological
field, coming and solving major health problems by accessing
these data. They're now readily available at scale and in depth,
which is unprecedented.
Dr. Moira Gunn (21:07):
Well, I would
imagine that once COVID appeared
on the scene, the UK Biobank sprang into action. What
did you do?
Sir Rory Collins (21:16):
Well, one of
the things we had to do, of
course, was stop our imaging study, because we couldn't
invite healthy individuals to come in to the imaging centers.
So we thought, well, what are we going to do with all our imaging
staff and with our imaging centers? And because we had
already imaged 50,000 of the participants, we decided, with
(21:38):
the support of the Medical Research Council and the
Wellcome Trust, to swivel towards studying COVID. And so we
invited back 1,000 of the previously imaged participants
who had been infected and 1,000 of the previously imaged
individuals who had not been infected. And we re-imaged them.
(21:58):
So we created a unique data source where we have a thousand
people with pre- and post-infection imaging and 1,000
matched controls. There is no such data available anywhere else in
the world. And then we've made those data available to
everybody. And the value of that is that you can look at the
(22:21):
effect of infection on change because you have a pre-infection
measurement. And that is being used increasingly.
We also made the data available to researchers all around the
world to look at what were the determinants of a bad outcome
with infection. And there were about 200 publications on UK
(22:45):
Biobank related to COVID, identifying some of the major
determinants of a bad outcome. Some related to
ethnicity, to body shape, to other underlying conditions in
those individuals. And again, by making those data readily
available, rapidly identifying important determinants of bad
(23:09):
outcome, and the imaging study helping us to identify what the
infection does to people in terms of their brain function,
their heart function, etcetera.
Dr. Moira Gunn (23:21):
So, Dr.
Collins, where does the UK
Biobank go from here? Are there plans?
Sir Rory Collins (23:27):
Absolutely.
The next phase of UK Biobank
will be about studying other aspects of these blood samples,
the proteins, the metabolites, the changes that occur in
response to exposures to your environment, to your genes
themselves, so-called epigenetics. All of these things
(23:50):
will help us to understand how it is that risk factors, be they
genetic, lifestyle or environment, lead to disease. To
understand the pathways and the mechanisms, and therefore
really understand better how to prevent and treat those
conditions. And I think that will be the next phase of UK
Biobank.
We'd like to bring back all of the UK Biobank participants so
(24:12):
that we can get measurements in all of them. And therefore, we
can look at change in risk factors on disease. So there's
plenty more to do to make UK Biobank more valuable for
researchers. And the great thing is that we have half a million
really altruistic, really engaged participants that want
(24:33):
to make this resource as good as possible for the benefit of
humanity.
Dr. Moira Gunn (24:38):
Well, Dr.
Collins, thank you so much for
joining me. And, truly, I hope you will come back and see us
again.
Sir Rory Collins (24:45):
I'd love to do
so. Thank you very much, Moira.
Dr. Moira Gunn (24:47):
My guest today
is Sir Rory Collins. Dr.
Collins is the principal investigator of the UK Biobank.
More information is available on the web at ukbiobank.ac.uk. For
Tech Nation, I'm Moira Gunn.