Episode Transcript
Available transcripts are automatically generated. Complete accuracy is not guaranteed.
(00:03):
And so, we put in two pages on ethical, legal, social implications, and how we were going to do the IRB.
And in the process, this is the first time I've actually done an IRB.
I'd collaborated with people who had done IRBs before.
This is the first time I did it.
And I was just appalled at some of the things that they were forcing you to do.
They said, like, you have to promise that you'll keep it private.
(00:25):
And I said, I don't know how to keep my credit card private, much less somebody's medical records.
And in fact, there had been numerous, uh, hacking events and people losing disk drives at various hospitals in the Boston area that will remain nameless.
And so I said, I think the better assurance would be that we get people
(00:45):
who are okay with it being public.
That was one thing that I was horrified by.
The other one was they said you couldn't return information to the patients, even if it could save their life.
And I felt that was crazy too.
And I said, well, but a consequence of their data being public is that it would also be available to them.
And so they can.
(01:05):
You know, if we give them the right software, they can draw their own conclusions, and with proper consulting with their physicians and specialists, they will make the right decision.
And so we framed it as an ethics study as well as a technology study, and that became the Personal Genome Project.
(01:26):
Welcome to another episode of NEJM AI Grand Rounds.
I'm your host, Raj Manrai, and I'm here with my co-host, Andy Beam.
Today, we're delighted to bring you our conversation with George Church of Harvard.
George is a professor of genetics at Harvard Medical School and professor of health sciences and technology at Harvard and MIT.
He has many other titles and honors.
(01:47):
Andy, I was really struck by both the depth and the breadth of cutting-edge science that George oversees in his laboratory, and also by how effectively he must context switch.
He's very well known for his pioneering work in genomics and synthetic biology, his involvement with the Human Genome Project and the Personal Genome Project, his work on gene editing, and even revival of the woolly mammoth, just to name a few areas.
(02:09):
He also shared aspects of his career journey and how he approaches both academic and entrepreneurial efforts, which were enlightening.
You know, Raj, George has to be one of the most fascinating individuals I've ever met.
I think when most of us decided that we wanted to be scientists, we actually probably wanted to be someone like George.
He's always working on the most intriguing things, and I'm constantly amazed by the range of truly sci-fi projects that he has going on.
(02:32):
I really love that he sees his mission as making the future happen faster.
You know, in that way he really reminds me of a scientist from a movie, almost.
He's got this big beard, this big personality, and he does crazy things like try to bring back the woolly mammoth.
I always enjoy hearing about what George is up to, and this conversation was no exception.
The NEJM AI Grand Rounds podcast is sponsored by Microsoft and Viz AI.
(02:55):
We thank them for their support.
And with that, we bring you our conversation with George Church.
Welcome to AI Grand Rounds, George.
We're happy to have you here.
Ah, it's great to be here.
So, George, a question we ask all of our guests.
Tell us about the training procedure for your own neural network.
How did you get interested in science, and in AI in particular?
(03:17):
What data and experiences led you to where you are today?
I've heard there is a frameable Duke letter as part of this story.
Yeah, so I got interested in science, both natural and unnatural.
My third father was a physician, and so I was fascinated by the technology in his...
He would do house calls back then.
(03:37):
Bit of history there.
And then I went to the World's Fair and saw...
Computers and robots, or animatronics.
And I decided that was definitely cool.
And I came back to Florida, where they didn't have any science classes to be taken, that I could find.
So I just started building my own computers because I was desperate, not very good ones.
(03:59):
And so it wasn't until ninth grade that I actually got access to a GE 635 at Dartmouth.
Been programming ever since, basically. That was actually a time-sharing, interactive one.
This is in 1968.
I'm in ninth grade.
Then when I went to college at Duke, it was kind of a step backwards and forwards: forwards in terms of science, I was doing crystallography, but backwards in terms of, uh, everything
(04:22):
was punch cards into an IBM 360.
And then I got two major degrees in two years, and then proceeded to flunk out of graduate school.
So they gave me a nice letter, hoping that I would do well in some other field, but I stuck to this field, and my lab has been mostly a mixture of technology development and computational biology
(04:42):
pretty much since the beginning, in 1986, at Harvard Medical School.
And naturally, the names kept changing, as to whether it's neural nets or deep learning or machine learning, but we've been part of that revolution, at least as applied to things like molecular biology, protein design in particular.
Thanks.
That's super interesting.
(05:03):
And I actually didn't know that you had such an early interest in computer science.
And I can see that writing code, whether it's in bits or in DNA, has sort of been with you since the beginning; you're still programming the software.
The substrate has just changed a little bit over time.
Yeah, yeah.
My first computers were an analog computer and then a digital mechanical computer.
(05:24):
So, it was only the third one that was a real von Neumann, digital, interactive computer.
We're going to hop into your research in just a little bit, but there's an icebreaker question that I think I'm obligated by law to ask you, and it's about bringing back the woolly mammoth.
And, so, you've been on the record as saying that we should bring back the woolly mammoth.
And I'd love to hear the motivations for doing that, and sort of how far are
(05:49):
we away from being able to do that now?
Well, this is not going to fit into 140 or 280 characters, but I'll try my best.
So, it's really not about bringing back an extinct species; we're mainly interested in endangered species and restoring ecosystems.
The ecosystem that seems like it needs the most help is the Arctic, where there's
(06:12):
a lot of carbon sequestered, because the cycles of freezing and then summer photosynthesis result in sometimes 500 meters of topsoil, compared to, say, 1 meter in a lot of the rainforests.
And that's all melting, and a lot of it is melting in the form of methane, which is 80 times worse than carbon dioxide.
(06:34):
And so we're hoping to make cold-resistant elephants, which would help save them from their endangered species categorization and would also help restore that ecosystem. We're not saying this is the only answer, or even that this is a good answer.
It's just our best shot at increasing sequestration and decreasing
(06:55):
the loss in the form of methane.
And we're doing that by genome engineering, which we've already done in pigs for making transplants, xenotransplants, which are making their way through now: 600 successful days in non-human primates, and some are now going into human clinical trials. We can do dozens, uh, 42
(07:20):
edits in the pig germline genome.
We hope to be able to do the same thing in elephants and make them cold-resistant, maybe also resistant to the herpes viruses that are killing them.
If you have to do 42 edits for pigs, what is the difference in scale to do something like turn the elephant into a cold-resistant woolly mammoth?
(07:42):
A cold-resistant elephant that has genes that are resurrected from the woolly mammoth.
We've already resurrected two genes that seem to have the right properties when resurrected.
We don't know the exact number.
We're doing a computational analysis of dozens of elephant and extinct relatives
(08:03):
and focusing on genes that go to fixation.
That means that they're homozygous: both their maternal and paternal alleles are the same, and both of those are different from the existing elephants.
And so, basically, you've got this branching in the phylogenetic tree where it's gone to
(08:24):
fixation, where all of the mammoth genes are different from the existing ones.
And we can do that for multiple mammoths and multiple existing genomes, both African and Asian elephants.
And then we're focusing on those, both the coding regions and the non-coding regions.
And if we did all of them, it would be in the hundreds of thousands.
(08:47):
If we just do the ones that we think are involved in cold, a priori, well, then those could be in the dozens.
So, it's somewhere in between.
And we have technologies, editing technologies, I think, that are up to the high end of the spectrum.
Our record so far for repetitive elements is 24,000 edits simultaneously in one cell.
(09:08):
That happened to be a human pluripotent stem cell, but it could have been any mammalian genome, as far as we know.
Got it, thanks.
So, George, I want to switch gears to your work in genomics.
You're a pioneer in genome sequencing, genetics, synthetic biology, many other areas.
And your work has been influential on many scientists, myself included,
(09:32):
but I actually want to ask you about a different perspective, which is your perspective as a patient.
You famously made your own genome sequence, your lab values, and medical records publicly available for anyone to download on the Internet.
I want to ask
(09:46):
How did you come to that decision, and what might you have learned since you've made that information available?
Yeah, it's an interesting question.
So, we, uh, applied for an NIH Center of Excellence in Genomic Science and got it.
And in the process, we were proposing to develop a new way of doing DNA sequencing.
(10:08):
Now, this is in 2003, and we were partway there anyway.
So that's why we proposed it.
It's now called NextGen sequencing, but back then it had other names.
And we proposed that, in the process of the five-year Center of Excellence grant, we would get as far as doing a 1.7-million-base-pair bacterial genome. This is a very tiny bacterial genome,
(10:32):
not the smallest, but very tiny.
It was Helicobacter pylori, the causative agent of gastric cancers and ulcers.
And we had already contributed to the first time that was sequenced, back in 1994-ish.
So, in 2003, we proposed to do that again after five years of work.
As it turned out, eight months into it, we had finished something three
(10:55):
times bigger than that, and by the end of the project, we had finished five human genomes at six billion base pairs each, so diploid, high-quality genomes.
Now, it was not completely unanticipated; we thought we might get the human genomes. Even though we were only promising to do a really tiny bacterial genome, a thousand times smaller, we thought we might get there.
And so we put in two pages on ethical, legal, social implications
(11:20):
and how we were going to do the IRB.
And in the process, this is the first time I've actually done an IRB.
I've collaborated with people who've done IRBs before, but this is the first time I did it.
And I was just appalled at some of the things that they were forcing you to do.
They said, like, you have to promise that you'll keep it private.
And I said, I don't know how to keep my credit card private, much
(11:41):
less somebody's medical records.
And in fact, there had been numerous, uh, hacking events and people losing disk drives at various hospitals in the Boston area that will remain nameless.
And so I said, I think the better assurance would be that we get people who are okay with it being public.
That was one thing that I was horrified by.
(12:01):
The other one was they said you couldn't return information to the patients, even if it could save their life.
And I felt that was crazy too.
Uh, and I said, well, but a consequence of their data being public is that it would also be available to them, and so they can, you know, if we give them the right software, they can draw their own conclusions, and with proper consulting
(12:23):
with their physicians and specialists, they will make the right decision.
And so, we framed it as an ethics study, as well as a technology study.
And that became the Personal Genome Project.
That's fascinating history.
I'm curious, so you also, as part of this, made your own genome public.
Oh yeah.
And I think, to this day, you can go and download a copy of your genome
(12:48):
as well as your medical records.
And just connecting the dots, you were one of the participants who was willing to have their genome released, right?
You were a participant in your own study.
That's correct.
The IRB felt that, I mean, it was an unusual proposal to the IRB, because I was breaking at least two of their rules, and a few others that we don't need to go into, but, you know, transparently and politely. And it took us about a year to
(13:13):
negotiate it, but that's not that long compared to other IRB approvals, even non-controversial ones. But part of that is they said, well, you know, you should be willing to eat your own dog food.
Uh, you know, we want you to be the sole participant.
And I said, well, how about 10 participants?
Let's compromise.
And I'll be the first of the 10, and then we'll expand it.
(13:33):
If nobody gets hurt, we'll expand it from there.
And John Halamka was actually number two.
Uh, Esther Dyson was number three.
All their names were known.
That was approved by the IRB as well.
And John was at the time the brand-new CIO at Beth Israel Deaconess and had also crafted, I think it was, Norway's government's medical informatics policies,
(13:55):
and so he was really the perfect second.
In fact, there's a funny story about my posting.
So I was a patient at Beth Israel Deaconess, as it turned out, and before we had the IRB approval, I had just posted my own medical records as a test, and some patient had been looking, browsing the Internet, and had run across my medical records, and she freaked out.
(14:16):
She thought this was standard policy at Deaconess, to release patient medical records onto the Internet, and her complaint worked all the way up to the president.
And then back down to John Halamka, who said, oh, I know what this is: this is George Church.
And he contacted me, and I said, oh yeah, yeah.
Oh, I didn't even think that some patient would find this.
(14:36):
I said, put a big disclaimer on it.
I said, this is not Beth Israel policy.
This is just an experiment.
And then they thought that was okay.
And we went forward.
And so have you learned anything about your own genome since it's been publicly available, or your own medical records, since you've made these widely available?
(14:57):
Oh, yeah, sure.
There have been various advantages.
There was one case where I was giving a lecture in Seattle, and somebody in the front row, while we were waiting for everybody to settle in (you know, there was kind of a pause), said, you know, you should get your blood work checked out, because according to your public
(15:20):
medical records, you're on statins, and there's no evidence that you've checked on your cholesterol or on the possible negative consequences for muscle damage.
And I said, yeah, you're right.
I haven't been checked.
And so I went off and I checked, and actually there was no lowering of cholesterol, and there wasn't muscle damage, or potentially
(15:42):
the biomarkers for that.
And so we changed the statin and the dose and monitored it. That probably, you know, gave me an extra 10 years of life versus not using the drugs right.
And then, you know, I learned that I had risk factors, for SERPINA1, for alpha-1 antitrypsin, which put me at risk for a whole variety of respiratory
(16:04):
problems.
And so I've been cautious about living in downtown Beijing and Los Angeles and things like that, or hanging around with COVID too much.
Great.
So we're going to switch gears just a little bit again and talk about some of the work that your lab has been doing in artificial intelligence and deep learning for biotech applications.
(16:26):
So, I've, you know, I tried to pick one paper to focus in on, but you've done so much work on things like protein engineering, designing AAVs for gene therapy, improving the efficiency of CRISPR.
So, I was hoping you could pick one from the list of the many papers that you've written in this area and help us understand what machine
(16:47):
learning is bringing to the table.
So, I think there's often a lot of confusion about when and how and why to use machine learning, but I was hoping you could really help us zero in on the types of new questions that machine learning has let your lab ask and answer.
Yeah.
So, the one that I would pick, or the two, the first one was...
(17:08):
The one that I would pick is a Nature Biotech paper we published in 2021, which is Bryant et al.
Eric Kelsic, uh, was a postdoc in my lab, who was a senior author on that paper.
And it was titled Massively Parallel Deep Diversification of AAV Capsid Proteins by Machine Learning.
And we have been working on machine learning for protein design for
(17:31):
quite a while at that point, but that was different in that it illustrated your question.
Your point was, when is the time to do machine learning?
When is it not?
Typically, the one prerequisite for machine learning is that you have a lot of data.
We had just published a paper on what we refer to as low-N machine learning, but it isn't entirely low-N,
(17:52):
and there's some additional background information about proteins in general, but low-N for your particular experiment.
But anyway, that's the prerequisite.
You need to have a lot of data, typically.
Then there's the question of when is it better?
So, in this particular case, I think we showed that it's a lot better, because we did a comparison with naive or random, semi-random models for mutagenizing
(18:17):
a stretch, a key stretch of 28 amino acids, that's important to the AAV capsid for gene therapy delivery.
And we asked, you know, how many amino acids can we change simultaneously?
And the reason you might want to change a lot of them simultaneously is that's where the immune system interacts with it.
And if you want to use the gene therapy more than once, or even once, you want
(18:40):
to make sure the immune system doesn't attack your precious therapeutic.
Also, if you want to target a new tissue, you might have to radically change the surface of the virus so it will target tissue A and not B.
So, those are objectives.
And in the naive model, it was very hard to get more than four changes out of 28.
(19:03):
And this was very consistent with all the work we've done on many other proteins before.
Four out of 28 was actually a pretty good day in one round of diversification and selection.
But using logistic regression, or various convolutional neural nets, or various neural nets, uh, we could get about 90 percent at 25 out of 28.
(19:27):
And, just doing this from memory, about 70, 80 percent at 26, and even 20 percent of this big library focused on 28 out of 28.
So, it wasn't like one solution; it was lots of solutions.
Well, at four out of 28 with the naive model, you are getting close to 0%.
So that to me is a dramatic validation that this is an
(19:49):
improvement over the naive models.
Yeah, so this all started with an undergraduate in my lab, Harvard undergrad Ethan Alley, and Surge Biswas, and others, Pierce Ogden, who had developed UniRep, which was a language model.
So, there are kind of two broad categories in protein structure and design.
(20:10):
One is focusing on the structure.
And that got a lot of attention with AlphaFold from Google DeepMind, but it just predicts the three-dimensional structure of the protein.
I should say that's a 50-year holy grail, but from a protein design standpoint, knowing the three-dimensional structure is just a starting point, and not necessarily the right direction.
So, for example, I'm just going to end on this example,
(20:34):
just contrasting the language model, which is essentially linear, with the 3D structure: if you have a serine protease, which is named because the serine at the active site is critical, and you change that serine to an alanine, now it's completely dead, but it has exactly the same three-dimensional structure.
Now there's a version of AlphaFold, which is AlphaFold-Multimer,
(20:57):
where you can ask whether two proteins can stick together, and that might save you in certain circumstances. But basically, the point is you can kill a protein without changing its three-dimensional structure, but not in the language model. It's sophisticated stuff.
Right, right. So, that I think dovetails nicely into a follow-up question that I'd like to ask
(21:17):
You've seen many revolutions in biology, I think it's fair to say: genomic sequencing, synthetic biology. Could you help us understand how AI is going to make an impact in biology over the next 10 years?
And the sort of flip side of that is, are there areas of biology, are there corners of biology, that you think are immune from sort of the AI
(21:40):
incursion that we're seeing today?
Yeah, so, uh, I kind of have a policy not to say anything's impossible, because it's one way you can show how foolish you are.
I mean, I'm not questioning whether I'm foolish.
It's hard to prove a negative.
Yeah, and it's easy to get embarrassed two years later when
(22:01):
somebody shows that it does work.
So, and in fact, almost everything that I've worked on, every project that I've worked on in my 68 years, 50 years as a scientist, one person or another has said it's impossible.
Sometimes a couple of years after I published a peer-reviewed paper on it, they still say it's impossible.
But anyway, I can't think of anything where, you know,
(22:23):
the laws of physics or computing would prevent it from making a contribution.
May I qualify?
Is there an area of biology you think has a low probability of being impacted, or is just too hard for AI, over the next 10 years?
So I've relaxed the condition a little bit.
Yeah, I still think they're all at a sufficiently high
(22:46):
probability, nonzero, uh, that they're worth considering.
I think things where it's really hard to get data, but on the other hand, it's not clear any kind of intelligence, artificial or otherwise, is going to solve some of those problems.
So yeah, I think there's good room for it.
It doesn't mean that human beings aren't going to play a gigantic
(23:09):
role in nudging it, programming it, interpreting it, interfacing with people who don't need to know all the gory, uh, computational details.
So yeah, I think it's going to affect everything, really.
And I think the same thing is true for synthetic biology.
Synthetic biology is going to
(23:30):
affect everything, including making computers that might be better than current computers.
They'll probably be hybrid computers of various sorts.
In a certain sense, machine learning is based on, or at least inspired by, natural neuronal computers, and I think hybrids will have a good shot at it.
(23:53):
If it was anyone else, I would say that was a hedge, but when George Church says never say never, I believe it 100%.
Thank you.
So, George, I want to ask you about...
Is there a field where you think it won't work?
Or has it worked?
I've just heard a luminary of the field, uh, advise me against, uh...
(24:13):
I should have asked you first.
So George, I want to ask you about commercialization.
So you've been involved in many companies.
I think a few that are relevant to protein engineering and AI include Manifold Bio, Nabla, Dyno Therapeutics.
Could you tell us about what these companies do, and maybe how their
(24:35):
missions relate to one another?
Right, so the three that you mentioned, Nabla, Dyno, Manifold, and two more, Patch and Shape Therapeutics, all are using machine learning for protein or nucleic acid design.
They're wildly different.
So Nabla is focusing on antibodies, which are one of the key therapeutic categories
(24:57):
and have diagnostic uses as well.
Dyno is on delivery.
We were just talking about AAV.
Manifold is very interesting: they're developing ways to make protein binding pairs, but that's not the end game.
Then those can be used for testing multiple protein therapeutics
(25:17):
simultaneously in one, let's say, expensive test animal for preclinical trials, let's say a non-human primate.
So, you can do thousands of simultaneous measurements of pharmacokinetics and dynamics and tissue targeting and so forth, all at once, with protein therapeutics, which don't normally have a nucleic acid barcode that you can follow.
(25:42):
So this is a protein barcode.
And then Shape is working on RNA therapies, tRNAs, and ADAR, and also delivery.
And then Patch is on cis-regulatory elements, DNA and RNA.
And there could probably be about 10 more of them that would not overlap one another.
Uh, it's a very important subset of things that you
(26:03):
can do with machine learning.
So, you know, as you think about those companies that are making use of machine learning and artificial intelligence, and the others that you've been involved in, I'm curious: have you identified maybe a set of, let's say, one or two key questions that allow you to decide whether an idea is right as a commercial
(26:24):
entity, or whether it maybe belongs in academia for more development before it moves into a commercial entity?
Yeah, this is a really tough call that every postdoc that wants to start a company faces, and the PI that maybe wants to start it with them, or wants
(26:46):
to jump from academia into industry.
And I say, it's not so bad to be number two, to be the second one in the field, even if you thought of it first and it kind of irritates you that somebody jumped in there with your idea.
You just want to make sure you've got enough intellectual property that you have freedom to operate.
And it doesn't really matter who gets funded first, necessarily.
(27:08):
And very often the second one does a better job of it, either from a business side or a science and engineering side.
So that's part of the decision.
The part where you know that it's ready is either you're getting a lot of feedback from your peers.
They like it.
They want it.
In one case, you know, when we were developing DNA synthesis, the very
(27:31):
high-throughput DNA synthesis, like 10,000 times previous throughput, suddenly we had a lot of friends that wanted to buddy in and collaborate on making big DNA constructs cheaply.
And we said, oh, this is going to be a tremendous academic distraction.
We really have to spin this out, just for our own sanity.
So we did that same thing with CRISPR.
(27:53):
That was clearly going to be so popular.
I mean, we announced it in January 2013, and by March there were like 10,000 users. We distributed it through Addgene, which is nearly free, and it would have been a real hassle to have done that ourselves, licking stamps in our office.
So, those are the kind of things that tell you that you're ready.
(28:16):
You don't have to rush.
The more mature it is, the slower you will get diluted out.
Whereas if you go there too early, you may think it's magic money, that it's easier than a grant, but pretty soon it'll be taken away from you, and it may not go in the direction you want it to go once you lose control over it.
I like the emphasis on it's not so bad to be second.
(28:38):
Uh, there's a saying that I'm reminded of: the early bird gets the worm, but the second mouse gets the cheese.
And sometimes being second is strategically good.
Oh, right.
That's good.
Second mouse gets the cheese. Second worm gets to live.
Yeah.
Right.
Yeah.
Yeah.
Right.
So, George, just as a follow-on to that: having overseen many companies from
(29:02):
genomics and sequencing and genomic technologies, and now increasingly with AI, is there a difference you see in the decision making of when to start a new company, in the context of your involvement in genomics companies versus now with the more AI-focused companies? Or are the principles more or less the same?
(29:22):
Well, there are some very significant differences.
First of all, when I started back in the '80s, the way I was deciding was basically: if an investor or a colleague came forward and said, hey, we'd like to collaborate with you on a company, I'd say, okay.
It was fairly reactive, but recently it's been mostly postdoc driven.
(29:43):
And during that time, we transitioned from mostly analytic to mostly synthetic.
And the problem with a lot of the DNA sequencing scenarios was you had to convince people to get sequenced, while with the synthetic biology, which in this case is basically, you know, pharmacology, people are already consuming drugs, and you're just making better
(30:06):
and better drugs for diseases for which there were no drugs.
And it's just an easier pipeline.
While I would say that our first sequencing innovation was in 1984, and here we are in almost 2024.
And, you know, 40 years later, we still don't really have enough consensus that health care providers think it's a good thing
(30:28):
to give everybody their whole genome sequence, uh, or act on it in any way.
And most people don't do it on their own.
So there's a disconnect.
I think it's the 1% dilemma.
It's the seatbelts, smoking, global warming, and getting your genome sequenced.
They're all like, hey, I've got a 99% chance of doing okay.
(30:50):
If I went to Las Vegas with those odds, I'd be fine.
But it's different for public health.
And it takes special effort to get people to stop smoking and to wear seatbelts and so forth.
So I think that's what's going on here.
And no government agency has stepped forward to do what they did for seatbelts and smoking, which was a whole series of experiments.
(31:12):
Like, just passing a law to buckle your seatbelt wasn't enough.
Getting them present in every car was not enough.
They had to actually mandate a circuit that would close once you buckled it, on top of your belly, not underneath, you know. That hasn't happened yet for, for Joe. Even though the carrier status alone, plus adult-onset
(31:33):
diseases, could save us a trillion dollars a year and a lot of pain and suffering.
It just hasn't happened yet.
But with synthetic biology, it's a totally different thing.
There's the Orphan Drug Act.
That makes it very profitable to go after rare things, and then there are lots of common diseases as well that can be treated.
And my favorite treatment is gene therapy, for reasons that we could
(31:55):
go into if you want to.
Raj, shall we go to the lightning round?
Sure. So George, we'd like to do a quick lightning round, if that's alright with you.
We're gonna ask a series of questions, and the rules of the game are one- to two-sentence responses, uh, to each question.
Yes or no is also great.
Does that sound alright?
(32:17):
And some of these will be highly entropic questions.
And the goal is to learn more about George Church and how he thinks about the world, but you have to be brief.
Um, so, uh, the first question is kind of a Turing test for biology, is the way that I think about it.
And an appropriate response to this could be "that's a dumb question," but: will AI understand biology in any meaningful sense?
(32:41):
Where "understand" is in air quotes here.
I think in a way it already does.
Highly advanced biotechnologists talk to each other essentially in biotechnology, which is not really natural language.
And for computers, I gave an
(33:01):
example for protein design.
Alright.
George, what's your favorite piece of music?
Favorite piece of music...
Gee, you know, I kind of like Talking Heads.
Oh, nice.
You know, "this is not my beautiful house."
I don't know.
Given a little more time, I could go with a few others.
(33:22):
Yeah, that's the point of this though, is to inject some entropy.
So, um, uh, you've had a storied career in science. Um, what one thing
has changed the most, either from a technology standpoint, from a society
standpoint, or from a political standpoint in science, over the course of your career?
Well, I would say all of the above in the Genome Project, because the
NIH was entirely hypothesis driven.
(33:46):
Now, you've got one institute that is discovery driven and another one
that's engineering driven, and that coincided roughly with a twofold
increase in the NIH budget while we were starting the Genome Project.
I can't say it was cause and effect, but it was, it was a nice coincidence.
(34:07):
Do biologists need to understand machine learning to contribute
to machine learning projects?
Do citizens have to understand GPS and atomic clocks in order to
find directions on Google Maps?
Alright.
We'll accept it.
One sentence.
We accept it.
As we know, the price of things in health care tends to be sticky,
(34:30):
so the question is, will machine learning ultimately reduce
costs for diagnostics and drugs?
Yes, is the short answer, uh, and I'll give an example where it's not
sticky, depending on how you define things.
So, gene therapy was $2.8 million a dose.
Until we got to COVID-19, and the top five vaccines were all formulated as gene
(34:55):
therapy, some as low as $2 a dose. So,
$2.8 million to $2, so, you know, that's not very sticky.
Okay.
Are preprints a net scientific good?
Preprints, I think net is the key word. You know, they're, they're good and bad,
pretty high levels of good and bad, but yeah, I think they're net positive.
(35:17):
Okay.
Final question of the lightning round.
If you could have dinner with one person, dead or alive, who would
George Church have dinner with?
Oh, geez, I think probably Nettie Stevens. You've probably never heard of
her, but she was on the little Google logo, but I knew of her before that.
Around 1910, 1915, she found the chromosome theory of inheritance along
(35:41):
with Morgan, but uh, or separate from Morgan, about the same time.
Alright.
Alright, so, we're going to move to the final segment of the episode.
We're going to talk about some big picture things.
I think it's fair to say we've already touched on some big
picture topics so far, but we're going to try and broaden the
aperture just a little bit further.
We've talked a lot about biotechnology and your work in the area.
(36:03):
I want to come back a little bit to a clinical focus. Given sort
of what you see happening either in diagnostics or gene therapy, what
medical specialties do you think are most likely to be changed and impacted by AI?
Medical specialties, hopefully genetics. Interpreting the genome is increasingly
(36:24):
engaging polygenic risk scores, and I think that probably could be done
better, and then that could have impact on almost every field of medicine.
The other thing is age-related diseases.
I think there's an opportunity of having multiple genes involved in gene
therapy, and possibly even personalized medicine in general,
(36:48):
but especially related to aging, because aging affects every disease.
Basically, almost every form of human morbidity and mortality is impacted.
So, I think those are a cluster of three things that interact with one another:
genetics, aging, and machine learning.
So, thinking about our listeners who are clinicians, and in particular
(37:11):
early career clinicians, med students, residents, what do you think those
folks should know about AI to help them prepare for a career in medicine?
I think that they can have a fairly high-level view of it.
It's like we no longer, most of us don't program in zeros and ones.
We program with high-level languages like Python or maybe
(37:35):
even HTML or Excel or something.
So, it'll be like that.
Hopefully, it'll be very easy to interface with.
This is the case for most really awesome software.
But they will have to know it.
And they might not have to memorize as much.
When I was a boy, you know, we had to memorize all the biochemical
pathways and all the pathologies.
(37:55):
And, hopefully, it'll be like, how do you look for it?
How do you interface with the machine learning and the big databases?
So, you can't know it all anymore, but how do you know
where to look for the answer?
There's not going to be a Krebs cycle of AI for, for physicians to memorize.
Hopefully not.
(38:16):
On the contrary, it's going to mean fewer people learning the Krebs cycle.
Excellent.
And we'll learn instead, oh, uh, isocitrate dehydrogenase is very
impactful on certain gliomas, right?
And so it's one of the most treatable of the, of the otherwise
nasty category of cancer that hits the brain.
(38:40):
Yeah, I'm married to a clinician, and I think that the mandate that I've been
given is that if I develop anything in the AI space that makes her job more
difficult, if I have a new Krebs cycle that she has to memorize, then that thing
is not going to get very far clinically.
I saw it, yeah.
Yeah, there's, there's in this day and age really no excuse for
(39:00):
poor user interface in, in computing.
Or another checkbox that you have to click, or something like that, exactly.
That's true, there's plenty of bad software nevertheless, but
there's no excuse for it, yeah.
So, I'm glad that I get to ask this question, because I'm sure
that we'll get a great answer.
What is your most controversial opinion?
(39:24):
My most controversial opinions were placed on me, not from me.
Okay.
So, like, things having to do with advocacy. I have to preface this: I don't
advocate, but some people felt that I was advocating cloning human Neanderthals.
So, uh, a controversial opinion that is yours, not attributed to you.
(39:47):
Along those lines, it's probably that everybody should seriously consider
getting their genome sequenced.
In particular, if they're of reproductive age.
You know, let's say 16 and up, especially for men, where that keeps going, that they
should know their carrier status.
Uh, and that could influence who they date, or various other things.
(40:09):
I think the idea of a dating app that is aware of your carrier status
is the most humane place to do it.
But the controversy is they either think that that's eugenics, which it
isn't, or it's controversial because you don't want to de-romanticize
something by being so technical.
(40:29):
But anyway, I think that's a huge missed opportunity.
It's more humane because, well, I'm glad we're not in the lightning round.
It's more humane because if you do it after you're pregnant, then you have a tough
decision about termination of pregnancy, which is tough for essentially everybody,
(40:50):
whether you're pro-choice or pro-life.
And if you do it after you're married, then you've got the tough decision:
are you going to have children with this person, which means you're
going to do in vitro fertilization, which is no walk in the park.
And so that's bad news.
But if you do it before you've even met the person, then it means that
out of a thousand people you could date, you're gonna, you know,
(41:11):
date 990 of them, at most.
And so you can eliminate a few, and there's no false positive
problem at that point.
There's definitely a false positive problem if you're doing IVF or, uh,
termination, or, or worse yet, you know, doing a surgery to remove organs
that might be at risk for cancer.
(41:32):
But there's essentially no false positive problem when you're rejecting 3%
of the potential suitors.
And at the risk of misattributing another quote to you, I think I've heard
you talk about this before, and it would go something like: there's a dating app,
like you said, and silently behind the scenes, you're getting screened out from
people who have the same carrier status as you, so that you're never matched with
(41:54):
someone who would, so that two recessive genes would, would come together.
Right?
So you'd, you'd kind of not even know that it was going on behind the scenes.
You would just be matched with people that you would not have this problem with.
That's right.
That's right.
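As an aside for readers: the silent screening described here reduces to a set-intersection check, matching two people only if their carrier gene sets don't overlap. The sketch below is purely illustrative, not the logic of any real service; the gene symbols and the `compatible`/`silent_matches` names are invented for the example.

```python
def compatible(carriers_a: set[str], carriers_b: set[str]) -> bool:
    """Two people are incompatible only if they carry variants in the same recessive gene."""
    return not (carriers_a & carriers_b)

def silent_matches(user_carriers: set[str], candidates: dict[str, set[str]]) -> list[str]:
    """Return candidates whose carrier sets don't overlap the user's.

    Neither party's carrier status is revealed; incompatible pairs are
    simply never shown to each other.
    """
    return [name for name, genes in candidates.items()
            if compatible(user_carriers, genes)]

# Hypothetical example: a Tay-Sachs (HEXA) carrier browsing three candidates.
user = {"HEXA"}
candidates = {
    "A": {"CFTR"},          # cystic fibrosis carrier: different gene, still a match
    "B": {"HEXA", "SMN1"},  # shared HEXA carrier: silently filtered out
    "C": set(),             # no known carrier variants: a match
}
print(silent_matches(user, candidates))  # ['A', 'C']
```

This mirrors the "reject roughly 3% of suitors" arithmetic: only the small fraction of candidates sharing a carrier gene is excluded, and no one is told why.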
That's accurate, and, and that's humane in addition to all the things I mentioned.
That's humane because another awkward time is you've decided to marry somebody,
(42:16):
and then you get the score, and then you decide not to do this. Or, for that
matter, people find out, you know, you're all ready to get married and then
the marriage is off. And then everybody knows that both of you are carriers, and
so, in a certain sense, maybe in less accepting parts of society, you both get
branded as, why should anybody date them?
(42:38):
And in fact, everybody, you know, 97% of people should date them, just
not the 3% that are mismatched.
And so, by avoiding anybody knowing your carrier status, I
think it's the most humane thing.
So, both more humane and no scarlet letter, no stigma, like
none of, none of those issues.
Yeah.
(42:59):
Right, so I think that's the time to do it. Or we could destigmatize everything,
but that's, I think that's harder.
It's hard to say which is harder, but it could be harder, yeah.
George, are there any examples of a, I guess, I think you called it a dating app.
Are there any examples of that in place, or is this, is this just an idea?
There are.
Not exactly that, not an app, I mean it's not quite, but it's like that,
(43:20):
which is Dor Yeshorim, which was started by a rabbi, I think mid-80s.
So, it's been around for a while, because he had, I think, four of his children had
Tay-Sachs, which is a serious disease, kills kids painfully at age four-ish.
And he just decided that there should at least be the
option among his congregation,
(43:44):
and anybody that could have similar afflictions.
And it's scaled up to, I think, eight or nine genes, typically, that are
enriched in the Ashkenazi population.
But in a certain sense, we're all at risk for those eight or nine
genes and about a thousand more.
And it's not clear why it hasn't spread.
It's been very successful in the populations that use it,
(44:06):
where it's lowered the risk of such births by at least a factor of 10.
And why it hasn't spread to other populations is, I don't think it's
because one population knows more or less science than the other ones.
It's something else.
It's not that one population has necessarily more genetic diseases.
It is true that some inbred populations have slightly higher,
(44:30):
but that's not the point. The point is, we're all at at least a 3% risk.
So, our final question for you, George, and you can take this in
multiple possible ways, it's up to you.
What applications of AI to biology keep you up at night?
Oh, yeah.
Well, first of all, I'm genetically narcoleptic, and so
(44:53):
nothing keeps me up at night.
About 30 seconds is the median time to falling asleep.
But what keeps me up during the day is anything involving discrimination.
So artificial intelligence could do more of that.
So, it's an interesting question.
When we worry about discrimination, are we worried about it
(45:13):
being too inaccurate?
In other words, we're, we're stereotyping an entire category of people that
maybe have, a priori, a low probability of living up to the stereotype.
Or are we worried it's too accurate?
Are we worried that it's not accurate enough, or that it's too accurate?
And I think it's case by case.
But in any case, it could have enough imprimatur of accuracy that
(45:38):
it would be used, but still inaccurate enough that it could be abused.
So that's one scenario.
The other scenario is they could use it to, you know, make personalized weapons.
You know, once, once the terminators come, then our human failures could
be, in a certain sense, public to the machines, even though they're not public.
In other words, I can't tell it, but they could figure it out.
(46:03):
I guess a follow-up question to that is, I know, like, in your work in gene
editing, you spend a lot of time thinking about what happens when this technology
becomes democratized and you can buy, like, a, a reagent kit for $10.
Like, how do we, what are a set of ethics and what are a set of protocols
that we can use in a world like that?
Is that at all similar to how you think about what's happening
(46:23):
with AI, either generally or AI in biology? Because a model that cost $10
million to create today will, you know, cost $10 to create five years from now.
So, how do we think about these powerful technologies that
are also being democratized at, like, a very, very quick pace?
How can we sort of balance safety and progress in that kind of world?
(46:47):
Well, we've demonstrated that we, that there's no such thing as a slippery slope.
That is to say, there are documented cases where we were able to keep
ourselves off the slippery slope, and other cases where we are, we're not,
at least not for the whole population.
There's always some percentage of the population that falls into the trap.
So, for example, speed limits: there is no magic
(47:11):
point where suddenly it becomes unsafe, but people tend to
stay pretty close to the speed limits.
Yeah, I guess the question was about either professional or societal norms.
So I know that there's been a lot of this in gene editing, where there
are groups that meet to discuss safety and come up with regulation.
Are there any lessons from that community that transport to
(47:37):
AI generally or AI in biology?
Right, so, you don't need special groups to monitor gene editing.
Uh, you have the FDA.
And the FDA is very effective at making sure we bring things out that are
safe and effective.
That also applies to medical devices, so it is possible the
(47:58):
AI would fall in that category.
But it's also possible you can evade the category by making
something that's not recognizable.
For example, dating apps that don't seem to be regulated by the
FDA, even though they could have a trillion-dollar impact on medicine.
In a certain sense, they're not medicine.
But, nevertheless,
even though people try to make regulations on top of regulations, so
(48:22):
they wanted to have a moratorium on gene editing on top of the FDA, which
has a moratorium on all new drugs, I thought that was a little crazy,
redundant, and it didn't really happen.
But with AI, if it does slip through the cracks, then there
should be some kind of safety mechanism.
Now, our track record for that is not so great in computing.
(48:43):
If you look at the Internet, very little of the foresight that existed
made it in, in time.
So, there's wide open doors for hacking, for computer viruses, for identity
theft, for abuse of children, and pornography, and so forth.
So, we didn't do such a great job there, and I hope we do a better job with AI.
(49:07):
Partly because of fantastic educational media, by which I mean the Terminator.
Alright, uh, maybe we'll edit it so that we don't end on a dour
note like that, but I think that, uh, yeah, that would be better.
Ending on the Terminator, great.
But I can't remember the last time I've had a conversation where we discussed
(49:29):
resuscitating woolly mammoths and dating apps within the same hour, so it's been,
it's been a really special conversation.
Thank you so much, George, for being on AI Grand Rounds.
It's been a pleasure.
Thank you.
It was great.
I look forward to hearing it.
Yeah.
Great, thanks.