Episode Transcript
Available transcripts are automatically generated. Complete accuracy is not guaranteed.
(00:00):
(gentle bright music)
- Welcome to the "Lessons from Lab and Life" podcast
brought to you by New England Biolabs.
I'm your host Lydia Morrison.
I hope this episode brings you some new perspective.
Today, I'm joined by Vladimir Potapov,
a bioinformaticist and 12-year NEB employee,
(00:21):
and an important part of the team that builds online tools
to aid both NEB scientists
and our customers in calculations and experimental design.
(gentle bright music)
Vladimir, thanks so much for being here with me today.
- Oh, hi, Lydia. It's good to be here.
- Could you introduce yourself to our listeners
and explain what it is that you do here
(00:43):
at New England Biolabs?
- My name is Vladimir Potapov.
I'm a research scientist.
I'm in the research bioinformatics group.
So at NEB, we have a large research department
with various scientific divisions,
and part of the role of the research bioinformatics group
is to interact with scientists within the research department
and help them analyze their data and experiments
(01:04):
and conduct our own research.
- And NEB offers a variety of open-access tools
that are available online to support life science research.
Could you tell us about the tools that NEB creates
and makes available for the public to use?
- Oh, yes.
There was like a whole set of tools aimed at different users
at different aspects of the work.
(01:27):
So for example,
you can go through the product selection tools
that help you choose a particular product.
Or for example,
we have a set for the restriction enzyme tools
where you can find the optimal enzyme
for your particular experiment,
or you can find the buffer
that is optimal for those enzymes.
We also develop tools for other people
to, for example, do DNA assembly
(01:47):
using NEBuilder HiFi assembly, or Gibson assembly,
or golden gate assembly.
We also help people to design primers for mutagenesis,
and different calculators just to do different conversion
between RNA, DNA,
and different physical chemical properties
of those molecules.
- So really, a wide variety of tools
that range from everything from helping make calculations,
(02:11):
to helping design primers, to full experimental design?
- The goal is to simplify work of other users,
either to analyze their data or to design their experiments.
And some of those tools we are using internally
in the research
and when it's relevant to people outside.
(02:33):
So we provide also the tools for the customers.
- Yeah. These are really valuable tools.
And I know our customers really appreciate all the effort
that you and the other bioinformaticists and software
developers at NEB have put into creating these tools.
- Exactly.
- So you were integral in the development
of our NEBridge Ligase Fidelity Tools,
(02:55):
which are used for aiding in the design
of golden gate assemblies.
Could you share with us a story about how that suite
of Ligase Fidelity Tools came about?
- There's an interesting story behind Ligase Fidelity Tools.
Actually, that goes back to the beginning of what I said,
that, you know, at NEB, we have a large research department
and we conduct a lot of research in purely basic research
(03:18):
and applied research.
And part of that research,
we are trying to understand properties of different enzymes.
For example, T4 DNA Ligase.
As part of that particular project,
we are trying to understand mismatched ligations
of overhangs that can be carried out by T4 DNA Ligase.
And so we designed a large experimental study,
and from that experimental study,
(03:38):
we got a lot of data on biases
and preferences of T4 DNA Ligase
in ligating those overhangs.
And you know, as a scientist, we became sort of interested.
You know, sometimes we have a particular pair of overhangs
and we're trying to understand how these overhangs
are going to be treated by T4 DNA Ligase.
And originally, we had a large spreadsheet,
(04:00):
and we would go and check every individual overhang,
but bear in mind that is somewhat tedious.
So we created a tool
where you can provide a set of the overhangs
and the program will automatically extract that information
from our experimental data.
And we quickly realized the value of that
because it very easily allows us to see
how good overhangs are.
Are they compatible?
Can they be used in golden gate?
(04:21):
So to summarize it,
you know, it comes from our internal research program
where we're trying to understand properties of T4 DNA Ligase
in the context of overhang ligation.
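The lookup described above, where a set of overhangs is checked against experimental ligation data, can be sketched roughly as follows. This is a minimal illustration and not NEB's implementation: the `counts` table, the function names, and the numbers are all hypothetical, and the fidelity estimate (correct ligations divided by all ligations involving that overhang) is a simplifying assumption.

```python
# Toy sketch: estimate how faithfully each overhang in a set pairs with its
# Watson-Crick partner, given a table of observed ligation events.
# All data below are made up for illustration.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(overhang: str) -> str:
    """Return the reverse complement of a DNA overhang."""
    return overhang.translate(COMPLEMENT)[::-1]


def junction_fidelity(overhangs, counts):
    """Map each overhang to correct ligations / all ligations it takes part in.

    counts: dict mapping frozenset({a, b}) -> observed ligation events
    (hypothetical stand-in for an experimental dataset).
    """
    # Everything present in the reaction: the overhangs and their partners.
    present = set(overhangs) | {reverse_complement(o) for o in overhangs}
    result = {}
    for o in overhangs:
        partner = reverse_complement(o)
        correct = counts.get(frozenset((o, partner)), 0)
        total = sum(counts.get(frozenset((o, p)), 0) for p in present)
        result[o] = correct / total if total else 0.0
    return result


# Hypothetical ligation counts for two overhangs and their partners.
toy_counts = {
    frozenset(("AACG", "CGTT")): 950,  # correct pairing
    frozenset(("AACG", "TACC")): 50,   # mismatch ligation
    frozenset(("GGTA", "TACC")): 900,  # correct pairing
    frozenset(("GGTA", "CGTT")): 100,  # mismatch ligation
}

fidelity = junction_fidelity(["AACG", "GGTA"], toy_counts)
for overhang, value in fidelity.items():
    print(overhang, round(value, 3))
```

With real data the table would cover every overhang pair measured in the study, replacing the manual spreadsheet lookup mentioned in the conversation.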
- Yeah, I think that's a really powerful example
of how the research,
the basic scientific research that we're doing
at New England Biolabs can produce these large datasets
(04:44):
that can really help our customers inform decisions
about what overhangs are best for them,
or maybe what primers are best for them,
or what enzymes are best for their experiments.
So you talked about the NEBridge Ligase Fidelity Viewer,
are there other tools
in the NEBridge Ligase Fidelity toolset
(05:06):
that you could tell us about?
- Yes. Actually, there are two complementary tools.
And these two complementary tools
address a slightly different aspect
of the golden gate assembly workflow.
There are a lot of people at NEB
and people outside that, you know, they have modular parts
that they would like to assemble together.
And what they need to know is which particular overhangs
will be good to bring those parts together.
(05:29):
In that particular case, I say, for example,
"I need a set of the five overhangs
which are optimal for creating that construct."
So for this, we developed a tool
which is called NEBridge GetSet Tool.
So essentially, the user can specify
a particular enzyme and experimental conditions.
For example, I would like to say,
"I would like to have a set of five,
or 10, or 15 overhangs."
(05:49):
And that tool will automatically pick the best set
of overhangs for a given experimental condition.
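Picking a set of a requested size can be sketched as a small search problem. The example below is only an illustration under stated assumptions: the `mismatch` scores are invented, the function names are hypothetical, and the exhaustive search shown here is practical only for a handful of candidates. The transcript does not describe how the GetSet tool actually searches.

```python
# Toy sketch: choose the subset of candidate overhangs with the least
# total cross-talk. Scores are invented; a real tool would use measured
# ligation data and would also exclude palindromic or complementary pairs.
from itertools import combinations


def cross_talk(overhang_set, mismatch):
    """Sum hypothetical mis-ligation scores over every pair in the set."""
    return sum(
        mismatch.get(frozenset(pair), 0)
        for pair in combinations(overhang_set, 2)
    )


def pick_best_set(candidates, size, mismatch):
    """Exhaustively pick the `size` candidates with the lowest cross-talk."""
    return min(
        combinations(sorted(candidates), size),
        key=lambda subset: cross_talk(subset, mismatch),
    )


# Hypothetical pairwise mis-ligation scores (higher means worse).
toy_mismatch = {
    frozenset(("AACG", "GGTA")): 5,
    frozenset(("AACG", "CTTC")): 1,
    frozenset(("GGTA", "CTTC")): 2,
    frozenset(("AACG", "AGGA")): 0,
    frozenset(("GGTA", "AGGA")): 7,
    frozenset(("CTTC", "AGGA")): 1,
}

best = pick_best_set(["AACG", "GGTA", "CTTC", "AGGA"], 3, toy_mismatch)
print(best)
```

For sets of 10 or 15 overhangs drawn from all possible four-base overhangs, the number of combinations is far too large to enumerate, so some heuristic or stochastic search would be needed instead.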
So the other aspect of that workflow,
sometimes people are working already with an existing,
for example, let's say, large nucleotide sequence
and they would like to find a way to split
that large nucleotide sequence
into a set of smaller fragments.
(06:10):
And they would like to find optimal points
within that nucleotide sequence that can be used
for generating overhangs.
One of the nice features of the golden gate assembly,
it's a scar-less assembly
so you can reassemble your construct without leaving marks,
but you need to find those optimal overhangs
in the sequence.
And for this particular task,
we developed NEBridge SplitSet Tool
(06:31):
where the user can introduce a particular nucleotide sequence
and indicate approximate regions
where they would like to introduce the cut sites.
And the tool will automatically optimize the cut points
based on the specified experimental conditions.
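The idea of choosing cut points inside approximate regions can be sketched as follows. This is a simplified illustration, not the SplitSet algorithm: the function names, the example sequence, and the windows are hypothetical, and instead of using experimental fidelity data the sketch applies only basic structural rules (no palindromic overhang, and no two overhangs that are identical or complementary to each other).

```python
# Toy sketch: pick one cut position per user-supplied window so that the
# resulting four-base overhangs are mutually compatible.
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(overhang: str) -> str:
    """Return the reverse complement of a DNA overhang."""
    return overhang.translate(COMPLEMENT)[::-1]


def choose_cut_points(sequence, windows, length=4):
    """Return [(position, overhang), ...] for the first valid combination.

    windows: list of (start, end) index ranges, end exclusive, giving the
    positions where an overhang is allowed to start. Returns None if no
    combination passes the checks.
    """
    last_start = len(sequence) - length + 1
    options = [range(start, min(end, last_start)) for start, end in windows]
    for positions in product(*options):
        overhangs = [sequence[p:p + length] for p in positions]
        forms = set(overhangs) | {reverse_complement(o) for o in overhangs}
        # 2 * n distinct forms means: no palindromes, no duplicates,
        # and no overhang that is the reverse complement of another.
        if len(forms) == 2 * len(overhangs):
            return list(zip(positions, overhangs))
    return None


example_sequence = "TTGAATTCAGGCATGCAACC"  # hypothetical sequence
example_windows = [(3, 6), (11, 14)]      # hypothetical approximate regions
cuts = choose_cut_points(example_sequence, example_windows)
print(cuts)
```

In this toy case the palindromic candidates AATT, CATG, and TGCA are skipped, and the first combination that passes is kept. A data-driven tool would instead rank the valid combinations by their measured ligation fidelity.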
- How do I see the alignment of these overhangs
and these recommendations from the GetSet tool?
(06:52):
How do I see that affecting the accuracy
of the final assembly or the success of the final assembly?
- Yes, the overall goal,
for example, a technique like golden gate assembly,
is to assemble a larger construct from smaller pieces.
And ultimately, the success of the golden gate assembly
depends on the set of the chosen overhangs.
(07:13):
And those tools are designed to automate
and help users to choose those overhangs
based on the experimental data that was derived
during the course of the experimental studies.
- So really, it can take a lot of the trial
and error of finding these optimal combinations of overhangs
(07:35):
out of the experimental design for researchers?
- Yes, that's exactly the point:
how to pick the optimal set in a user-friendly way.
- So you've mentioned a couple of tools
in the NEBridge Ligase Fidelity toolset.
Are there any other tools that are available within there?
(07:56):
- So when we need those tools for outside users,
external users, actually there was a lot of interest.
And many people, you know, when they go and use our tools,
they find because if you work with one sequence,
two sequences.
But really quickly,
we started getting the request from people
who would like to do it for a lot of sequences at once,
and we were thinking about how to enable those users
(08:19):
to perform this task.
For these, we developed a particular set of the tools
which we call NEBridge SplitSet Lite API.
Essentially, API is an application programming interface,
and it's a programmatic way for users
to automate their tasks.
So the people who are familiar with the programming,
they can use our API
(08:39):
and do a batch analysis of their sequences.
There are people
who don't want to use a programmatic access,
but they still would like to analyze many sequences.
For this particular situation, we developed a tool
which is called NEBridge SplitSet Lite High Throughput,
where the people can provide the list of the sequences
in various formats.
And the tool has a nice graphical user interface
(09:02):
that allows them to accomplish the same task
without relying on a programming interface.
Also, as an option, we provide the overhang optimizer code.
That code was originally used in our internal research
to develop all those tools.
And we also make it available to other people
who would like to maybe take that code
and run it internally or adjust it to their own needs.
(09:27):
- Wow, I actually had no idea that we share so much
of the build information with our customers,
and we really put that API interface in their hands as well
to allow them to use the tools to the best capability
for their particular use case.
(09:48):
How many sequences are we talking about
when we're thinking about high throughput users?
- Well, realistically, it can be hundreds,
thousands of sequences.
It mainly depends on the time it takes
to run the calculations.
But for a set of about 100,000 sequences,
that's within seconds to minutes.
- Wow, so a really powerful, fast tool
(10:11):
to enable folks to quickly look through their data
and see what's successful
and what might improve outcomes for their experiments.
What are some of the challenges
that you've faced in developing these tools?
- As a bioinformatics scientist in the research department,
you know, we develop tools internally
(10:34):
and externally to address specific problems
and usually, the way we develop the tools
is to help conduct the research.
When we conduct the research, the goal can be shifting
because you're trying to develop a certain assay.
And when you start working on it,
you realize that you need some improvements, adjustments.
When you change the experiment,
(10:54):
usually you have to adjust your analysis workflow.
So essentially, when you think about the resulting software,
it's a history of following the research projects.
And depending on how the research is going,
it might be necessary to take several iterations
through development and analysis pipeline.
(11:14):
- And how long does it take to develop these tools?
How long did it take
to develop the NEBridge Ligase Fidelity toolset?
- I mean, it's part of the long-term research program at NEB,
so, you know, the project has several stages.
One stage is collecting the data.
The second stage is understanding
(11:35):
how the data is organized and how to process the data.
Depending on particular project,
it can take from weeks to months.
- That's actually a lot faster than I thought it would take.
Probably the gathering of the empirical data
is one of the longest factors, yes?
Or is it figuring out how to sort through
(11:56):
and organize the data to make it make sense?
- I would say it's always project-dependent.
Sometimes experimental part is a difficult part.
Sometimes computational part is difficult part.
But usually, it always involves several iterations.
And once you start analyzing your data,
you start to find something you didn't expect
(12:16):
that prompts you to either change your experiment
or change the analysis workflow.
- I know that I've been using NEB's tools for a long time,
starting from when I was in graduate school
and I would consider the catalog a tool.
I think with the advent of our online tools,
starting with, you know, NEBioCalculator
(12:39):
and the Enzyme Finder,
all the way to the newest tools that we've released today,
including the NEBridge Ligase Fidelity Tools,
these are incredibly valuable assets
to researchers designing and performing experiments.
And I just think a huge kudos should go out to your team
(13:00):
for making all the tools
from the inside research at NEB available to our customers
in an open access sort of way
where, you know, we're not just sharing a magic black box
where they're putting in their sequence, and their primers,
and their annealing temperature, or whatever,
and we're spitting out an answer for them.
(13:20):
But having the API available, having the code available
for individuals who have that skillset
to be able to integrate it into their workflows
and have it be the most powerful tool it can be
in their hands,
I just think it's pretty incredible work that you
and your team do.
So I wanted to thank you so much for being here today
to share the details of it with us.
(13:42):
- Thank you.
(gentle bright music)
- Thank you for joining us for this episode
of the "Lessons from Lab and Life" podcast.
Please check out our show's transcript for helpful links
from today's conversation.
And as always, we invite you to join us for our next episode
when I'm joined by three incredible young scientists
(14:02):
whose work was foundational to the development
of RNA vaccines.
Olubukola Abiona, Geoff Hutchinson,
and Dr. Cynthia Ziwawo share what it was like
to be on the front lines of the development
for the first COVID vaccine.