Episode Transcript
Available transcripts are automatically generated. Complete accuracy is not guaranteed.
Jerod (00:04):
Welcome to the Practical AI podcast, where we break down the real-world applications of artificial intelligence and how it's shaping the way we live, work, and create. Our goal is to help make AI technology practical, productive, and accessible to everyone. Whether you're a developer, business leader, or just curious about the tech behind the buzz, you're
(00:24):
in the right place. Be sure to connect with us on LinkedIn, X, or Bluesky to stay up to date with episode drops, behind-the-scenes content, and AI insights. You can learn more at practicalai.fm.
Now, onto the show.
Daniel Whitenack (00:49):
Welcome to the Practical AI podcast. This is Daniel Whitenack. I am CEO at Prediction Guard, and I'm joined as always by my cohost, Chris Benson, who is a principal AI research engineer at Lockheed Martin. How are you doing, Chris?
Chris Benson (01:04):
I am doing just fine. It's been a good day, good fall, and lots of cool things happening to talk about.
Daniel Whitenack (01:12):
Yeah. Yeah.
I'm excited for today's discussion. Although I have to say I feel a bit outnumbered by the Chrises, but there are some cool Chrises on the show today, including yourself, Chris Benson. But we've also got with us Chris Aquino, who is a software engineer at Thunderbird.
(01:32):
Welcome. We won't call you Chris B. Actually, your last name starts with an A, so maybe you're Chris A, and Chris Benson is Chris B, and that just works out because his name starts with B.
Chris Aquino (01:44):
That's perfect.
Daniel Whitenack (01:44):
Yeah.
Chris Aquino (01:46):
Hello. Hello. Thank you for having me. I know that we had some rescheduling issues early on, but we're here now.
Daniel Whitenack (01:53):
We're here now, and glad we are, yeah, because we've had a few guests from Mozilla, or projects that Mozilla has been involved with, a couple times in the past, and it's always a great discussion. And of course we love the perspective that Mozilla brings, but also projects like Thunderbird. Could you give us, maybe just starting out, a little
(02:17):
bit about your personal background and kind of how that eventually led you into work on Thunderbird?
Chris Aquino (02:24):
Yeah, so my personal background: I've been a web developer for, oh my goodness, two decades. Let's go with two decades ago is when I started, and I worked for various companies that did different things. In addition to being a web developer, I've done teaching and
(02:45):
authoring. Most recently, well, prior to Thunderbird, I was at SurveyMonkey for a little while, and then the great layoffs of 2022 and 2023 hit. I was one of the newer engineers, so I got cut.
And as I was applying for jobs, I was like, you know, I've always wanted to work for Mozilla. Let's just see what
(03:07):
their job board looked like. And that's when I submitted an application and got word back from somebody who was clearly not a recruiter. The director of product emailed me, you know, just emailed me and said, Hey, I'd love to talk with you. And yeah, I initially got hired on to build just weird
(03:29):
stuff that was outside of the realm of the Thunderbird desktop.
His name is Ryan, Ryan Sipes. He's always got something cooking. He had some interesting ideas for a set of products that he wanted to explore. And so he's like, yeah, would you be interested in joining us and working on some
(03:51):
of this weird stuff? And so I said yes.
Chris Benson (03:54):
As a quick follow-up on that, just for the guests that may not be familiar with Mozilla and the context of Thunderbird within Mozilla, could you talk a little bit about that for a moment, just to set the stage?
Chris Aquino (04:07):
Yes, I will do my best. It is a little convoluted, as Wikipedia can tell you. So, Thunderbird. Well, first of all, for those who don't know, for the kids out there who are like, What is email? Thunderbird is a desktop email client. And for a certain generation, those three words strung together mean nothing.
(04:27):
But it is a twenty-plus-year-old open source project which originated at the company now known as Mozilla. If you're familiar with the Firefox browser, that is made by our sister company, the Mozilla Corporation. And Thunderbird and the Mozilla Corporation both fall within, I guess you could say,
(04:49):
the guidance of the Mozilla Foundation, which is a nonprofit. And as of, oh, I guess it was five-ish years ago, maybe a little bit longer than that, nobody at Mozilla was maintaining Thunderbird. So this application that was not Firefox
(05:10):
had, you know, maintenance that they didn't want to deal with: engineers need to maintain this almost twenty-year-old C++ code base. And that was for, you know, management reasons. They're like, okay, we could just hand this to the community or we could just shut it down. And the director of
(05:31):
product here, who's now the director of product, Ryan, said, Wait a minute. Listen, give me a shot.
Let me try something out. Let me see if I can turn this into something more than it is now. And they did whatever legal things and paperwork were necessary to spin it off into its own entity. We are known as Thunderbird or the Thunderbird
(05:53):
Project, but if you look us up on the internet, we're officially MZLA, which sometimes I feel is one of those acronyms that doesn't really mean anything. But that's what it says on my resume.
So that's Thunderbird in a nutshell.
Daniel Whitenack (06:08):
Now, yeah, you mentioned this kind of history, this sort of desktop email application history. Just for context, I'm sure there are many people listening that do remember those sorts of days, but others might have just always used Gmail in a browser or something like that.
(06:29):
But there's a whole generation of us that used something, whether it's Outlook, which a lot of people have used. I remember installing Ubuntu for the first time earlier in my career and using Thunderbird in that context on my computer. Now you
(06:49):
mentioned there's maybe a different, broader vision for this now.
Any context you could provide there in terms of the way people use email now versus those days when Thunderbird was being used as a desktop application? What is, I guess, the
(07:09):
focus and transition, if you will, or at least some of the things that are being thought about?
Chris Aquino (07:14):
Yeah. I'll start with sort of the ethos, like why is Thunderbird still around? There are still people who use email. It works perfectly well. But unlike with, say, a webmail-based provider, these folks are not interested in having ads
(07:35):
sort of injected for them, and maybe they want to be able to opt in to AI.
They don't want an AI just, like, ever-present and reading their emails. So with Thunderbird, it's free and open source. No ads ever, forever. And you know, your email is your own. It gets downloaded to your computer.
So if you lost internet access, you can still reach your email.
(07:58):
If you lost power and your laptop is open to Thunderbird, you've still got it, right? All those emails that you downloaded, they're right there. So I think that the way that email is being used is different now.
One of the great things about Thunderbird back in the early days, and it's still true today, is it's very easy to manage
(08:21):
multiple email accounts. You know, I remember a time when it was like, oh, you have more than one email address? Amazing. You must be really important. And now you can just sign up for email addresses, dozens of them if you want, and, you know, have them each dedicated to a different purpose.
You can still do that with Thunderbird. However, one of the
(08:44):
things that people are encountering now, as we all are, is a certain amount of information overload, right? You're subscribed to so many newsletters and mailing lists, and you're collaborating on some side project, and then you've got your main work email. How
(09:06):
do you read all that email? How do you deal with all of it? So those are the sorts of things that we're thinking about now: how do we empower the user to better deal with just this huge influx of information that they're getting every day, every hour?
Daniel Whitenack (09:23):
So first off, thank you for validating my multiple-email use. Every once in a while, my wife gives me a hard time, because I generate a new email address when I get frustrated with my email feed or have a specific purpose, and it seems like no one ever knows which
(09:46):
email to email me at, which maybe is a strategy in and of itself. I'm not sure. So thanks for validating that. But yeah, I know that there's a good amount of AI intersection with email, whether that be from the webmail side, so in Gmail or
(10:10):
other things with Gemini, or email clients that maybe are specifically geared towards AI features, like a Superhuman or these sorts of things.
Not asking you to comment on every single one of those, but maybe in general, how, from your perspective, have you viewed this
(10:31):
gradual integration with AI? And what sort of categories could people have in their mind of the kinds of ways that AI is being applied within email, and some of those trade-offs that maybe you're making when you're using those features? Of course, I know we'll talk about privacy in some of what we talk about with
(10:53):
Thunderbird. But just from your perspective, I would be curious how you categorize it; AI in email could mean a lot of things. Could you help people understand maybe some of that landscape?
Chris Aquino (11:06):
Sure. Yeah. I think there are two main ways that I see, I mean, in my own Gmail account. Because I will use a device that doesn't have Thunderbird on it; I am a longtime distro hopper, so maybe I'm using a distro that doesn't have Thunderbird prepackaged, and I'm just trying things out.
(11:26):
I have found that the automatic summarization is a thing where, you know, Gmail's like, Hey, Gemini can do this for you. Or all of the autocomplete that it tries to helpfully offer me. I feel like it's a little creepy sometimes, especially depending on what the email is.
Like, if I'm talking to my doctor over email, and it's like,
(11:49):
clearly this LLM has read this very private information, I'm like, Oh, how do I turn this off? But I do understand that for a lot of people, those two features are time savers. You have this way of compressing more human time into your day by offloading it to an LLM. So that's great. It's a time saver,
(12:12):
and it's great that people have access to that.
But there are things you lose out on. You're gaining time, but what you're trading away are things like tone, right? Some summarization models will kind of strip out all the tone. Or if it's bulk-summarizing multiple emails,
(12:35):
you know, the email from your mom doesn't really sound like your mom. It is literally just, like, your mother's coming to town this weekend. So maybe dehumanization of email is kind of the wrong term, or maybe a little extreme, but everything gets essentially normalized to the tone of the LLM.
(12:56):
And yeah, to your point, the privacy aspect, that's kind of the big one for us here at Thunderbird. We're very privacy-respecting, privacy-preserving, because a lot of our users choose to use Thunderbird because of that. They want to manage and own their own email. They don't want
(13:19):
their personal emails harvested for marketing purposes or for training data.
Sponsors (13:40):
Well, friends, when you're building and shipping AI products at scale, there's one constant: complexity. Yes. You're bringing the models, data pipelines, deployment infrastructure, and then someone says, let's turn this into a business. Cue the chaos. That's where Shopify steps in, whether you're spinning up a storefront for your AI-powered app or
(14:01):
launching a brand around the tools you built.
Shopify is the commerce platform trusted by millions of businesses and 10% of all US ecommerce, from names like Mattel and Gymshark to founders just like you. With literally hundreds of ready-to-use templates, powerful built-in marketing tools, and AI that writes product descriptions for
(14:21):
you, headlines, even polishes your product photography, Shopify doesn't just get you selling, it makes you look good doing it. And we love it. We use it here at Changelog.
Check us out at merch.changelog.com. That's our storefront, and it handles the heavy lifting too. Payments, inventory, returns, shipping,
Jerod (14:41):
even global logistics. It's like having an ops team built into your stack to help you sell. So if you're ready to sell, you are ready for Shopify. Sign up now for your $1-per-month trial and start selling
Sponsors (14:53):
today at shopify.com/practicalai. Again, that is shopify.com/practicalai.
Daniel Whitenack (15:08):
Well, Chris, we were starting to get into a little bit of, I guess, the intersection of the ethos of Thunderbird with these sorts of AI features. Now, could you help us understand? I guess, like, we talked about autocomplete, we talked about summarization, for example. There are various
(15:30):
mechanisms by which these features can be implemented in an email client or an application or web application, whatever that is, in terms of the actual AI model: where it sits, how the data flows, what the model is trained on, maybe. Could you help us understand that piece? What is the buffet
(15:52):
of options available to us in terms of the integration point in the flow of data?
Chris Aquino (16:00):
Sure. Yeah, we have discussed at length different ways that we could approach this. So let me begin by saying that this is experimental work in bringing an AI assistant to Thunderbird. This is not baked into Thunderbird. We're not going to turn it on for users.
(16:21):
It's not going to be automatic or anything like that. Instead, what we've done is built it as, I'm going to call it, sort of a companion for right now. We'll put a pin in that and return to it, because there are a lot of decisions we had to make because of that approach. Now, the options that were available
(16:43):
to us: we could just add model inference to Thunderbird itself.
This is, in my opinion, like, yeah, but now this is us turning it on for all users. It's like, just add the model in there and allow us to do inference locally. It's private, right? And that's great. However, we're not so invested in this idea
(17:07):
that we want to put that on the desktop team's roadmap.
When I say the desktop team, just a little plug for the mobile team at Thunderbird: you can grab it for Android now, it works really well, and iOS is coming soon. But as far as the desktop client goes, they already have their
(17:27):
roadmap, and we just kinda wanna run our AI experiment parallel to that. Okay, so that's one option.
A second option is, like, well, could we run inference in a separate application, also locally? Yes, we can. And we've
(17:47):
kind of poked around with that, but if we want to roll this out, we don't necessarily want to require users to download a second application. Okay. So, how do we split the difference here?
Alright. So what if we called out to an API, like one of the cloud providers, for, you know, any of the models that would be good at helping you out with email, the
(18:10):
typical tasks of summarization or reply generation? So we decided that could be a thing, because it doesn't require installing anything, you know, heavyweight and additional, but that brings up a separate problem of, like, where are you sending the email data to? And I will happily be talking more at length about that.
Daniel Whitenack (18:31):
Yeah. I think in all my discussions in my day-job work, that's often what it comes down to: we would love your AI features, but where is the data going? So I definitely understand that. And I assume that there's definitely this
(18:53):
tension of doing things locally and putting that on, like you said, the desktop roadmap. But also there would potentially be limitations in terms of the kind of model you could use. Or even if you could use a model that worked
(19:14):
well, it might just destroy all of the battery of the device on which it's running, or that sort of thing.
Were those things also part of that conversation?
Chris Aquino (19:25):
They definitely were, especially regarding the second application that just ran a model inference process in the background. I don't know about you, but my laptop, my work laptop, is not super fancy. It generates, you know, a few tokens per second, which is just not fast enough for this, I
(19:48):
mean, for me to work on the thing that I'm working on. So that was a big concern.
That, and, you know, a lot of our users are on Linux, and they're running Linux because maybe they're continuing to use their perfectly good hardware from ten years ago, which most certainly cannot do any sort of local model
(20:10):
anything. But yeah, as a laptop user, I love my battery life. I don't want to completely destroy it by trying to summarize a batch of today's emails.
Daniel Whitenack (20:21):
Yeah, yeah, that makes sense. So you mentioned this idea of using a model that is behind a remote API. We're kind of narrowing in; there's a selection of ways that could happen.
And there are various approaches around maintaining
(20:44):
privacy there, from just using an API that explicitly doesn't store certain data, at least according to their terms, or doesn't train on your data, at least according to their terms. There's also, I know, people exploring
(21:05):
homomorphic encryption and all of these things to keep data private. Then there's end-to-end encryption, and there are all sorts of ways that you could think about privacy in that context. What were the main pillars of what was important for you all to consider? Was it where
(21:25):
data is stored at rest?
Is it the openness of models, or whether you were hosting those or a third party was hosting those? What were kind of the main topics that came up once you dipped into that remote inference side of things?
Chris Aquino (21:46):
That is a great
question. And I'm going to try
to condense the story down.
Daniel Whitenack (21:51):
Okay, great.
Chris Aquino (21:52):
It'll be kind of
like, you know, The Hobbit and
then the three parts of the rest of the story.
Daniel Whitenack (21:58):
Sure.
Chris Aquino (21:59):
Wow, you should just totally, like, revoke my nerd card now for not coming up with it off the top
Daniel Whitenack (22:05):
of my head. The Lord of the Rings. That's the one. Exactly.
Chris Aquino (22:09):
I just lost half of your listeners.
Daniel Whitenack (22:10):
The Return of the King. Yeah, you got it. At least, I might not know about the cool encryption stuff that you're about to talk about, or whatever that is, but at least I have that one.
Chris Aquino (22:26):
Nice. Well, our
powers combined.
Daniel Whitenack (22:29):
The powers
combined, it takes a community.
Yes, yes.
Chris Aquino (22:34):
So let me start this little story off with this: the very earliest experiment was a Thunderbird add-on. Okay. What is a Thunderbird add-on? Well, you and your listeners are probably familiar with browser extensions. Okay.
(22:55):
So, fun fact: Thunderbird under the hood uses the Gecko rendering engine from Firefox. So we have access to APIs that make it possible for something almost exactly like a browser extension to reside within Thunderbird. And
(23:18):
if you're not familiar with browser extension development, it's basically HTML, CSS, and JavaScript. So we started by writing something that was the most 1990s-looking web page, just sort of jammed into an add-on and displayed in a new tab in Thunderbird. And I mean,
(23:42):
you know, it looked like an engineer built it, and that's totally fine.
But yeah, that's when we started making calls: we started off with OpenAI's API, just handing off a number of emails to, I think it was, GPT-4. And that did an okay job. But that was like, okay, it does work. How do we build this
(24:05):
out? And then we started trying to get better results with some prompt tuning and whatnot.
And then as we started trying to use it with more people within Thunderbird, we found out, like, these people's emails are sensitive. We need to
(24:25):
do something about this. So we started shopping around for some sort of cloud-based provider that could give us a guarantee of, yes, we do not store your data; no, we're not using it for training. And we were in contact with a couple of different companies, some of whom just sort of sent us to one of the pages on their website, which told us nothing
(24:49):
useful, and couldn't give us a good guarantee.
So that's when we started talking to the folks at Flower Labs. And I know that you have had several guests from Flower on the show. I just want to say that they are so terrific to work with. They took care of, I mean, really, they took care of
(25:11):
all of our needs.
They moved things around on their own development roadmap and gave us early access to things like end-to-end encryption, and access to their newest product, Flower Intelligence. So, for the listeners who hadn't heard
(25:31):
of Flower or listened to previous episodes, they're known for their federated learning SDK, right? They build software that does learning on individual nodes and then shares the learnings with a centralized server. All very cool stuff. We didn't need that, though. We needed API
(25:57):
access to a private LLM.
What we got in addition to that: a nice SDK in TypeScript, and the end-to-end encryption. And, oh right, they found a model for us, and then they did some post-training on email
(26:18):
eval system so they could, like, fine-tune. They help produce prompts through the eval system. I mean, they've been incredible.
Chris Benson (26:28):
I'm just curious, as you talk about this, and especially having had multiple folks from Flower on the show in the past: you're talking a little bit about how you got into the collaboration, but how did Flower come into the picture to begin with for you guys? You know, how was that connection made, and as you were looking at
(26:49):
that potential connection, how did you know that it was a good fit for this new strategy that you've been laying out?
Chris Aquino (26:57):
I have the most boring yet magical answer to that question. It just kind of fell into our laps because of Mozilla. Alright, remember I was talking about the nonprofit Mozilla Foundation under which Thunderbird sits? They're an investor in Flower. And so Mark Surman connected Ryan, my boss's
(27:21):
boss's boss, with Daniel from Flower.
And I just ended up on a Zoom with him one day. Ryan introduced us and said, All right, take it away. And that's when the collaboration began. And so that's the thing: I guess having a human face to go with the company made me feel good about those guarantees of, like, no, we're literally not in the business of harvesting data.
(27:44):
Like, we're gonna set up this infrastructure, but we literally can't help you debug your prompt, because it's encrypted now. They were so helpful at every step, except when I sent some bad data; they were like, we really wanna help you, but we built it in a way that we can't see the data. So it was a
(28:09):
fortuitous connection, thanks to the reach of the Mozilla Foundation.
Daniel Whitenack (28:13):
Well, Chris, you started kind of unveiling some of what made this particular route of experimentation useful for you all. Could you help us understand at maybe just a slightly deeper level? Like, I could, let's say, spin up a model in vLLM in a VM on GCP or
(28:39):
wherever I host things, and then connect to that over an API. What makes the hosting of the private model within the Flower system different? Because, like you say, it's not federated, but there is still kind of more there. You mentioned, of course, the post-training that you talked about.
(29:00):
But as far as the inference side, could you help us understand that a little bit more?
Chris Aquino (29:04):
Yes. So as I mentioned, they've taken care of all of that, which is the big benefit. They've really been a great technical partner while we conducted these experiments. And from the privacy aspect, my coworkers no longer have to prune their inboxes and remove anything sensitive
(29:25):
before trying out what we have, at this point, dubbed Thunderbird Assist. It is your personal executive assistant within Thunderbird.
That's the idea, anyway. Thanks to the guarantees made by Flower, we were then able to try things out. I mean, we didn't land on the
(29:47):
current model immediately. We tried different BERTs, different BARTs, RoBERTa, all the various summarization models. And then one day, the summaries got way better. And I said, What did you do? What is this magic? They had switched to one of the Llamas
(30:09):
from Meta that was trained for conversation. And it just worked better for email content. So at that point, we stopped thinking about the prompt, because they had squared that away. And prior to that, they had found a model that could do the task very well.
So then we got to the part where we're like, okay, it's
(30:32):
really great at summarization and reply generation. The third feature that we worked on, you know, this is the biggest thing we aimed for. Because when you think about it, like, oh hey, you're doing email summarization and reply generation, that's great, but that's basically the hello world of LLMs: summarize this text.
(30:53):
So, we had been working on this feature, and it did not work well. We refer to it as the daily brief. It was intended to be an executive summary of your recent emails. This is when I learned the definition of overprompting.
(31:13):
What I would do is take however many emails arrived in the last twenty-four to forty-eight hours, ship them off to the model, and then ask it, oh, can you find the most important messages, extract all of the highlights and any action
(31:34):
items, and return those back to me, grouped in this particular way, with links back to the original emails?
I mean, the garbage that I got back sometimes was epic. So I then learned that, okay, what I need to do is split this up into multiple requests. Right?
(31:55):
Let's send a batch and only ask for importance. Some of your listeners are like, okay, that's highly subjective, and I will return to that momentarily.
But when we provided the emails, we also provided the unique message IDs from Thunderbird. That way I could then use those
(32:15):
as an index, grab the original emails again, and send off a second request, which was like, okay, for these, I want you to find the action items. And then take the same batch and ask for highlights and crucial information. The formatting task never worked well, so I have lots of feelings about that, so
(32:39):
we'll put a pin in that.
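The batching idea described here, narrow single-purpose requests keyed by Thunderbird's unique message IDs, can be sketched roughly like this. The message shape, helper names, and prompt wording are illustrative assumptions, not Thunderbird Assist's actual code:

```typescript
// Sketch of splitting one overloaded ask into narrow, batched requests,
// with each email referenced by its Thunderbird message ID so results can
// be joined back to the originals later. All names here are hypothetical.
interface EmailMessage {
  id: string;      // unique message ID, used as an index back into the mailbox
  subject: string;
  body: string;
}

// Chunk recent messages into batches small enough for a single model request.
function buildBatches(messages: EmailMessage[], batchSize: number): EmailMessage[][] {
  const batches: EmailMessage[][] = [];
  for (let i = 0; i < messages.length; i += batchSize) {
    batches.push(messages.slice(i, i + batchSize));
  }
  return batches;
}

// One narrow prompt per task: this one asks only for action items.
function actionItemPrompt(batch: EmailMessage[]): string {
  const listing = batch
    .map((m) => `[${m.id}] Subject: ${m.subject}\n${m.body}`)
    .join("\n---\n");
  return (
    "For each email below, list any action items. " +
    "Refer to emails only by the ID in square brackets.\n\n" + listing
  );
}
```

A second pass over the same batches would then use a different single-purpose prompt (highlights, crucial information), reusing the IDs to fetch the original emails.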
The formatting never turned out, because what I wanted as an application developer, the LLM was not capable of, right? It's a statistical model, and I guess it's thinking that maybe they want the header bolded sometimes and maybe they don't. I'm like, LLM, do what I want. And it's like, who knows? I'll
(33:02):
roll the dice.
Maybe I'll give it to you the way that you want. And so that was an important lesson: currently, you really need to be very careful, specific, and constrained in what you ask the model for. And then the problem turned into,
(33:24):
okay, how long does it take to make these subsequent requests? What can we parallelize? How do we do that effectively without burning up all of Flower's compute in their infrastructure?
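Fanning those follow-up requests out in parallel, without unbounded concurrency against the provider's infrastructure, can be sketched with a small generic helper. This is again an illustrative sketch, not the project's actual code:

```typescript
// Run an async function over a list of items with at most `limit` requests
// in flight at once. The per-item function (e.g. one model request) is
// supplied by the caller and is hypothetical here.
async function mapWithLimit<T, R>(
  items: T[],
  limit: number,
  fn: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  // Each worker repeatedly claims the next unprocessed index. JavaScript is
  // single-threaded, so the claim (`next++`) cannot race between workers.
  async function worker(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await fn(items[i]);
    }
  }
  await Promise.all(
    Array.from({ length: Math.min(limit, items.length) }, worker),
  );
  return results;
}
```

Calling `mapWithLimit(batches, 2, sendImportanceRequest)` would keep at most two batch requests in flight while preserving result order.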
So we started looking for ways to optimize that, and we switched to a local Bayesian classifier. So we'll take the first of
(33:49):
several tasks and we'll just do that locally. So instead of asking an LLM to very subjectively decide what sounds important, or what the cosine similarity algorithm tells it is important, we'll do that locally. We'll let the user use the Thunderbird feature of tagging emails as priority one,
(34:13):
priority two, whatever. So for our experimentation, we had each user tag a handful of messages as highest importance and a handful as least important.
And then the local Bayesian classifier, which we just included as a JavaScript library in the Thunderbird add-on, works very quickly, even for lots and lots of messages. And so,
(34:36):
okay, task number one, taken off the plate of the LLM. And so now we just have it do the rest of the tasks. And likewise, the formatting task, we just handle that ourselves. A quick note about formatting.
For a time, when we were using one of the other third-party cloud providers, we found that you could provide them a JSON
(35:00):
schema that the model would conform to when giving you the response. That was a magical time for me as an application developer, because it's like, oh yeah, give me the JSON. I will just put it through my framework, and it's just going to render the things beautifully. Like, look at my CSS applied so perfectly to this. This is amazing.
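That structured-output technique can be sketched as follows. The schema is ordinary JSON Schema, but how you attach it to a request varies by provider, so this sketch shows only the schema plus the defensive parsing an application might do on the response; the field names are hypothetical:

```typescript
// Sketch of constraining model output with a JSON Schema and validating the
// response before rendering. The shape of a "daily brief" item is invented
// for illustration.
const briefSchema = {
  type: "object",
  required: ["messageId", "summary", "actionItems"],
  properties: {
    messageId: { type: "string" },
    summary: { type: "string" },
    actionItems: { type: "array", items: { type: "string" } },
  },
} as const;

interface BriefItem {
  messageId: string;
  summary: string;
  actionItems: string[];
}

// Defensive parse: even schema-constrained models can drift, so check the
// required keys before handing the result to the rendering layer.
function parseBriefItem(raw: string): BriefItem | null {
  try {
    const data = JSON.parse(raw);
    for (const key of briefSchema.required) {
      if (!(key in data)) return null;
    }
    return data as BriefItem;
  } catch {
    return null;
  }
}
```

The payoff is exactly what's described above: validated JSON goes straight into the UI framework, and formatting stays deterministic on the application side.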
(35:22):
As we were model hopping and provider hopping, that kind of went away. And we haven't returned to it, because at some point, learning our lesson from before about splitting up tasks, we realized that we need to take a different approach for the daily brief. And this is
(35:43):
when I got it in my head that, okay, the future of this feature is not to just keep sequentially prompting the same language model: hey, now do this, now do this. Instead, I think of it like, you know, you go into a professional chef's kitchen, and you don't see one giant tool that
(36:05):
can slice, dice, microwave, and air-fry on top of that. You see lots of little dedicated tools; in expert hands, each does that one thing, and it's going to do it the best.
So one of the things I have written down in my mad science notebook is to explore: well, what if we
(36:26):
could dedicate some small models to specific tasks and then coordinate them in some much more deterministic way? So, all that to say, the daily brief currently has been kind of sidelined. And you know, we're like, okay, shipping Assist means more or less summarization
(36:48):
and reply generation. And the daily brief, we still need to work on, because again, the approach needs to be more granular and more deterministic.
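The local Bayesian classifier mentioned earlier, trained on a handful of user-tagged messages and run entirely on the user's machine, can be sketched as a tiny naive Bayes text classifier. The Thunderbird experiment used an off-the-shelf JavaScript library; this minimal version is only illustrative:

```typescript
// Minimal naive Bayes sketch of the local importance classifier idea:
// learn word frequencies from a few user-tagged emails, then score new
// messages locally, with no model inference or network call involved.
type Label = "important" | "unimportant";

class TinyBayes {
  private counts: Record<Label, Map<string, number>> = {
    important: new Map(),
    unimportant: new Map(),
  };
  private totals: Record<Label, number> = { important: 0, unimportant: 0 };
  private docs: Record<Label, number> = { important: 0, unimportant: 0 };

  private tokenize(text: string): string[] {
    return text.toLowerCase().match(/[a-z']+/g) ?? [];
  }

  train(text: string, label: Label): void {
    this.docs[label] += 1;
    for (const word of this.tokenize(text)) {
      this.counts[label].set(word, (this.counts[label].get(word) ?? 0) + 1);
      this.totals[label] += 1;
    }
  }

  classify(text: string): Label {
    const labels: Label[] = ["important", "unimportant"];
    const totalDocs = this.docs.important + this.docs.unimportant;
    let best: Label = "unimportant";
    let bestScore = -Infinity;
    for (const label of labels) {
      // Log-space to avoid underflow; +1 Laplace smoothing for unseen words.
      // The 1000 in the denominator is a crude stand-in for vocabulary size.
      let score = Math.log((this.docs[label] + 1) / (totalDocs + labels.length));
      for (const word of this.tokenize(text)) {
        const count = this.counts[label].get(word) ?? 0;
        score += Math.log((count + 1) / (this.totals[label] + 1000));
      }
      if (score > bestScore) {
        bestScore = score;
        best = label;
      }
    }
    return best;
  }
}
```

Even a sketch this small illustrates why the approach is fast enough to run over an entire inbox: classification is a single pass over each message's tokens.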
Chris Benson (36:59):
I'm kinda curious. As you've taken us through that process, one of the things on my mind is the different, for lack of a better word, audiences within your customer base, different profiles, and how they're responding to the different approaches you've taken as you've devised the strategy forward on Thunderbird. Where are you
(37:20):
seeing more uptake? You know, early in the conversation we talked about a younger generation who may not have grown up with email like we did. And then on the far side of that, you have kind of the corporate world and stuff like that, with a, you know, certain segment of your user base in
(37:42):
different aspects.
I'm just curious how that mayhave how people are receiving
this in those differentcapacities, given the fact that
you have different interests andstuff.
Chris Aquino (37:52):
Yeah. Yeah. Well, the short answer is that our users are fairly homogenous at this point. Our users of Thunderbird Assist are very homogenous. They are all Thunderbird employees.
Gotcha. That helps.
Yeah, it does. It does. And even though this work is open source, it's on GitHub right now.
(38:13):
We haven't released it, you know, for general use because, again, Flower had been tuning the models and making changes to their infrastructure. So they weren't ready to receive, like, a lot of users from all over the world. So within Thunderbird,
(38:36):
there are even different needs. Some people use the individual email summary feature. Okay, so let me back up.
There are three features available in Thunderbird Assist. The first is individual email summarization. And if you aim it at, like, a quoted thread, you could call it a thread
(38:56):
summarizer. There's email reply generation, and then the third one is the daily brief. Now, the different kinds of users fall into two camps.
There are the ones who say, this long thread from the Thunderbird mailing list, I need that summarized because, wow,
(39:17):
that is too long. The other kind of user is the one who gets way too many emails and needs an executive assistant. And that was where, as I just mentioned, that feature is just not going to work very well given the approach that we've taken. But based on what some of the users have requested
(39:43):
in response to using Thunderbird Assist, that's given us some ideas, like, well, what we should be focusing on is things like semantic search. Or, because it is a Thunderbird add-on, it has access to more than email; it has access to the calendar account that's in Thunderbird.
(40:05):
Thunderbird also does task management, and it even pulls in RSS feeds. Which, RSS, it's coming back so strong. I love RSS. I think that this idea of, okay, if we could correlate between these different pools of information,
(40:25):
that could be extremely useful to some users. Which brings us back to the whole, okay, well, we need small dedicated models for each kind of data, because they're going to be formatted very differently.
Daniel Whitenack (40:38):
Well, Chris, I have sort of two, well, an observation and then maybe a clarification. So number one, I love how you described this progression from kind of the one tool to accomplish every task, which is often how people do
(40:58):
think about using these models, down to splitting this out into maybe different models, or maybe just different applications of the same model, but ones that are segmented. This is so often what I recommend to people. It's kind of like when you have a junior developer and they come to you and they're like, I wrote all
(41:20):
the functionality. It's all in this one function, thousands of lines of code.
You're like, okay, we need to split this up. In this AI world, there's that same need for splitting up, and of course it makes things more testable and all of that as well. So thanks for highlighting that. I think that's a really, really
(41:40):
practical and good point. For the clarification, I just wanna make sure that people picked up on it: you kind of referenced some of this work with Flower, and we talked about that remote inference.
If I'm understanding right, because you're running a local application, the data that flows to that remote inference is
(42:01):
encrypted on the device. So it's encrypted in transit. And then, if I'm understanding Flower's implementation, you can correct me if I'm wrong, that would only be decrypted in a confidential enclave in the inference infrastructure. So that's why you say that even Flower, even if this is running
(42:22):
in their infrastructure, would not be able to tell you what a prompt is. Did I pick up on that somewhat in the right vein?
Chris Aquino (42:29):
Absolutely, that is totally correct. For any web developers or, you know, anybody who has had to write software that interacts with an API, you're probably communicating over something called HTTPS, which is the baseline amount of encryption that we want. It's gonna encrypt the traffic between your
(42:52):
browser, the client, and then the server. They take it a step further. There's what I would call a three-part process for making sure that your data is protected.
So first off, let's say you're logged in. Right? You log in to your Thunderbird account, which we created
(43:13):
specifically for Assist and some other services, which I will talk about a bit later. Then you are issued an API authentication token from Flower itself. Right? You're logged in.
You're now going to talk to Flower. Flower's like, yep, you logged in through Thunderbird. Cool. Here's your authentication token. Use this now to exchange public keys with yet another
(43:40):
server, and that server does nothing besides run the language model.
And at that point, as you observed, any traffic between your client, between Thunderbird Assist specifically, and the machine running the model is all encrypted in
(44:02):
between. So you're double protected, I guess: HTTPS plus the public key encryption between you and the model server.
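The three-part handshake described above can be sketched in a few lines. To be clear, this is a toy illustration, not Flower's actual wire protocol: it stands in a classic Diffie-Hellman exchange for the public-key step and a SHA-256 counter keystream for a real authenticated cipher, and every name and parameter here is made up for the sketch.

```python
import hashlib
import secrets

# Toy Diffie-Hellman parameters. Illustrative only: a real deployment
# would use a vetted group or curve from an audited crypto library.
P = 2**521 - 1   # a Mersenne prime, fine for a sketch, NOT for production
G = 5

def keypair():
    priv = secrets.randbelow(P - 2) + 1
    return priv, pow(G, priv, P)

def shared_key(my_priv, their_pub):
    # Both sides derive the same 32-byte symmetric key from the DH secret.
    secret = pow(their_pub, my_priv, P)
    return hashlib.sha256(secret.to_bytes(66, "big")).digest()

def xor_cipher(key, data):
    # Toy symmetric cipher: SHA-256 counter keystream XORed with the data.
    stream = b""
    for counter in range(-(-len(data) // 32)):  # ceil(len/32) blocks
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
    return bytes(a ^ b for a, b in zip(data, stream))

# Part 1 (not shown): log in, receive an API token, and use it to
# authorize the public-key exchange below.
client_priv, client_pub = keypair()   # the Thunderbird Assist side
server_priv, server_pub = keypair()   # the dedicated model-server side

# Part 2: exchange public keys; each side derives the same session key.
k_client = shared_key(client_priv, server_pub)
k_server = shared_key(server_priv, client_pub)
assert k_client == k_server

# Part 3: the prompt travels encrypted; only the model server's side
# (the enclave) holds a key that can decrypt it.
prompt = b"Summarize this thread for me"
wire = xor_cipher(k_client, prompt)
assert wire != prompt
assert xor_cipher(k_server, wire) == prompt
```

The point of the sketch is the layering: this key exchange happens on top of HTTPS, so even the infrastructure terminating the TLS connection never sees the plaintext prompt.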
Daniel Whitenack (44:14):
That's great. Yeah. And I think this is a great way to maybe expand people's thought process around what's possible with privacy and LLMs, and how that can be split up between where the LLM is running, whether that's local or not, or both. So yeah, appreciate you going into a few of those
(44:35):
details. I think it's really helpful.
As we do get closer to the end here, I would love to ask you to close us out by thinking about the future, now that you've run these experiments and gone through this process. I love how we went through the
(44:59):
story of how this developed. Now that you've gone through that process, as you look towards the future, what excites you about where things are at now and where they're headed, with this project or maybe in terms of the wider ecosystem that you're now a part of, using this tooling around
(45:19):
remote confidential inference and that sort of thing?
Chris Aquino (45:23):
Yeah, there are a lot of exciting directions that we could take this work. Again, this was sort of an initial experiment, but we are planning on shipping this with what we're calling Thunderbird Pro, which is a suite of services. My other web developer teammates are working on other things,
(45:44):
like a web-based scheduling application. There's an end-to-end encrypted file-sending application. And there's ThunderMail.
I'm just going to say that again. ThunderMail, which is our very own email service. Okay, so one of the things that could be
(46:08):
very interesting, and perhaps even take advantage of federated learning, thanks Flower, is if you could treat the email server, or another machine that is co-located with the server, as another client that has access to your encrypted email on ThunderMail. While you're asleep, or
(46:30):
while you were disconnected from the Internet, it could be creating embeddings or doing some other inference based on your email data and then transmitting the learnings to your local machine. Imagine if you could do semantic search without having to generate the
(46:53):
embeddings on your laptop. And you can do it in an offline way, because as far as you're concerned, the embeddings are effectively pre-generated and downloaded along with the messages themselves.
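The client-side half of that idea is cheap once the embeddings arrive with the mail: searching is just similarity math over vectors that were computed elsewhere. A minimal sketch, assuming each message ships with a pre-computed embedding; the vectors and subjects here are invented for illustration.

```python
import math

def cosine(a, b):
    # Standard cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def search(query_vec, messages, top_k=2):
    # messages: (subject, embedding) pairs downloaded alongside the mail,
    # so no model needs to run on the laptop at query time.
    ranked = sorted(messages, key=lambda m: cosine(query_vec, m[1]), reverse=True)
    return [subject for subject, _ in ranked[:top_k]]

# Toy inbox with tiny made-up embeddings.
inbox = [
    ("Team offsite agenda",  [0.9, 0.1, 0.0]),
    ("Invoice #1042",        [0.0, 0.8, 0.2]),
    ("Offsite travel plans", [0.8, 0.2, 0.1]),
]
print(search([1.0, 0.0, 0.0], inbox))  # the two offsite messages rank first
```

In a real client the query vector would itself come from an embedding model, which is exactly the piece the server-side pre-computation is meant to offload; the ranking step above stays fully local and offline.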
The trick there, of course, is doing it in a way that will satisfy, you know, the most staunch privacy advocates. They're like, wait a minute, if you have a server that's in your
(47:13):
infrastructure and has access to my email, then it's really not end-to-end encrypted. So we need to figure out a good solution to that before we can explore it. But some other things that I alluded to earlier involve expanding, and that's such an overloaded word, expanding the context that the model has access to. And I don't mean, like, the context window.
(47:37):
I mean giving it access to your calendar, to your to-dos, your RSS feed. What if we added a notes application to Thunderbird, and then effectively made it possible for Thunderbird to be used as an LLM-assisted personal
(47:58):
knowledge management and communication tool? Whatever that future looks like, that's more exciting to me personally. I'm one of those people who has notes from the last couple of decades that I still keep around, and I would love an LLM to help me sift through that. It
(48:19):
would be even more interesting if, as I'm making a note, it could suggest related documents and ideas that I've had in the past.
Or, I mean, for a lot of users, just helping them stay organized. Because again, there's so much for one tiny human brain to keep track of, and there's just so much
(48:47):
information. So I think that for me, not as, you know, an ML researcher or an AI expert, I'm just an application developer, I want to work on that. I want to build that and make it possible for people to have more control over their information, help them retain their privacy, but, you know, make those creative connections that only they as a human can do.
But an LLM, local or confidential remote compute
(49:13):
assisted, reminding you, like, oh, you wrote this, here are some things that you've written about that are related to that. Or here are some conversations you had in email or in chat. And then for you, the user, as just a regular squishy-brained human, you're like, oh, I just had this weird random flash of insight
(49:34):
based on this constellation of information that I generated over years. I think I like that future of AI. And also, as an application developer, I think that I really want LLMs to be more deterministic.
Like, it's so weird to call an API with the same data and get
(49:56):
very different results. And we can get into this or not, but I definitely feel like chat is the wrong interface for a lot of tasks.
Daniel Whitenack (50:06):
Yes, thank
you.
Chris Aquino (50:07):
Okay, cool. I just want to make sure I'm in good company. So I've got, I don't know, again, in my little mad science notebook, ideas around, okay, how do you swap in deterministic functions? How do you coordinate the efforts?
And I think something beyond, I mean, maybe I'm describing a
(50:29):
more strict version of MCP, but the fact that your input and your output currently is plain language is a double-edged sword, because the only way to determine if you've got a bad result is for you as a human to evaluate it.
Unless you spin up another language model to verify the
(50:52):
first result. But as a programmer, that feels a lot to me like, oh, I just wrote a function, and the only way to know if my function call was correct is to write another function to check it. Yes. And it just feels wrong.
So yeah, I want discrete inputs and outputs. I
(51:15):
want language models that are small and dedicated to specific tasks. And then I want reusable, shareable ways of wiring them together. I want to create, essentially, workflows of information processing within Thunderbird. So that's me.
(51:35):
I'm the personal knowledge management cheerleader at Thunderbird. That's my new title.
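One way to picture that wiring: each step in the workflow is a small, swappable unit with discrete inputs and outputs, and a plain ordered list, not a chat loop, decides what runs. A minimal sketch; the step functions here are deterministic stand-ins for what could be small dedicated models, and all names are made up.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Step:
    name: str
    run: Callable[[dict], dict]  # reads the shared state, returns updates

def run_pipeline(steps, state):
    # Deterministic coordination: steps run in a fixed order, and each
    # step's output is merged into the shared state for the next one.
    for step in steps:
        state = {**state, **step.run(state)}
    return state

# Stand-ins for small dedicated models, one per task.
summarize = Step("summarize", lambda s: {"summary": s["email"][:40] + "..."})
classify  = Step("classify",  lambda s: {"urgent": "ASAP" in s["email"]})
draft     = Step("draft",     lambda s: {"reply": "On it!" if s["urgent"] else "Noted, thanks."})

result = run_pipeline([summarize, classify, draft],
                      {"email": "Need the report ASAP, please."})
print(result["reply"])  # → On it!
```

Because each step is a plain function of the state, any one of them can be swapped for a model call, unit-tested in isolation, or checked with assertions, which is exactly the testability that a single monolithic prompt gives up.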
Daniel Whitenack (51:41):
That's awesome. Well, that future is one that I could get on board with for sure, after struggling with a lot of the things that you mentioned as well, and also hoping for many of those things. So yeah, thank you so much for sharing this journey and this experimentation that you've been on with Thunderbird. Please keep
(52:04):
up the good work. Give our thanks to the team for inspiring us with a lot of amazing work, and thanks for sharing your insights here with us. Appreciate you taking time.
Chris Aquino (52:14):
Yeah, thank you so
much for having me. I'm really,
really glad that we could make this happen.
Daniel Whitenack (52:20):
Us too. We'll
see you soon.
Chris Aquino (52:21):
All right, thanks.
Jerod (52:30):
All right. That's our show for this week. If you haven't checked out our website, head to practicalai.fm, and be sure to connect with us on LinkedIn, X, or Blue Sky. You'll see us posting insights related to the latest AI developments, and we would love for you to join the conversation. Thanks to our partner Prediction Guard for providing operational support for the show.
Check them out at predictionguard.com. Also,
(52:53):
thanks to Breakmaster Cylinder for the beats, and to you for listening. That's all for now, but you'll hear from us again next week.