Episode Transcript
Available transcripts are automatically generated. Complete accuracy is not guaranteed.
Stephen DeLorme (00:00):
Sarah, a
loving and caring girlfriend.
(00:03):
A loving and caring girlfriend.
She will do dot, dot, dot.
I'm just like, is this the end times, Jordan?
Jordan Bravo (00:08):
Oh yeah, I saw an
article that said if you have an AI
girlfriend and it's not local only, then your AI girlfriend is cheating
on you, which is kind of true.
Stephen DeLorme (00:19):
Yeah, that's so true.
But also, that's a funny joke.
But oh my God, we are the end times.
Jordan Bravo (00:26):
Welcome to the Sovereign
Computing Show, presented by ATL BitLab.
I'm Jordan Bravo, and this is a podcast where we teach you how to
take back control of your devices.
Sovereign Computing means you own your technology, not the other way around.
Stephen DeLorme (00:44):
This episode
is sponsored by ATL BitLab.
ATL BitLab is Atlanta's freedom tech hacker space.
We have coworking desks, conference rooms, event space,
maker tools, and tons of coffee.
There is a very active community here in the lab.
Every Wednesday night is Bitcoin night here in Atlanta.
We also have meetups for cybersecurity, artificial intelligence, decentralized
(01:04):
identity, product design, and more.
We offer day passes and nomad passes for people who need to use the lab only
occasionally, as well as memberships for people who plan to use the lab
more regularly, such as myself.
One of the best things about having a BitLab membership isn't
the amenities, it's the people.
Surrounding yourself with a community helps you learn faster
and helps you build better.
(01:26):
Your creativity becomes amplified when you work in this space;
that's what I think, at least.
If you're interested in becoming a member or supporting this space,
please visit us at atlbitlab.com.
That's A-T-L-B-I-T-L-A-B dot com.
Alright, on to our show.
Jordan Bravo (01:46):
Welcome to the
Sovereign Computing Show.
I'm Jordan Bravo, and I'm here today with Stephen DeLorme.
Stephen DeLorme (01:51):
Sup!
Jordan Bravo (01:53):
We wanna remind you that
you can contact the show by boosting in
via Fountain FM and search for ATL Bitlab.
If you send us a Boostagram with a message and some sats attached, we will read it
on the show, take your feedback, and respond.
If you don't wanna send in a boost with Fountain or any other
(02:14):
method, you can also email us at sovereign@atlbitlab.com and we will read
your feedback and respond to it as well.
There's an article that we want to talk about today, and it is about Smart TVs.
This is a ZDNet article and we will link this in the show notes so you
(02:35):
can go into detail later based on your specific brand of Smart TV.
But there's something called ACR, Automatic Content Recognition.
And this is a tool that tracks your viewing habits that's built
into the Smart TV's software.
And it's on by default.
(02:56):
And so this article talks a little bit about what it is.
Essentially it is, I wanna say, a protocol or a, a standard that Smart TVs use to
collect data about your viewing habits.
And this can be used for ad tracking and surveillance.
(03:16):
And so it's, it's something that is pretty standard nowadays.
I think when most of us hear something like this,
we are not surprised anymore because we just assume every
device that we buy is tracking us.
So Smart TVs are one of the things that we have less control over.
Smart TVs are very much not sovereign computers.
(03:37):
They are completely captured devices, uh, that we have little control of.
So unless you're a real hardware hacking type person who likes to get in there and
tinker with things and flash firmware, which I am not, I'm purely a software
person, so that's beyond my ability and I, I would guess most people's as well.
(03:58):
But if you are just buying an off-the-shelf Smart TV, it
most likely has ACR enabled.
So this ZDNet article goes into details for all the different
manufacturers of the steps that you can take to disable it.
I'm not gonna go through the steps because it's just tedious and it
wouldn't make for, um, good listening.
(04:18):
But if you, Stephen, can you scroll down to the steps and just kind of
show all the brands that they list?
Yeah.
So we have Samsung, LG, Sony, Hisense, TCL, and other Roku-powered TVs.
(04:38):
Okay, so four different brands in this article right here.
Yeah, and so if you have one of those brands, this will give you a
step by step on how to disable it.
It's just another thing to take a little bit more privacy back.
Obviously, we wish, the ideal world would be to have a dumb TV rather than
a Smart TV, where you can then plug in a device you have more control over.
(05:02):
But if you have a Smart TV, it's probably a good idea to go through
and confirm that this is disabled.
And if your TV is a brand that's not on here, my guess is there's still a way to do it.
So use a search engine, ask AI, that's gonna be your friend if you wanna do that.
Stephen, do you have any additional thoughts about this?
Stephen DeLorme (05:23):
Uh, this is creepy.
I mean, I'm actually a little surprised at how this works.
Uh.
I was like kind of perusing the article here and reading about how ACR works.
ACR does this by capturing continuous screenshots and cross-referencing
them with a vast database of media content and advertisements.
ACR can capture and identify up to 7,200 images per hour,
(05:47):
approximately two images per second.
That's wild.
So I mean.
You can have a lot of apps that come, like, pre-installed on these Smart
TVs, like Netflix and YouTube and other content streaming services.
And I was not under any illusion that I wasn't being tracked by
(06:10):
those service providers, right?
Like, you know, YouTube and Netflix are going to try and, you know, record
as much data as they can about your viewing habits and what you watch.
And I guess when you brought this article up, I kind of assumed that
ACR was just some kind of spec for these sorts of services to be able to share
(06:31):
information with marketers or with the TV manufacturer about what is being watched.
That's not what it looks like.
It looks like it's actually just the TV itself is occasionally taking screenshots
of what you're watching and comparing that with a database of other media
content and advertisements to identify the content that you're watching.
Jordan Bravo (06:55):
Yeah, it's almost like
an agreed-upon surveillance API that's
agnostic to the app, and advertisers and, and other people can just hook
into it and glean this information.
Stephen DeLorme (07:09):
I, I, at this point,
I don't even know if it's agreed upon.
I don't know if that would make it better or worse for me, but it seems like these
TVs can just, you know, take a screenshot and record, and that kind of happens at
the level of the TV operating system.
I'm not really sure that there's anything that Netflix or YouTube
or Prime Video could do to stop a screenshot from being taken at the
(07:31):
level of the TV operating system.
Jordan Bravo (07:33):
Yeah, so I
see what you're saying.
So this is less about the apps and more about the OS level and
the, the manufacturer level.
Stephen DeLorme (07:43):
Yeah.
Again, this is my first time reading about this, so I'm not sure, but
that's at least what I'm seeing here.
Uh, that's, that's my understanding based on this, just kind
of skimming this article.
Let me put it this way.
If the TV manufacturer was collaborating with all of the major over-the-top
streaming providers, they wouldn't need to take screenshots every two seconds.
(08:04):
They could just say, "Hey, YouTube, hey, Prime Video, share with me what the
user is watching." The fact that they're having to take a screenshot, or have the
capability to take a screenshot every two seconds and identify what you're
looking at, tells me that it's almost like a separate surveillance apparatus.
Jordan Bravo (08:24):
Yeah, that sounds accurate.
Stephen DeLorme (08:27):
Well, I'm definitely
going to see if this is set up
on my TV and turn this shit off.
Jordan Bravo (08:33):
I do wanna talk
about some takeaways from this in
addition to what the article lists.
So let's think outside the box here and shift the paradigm,
to use some business jargon.
Taking these steps to disable ACR is definitely a good first step.
Let's think about why are we possibly in this situation in the first place?
(08:54):
Well, it's because we have a quote unquote Smart TV with an operating system and
applications that we have no control over.
So the most ideal sovereign computing device when it comes to TVs would be
to have a dumb TV, or a TV without the Smart TV capability built in, you
(09:16):
know, just a regular TV.
For many years, this is what a TV was; it was basically like a monitor.
You just plug in your HDMI or whatever interface, and it just shows, you
know, it, it's literally just piping video data from the source device.
However, now you have a whole little computer, with an operating system and
(09:38):
apps that can be written and developed by companies like, like Netflix, etc. And
so now there's a whole layer of stuff that's going on that we don't really
know much about, above and beyond
just straight hardware and firmware translation of, of data.
And so this, of course, has opened up the door for stuff like this, like
(09:59):
surveillance, at both the operating system level and the application level.
So what can you do about this?
Well, you can opt to not use the Smart capabilities.
It's hard to find TVs these days that don't have Smart capabilities.
In fact, it's probably impossible, but you don't have to use the Smart capabilities.
(10:20):
If you do not connect your Smart TV to the internet, it can never spy on you.
I mean, it can try to spy on you, but it can never send that data anywhere.
So if you wanna keep everything local on your TV, that's the
best way to do it, in my opinion.
And what you do is you just treat it like a good old-fashioned TV.
You are going to plug in HDMI with whatever source you're using,
(10:45):
and then let the TV just be a TV.
Now if you are plugging something in like Apple TV or Roku or that kind of
thing, Amazon Fire Stick, you still will have some surveillance problems
at the app level, but at least the TV itself is not spying on you and doing
these kinds of shenanigans like this ACR, and then we, we can talk about
(11:07):
ways to mitigate the app level, too.
For example, I'm thinking maybe this requires a whole episode in itself
or, or a deep dive in another episode.
So, uh, let's leave it there for now and then we'll get
into this in another episode.
Stephen DeLorme (11:22):
Yeah, that
sounds like a good plan.
So if anybody knows anything more about ACR, maybe if any of the
stuff, conclusions that I jumped to there, were inaccurate, you
can boost in and let us know.
And whatever your concerns may be about the privacy around content
streaming apps, boost in, let us know.
We'll talk about it in the future some more.
Jordan Bravo (11:44):
Don't forget, you can also
email us at sovereign@atlbitlab.com.
Alright, so today's main topic that we're gonna dive into is how
to use AIs and LLMs privately.
LLMs are a booming topic these days, and rightly so.
(12:05):
They enable some really revolutionary productivity.
And so when we use these LLMs, typically we're interacting
with a web app front-end.
Then it's connecting to LLMs running on, on a company's servers, and
if these companies happen... go ahead.
Stephen DeLorme (12:26):
I wanna, sorry, I
wanna jump in real quick and just
define for anyone who, you know, doesn't spend their time in tech all the time.
LLM is a Large Language Model, and basically the technology under the hood
that powers a chatbot like ChatGPT or Claude or something like that.
Jordan Bravo (12:44):
Yes.
Thank you.
And so when people talk about AI in popular culture and the
news, they're typically referring to LLMs, Large Language Models.
There is, I don't know, would, um, would image generation fall under the category
of LLM or is that a separate category?
Stephen DeLorme (13:01):
Separate
category, but related.
In my mind, I categorize it as generative AI.
Jordan Bravo (13:06):
Yeah.
Stephen DeLorme (13:07):
AI that
like generates stuff.
And some of those are things like Large Language Models, which generate,
which is kind of like a text to text.
It's like text input, text output.
And then you have the, like, the art models.
I'm not sure exactly what the technical term for that is.
It might be diffusion model, I think, but that's like text to image.
(13:27):
And you've got a whole variety of different kinds of generative models.
You also have audio to text for, like, transcription, and you have text to
video and you've got image to video.
So there's like a whole wide universe of generative AI models.
But, yeah, definitely I think the one that's most common among like the, the
(13:47):
majority of people, you know, at least in the, the Western world or the tech
adjacent world, is like LLM chatbots.
Jordan Bravo (13:57):
Got it.
Thanks for that.
That, that makes sense.
So you have the broad category of AI and then under that
you might have generative AI.
And then under that, you might have, one would be image generation and
another might be Large Language Models.
Stephen DeLorme (14:13):
Yep.
Jordan Bravo (14:14):
Okay.
So within those two broad categories, LLMs are typically what people think of
when they're thinking of ChatGPT, Grok, I'm blanking on the other ones now.
Google Gemini, Claude. Uh-huh.
Exactly.
And so these are the ones where you're talking into a, a chat interface and
you send a chat, a question or something, it goes to their, the company's servers.
(14:39):
They run it through an LLM and then they send it back and you
get a text response, or maybe it'll write some code for you.
But typically that's, that's what we're all familiar with.
Now, the issue, the problem that we would like to tackle here is how do
we take advantage of productivity that LLMs give us without completely
(15:03):
doxing all of our information or just feeding it to the surveillance machine.
It's worth noting that OpenAI, which is not open, for those
of you who are unaware, the name is, is sort of a misnomer.
Probably intentionally, but that's beside the point.
OpenAI, Google, Microsoft's GitHub Copilot, and I think they have another
(15:27):
LLM, I'm not really remembering.
But the point is, these companies, their whole business model is
collecting as much data as possible and using it to train these LLMs.
And then using it for surveillance, for adware and ad content,
and data sharing and collection.
So all of these things are, are par for the course for these companies.
(15:48):
We should not be surprised that, that they're doing this anymore.
I mean, we've been paying attention to this stuff for long enough.
We should know this.
Something to note is that an ex-chief of the NSA is now on
the board of directors of OpenAI.
So you could see there's this direct government surveillance
to big tech pipeline.
That, again, shouldn't surprise us, but it's still disconcerting to see.
(16:11):
So how do we use AI more privately?
Well, there's kind of two different ways you can go about it.
The first one we're gonna talk about is the maybe easier way that
doesn't require as much technical know-how and setup, and that's gonna
be using more private AI providers.
These providers are not, this is not gonna be a completely trustless approach.
(16:34):
This is just gonna be a trust-minimized approach,
and, and also shifting trust to more privacy-reputable sources.
So the first one that we wanna talk about is called Maple.
Maple, I think the website is trymaple.ai, and this is a cool project.
(16:55):
It's using a technology called Confidential Computing, and
this is a little bit of trivia:
this was created by the, some of the founders that originally created
Mutiny Wallet, and after Mutiny Wallet was shuttered as a project,
most of the creators of Mutiny, they are now working on this Try Maple, and
(17:17):
Try Maple is actually a, it's an AI chat, just like ChatGPT, for example.
But where it differs is they are using the confidential computing hardware on Amazon.
For the technically minded folks out there,
this is called Amazon Nitro, and it is a trusted or trustless computing platform.
(17:42):
Is that the right term, Stephen?
Stephen DeLorme (17:45):
Uh, I think
some people call it that, I think.
I would call it, um, confidential computing.
I think they're roughly the same thing.
Jordan Bravo (17:52):
Some of you might have heard
the term secure enclave when it comes
to, like, hardware on your, your iPhone or Android phone or something like that.
And this is a similar concept, where they are manufacturing specific hardware to
run cryptography in a way that minimizes the trust of the, of whoever's running it.
(18:15):
So if you wanna run AI on professional hardware, which all of these companies
do so that the user doesn't have to run their own expensive hardware,
you have to run a server.
And the problem is, as a user, how do we know that that server is secure
and not leaking all of the data?
Well, the way that Maple and thisconfidential computing approach works,
(18:38):
is it at a hardware level, it - Again,you're shifting the trust here.
Now, the, the hardware manufacturingprocess has to be trusted, but assuming
that process is not corrupted, whathappens is when it says running on
Amazon servers without confidentialcomputing, you could have, uh, an angry
(18:59):
s or a, a malicious sis admin actuallysee all of the data that's taking
the computing that's taking place.
But with something like confidential computing on, on Amazon's AWS Nitro,
it's essentially a black box to the, to the sysadmin
or whoever's administering it.
But it is provably secure, cryptographically provably running the
(19:23):
same code that, that it's attesting to.
So Stephen has just demonstrated Try Maple or, uh, Maple by asking
about confidential computing.
And is there anything in there that you, you think is useful
that you wanna talk about?
Stephen DeLorme (19:40):
Not particularly.
I think it's a, a deep topic.
Since I think what we've been doing on this show thus far has been kind of
actionable, sovereign computing things,
I don't know that I want to dive too deep into some of these technical
details, because I think your explanation is pretty much sufficient.
(20:02):
As long as, you know, they're, they're running the right hardware
and all of that, then, you know, in theory it's provable that,
you know, the, the compute is indeed confidential.
Personally, I think it's a good way to move forward as a society, because
I would like to see more products built on top of these things, because,
you know, like, not everyone's gonna self-host a file server, right?
(20:25):
And so I think that if you
adopt confidential computing as a startup or as a company or whatever, you know,
you can offer that greater level of privacy to your users, and the users
can actually benefit from, um, massive amounts of compute that are in the cloud.
Right?
Like, it's kind of nice having cloud computers that power a lot of the, uh,
(20:48):
services that we use on a, a daily basis.
And it's kind of like this almost shared resource.
You can benefit from those compute clusters when you need them, without
needing to run that stuff locally.
So, but the trade off is privacy.
So, uh, confidential computing is one of those things to me that just
(21:08):
seems like a no-brainer in terms of being able to move forward.
As far as this service goes, I will say I do like Maple, and I love
Open Secret, the company that, like, makes the technology behind Maple.
Having said that, I think right now the only model that they run, if I'm
not mistaken, is just the, what's it called, Llama 70B, and I, I have asked
(21:32):
it some complicated questions before and it kind of stumbled a little bit.
If you are used to using GPT-4o or like GPT-o1, you might find the Llama model a
little bit lacking, or if you're, you're a fan of the Claude models and all that,
you might find it a little bit lacking.
Having said that, you know, this is kind of a proof of concept of Open Secret.
(21:56):
If you're asking this thing, you know, less complicated and less mathematical
questions, I think Maple will totally get the job done for you, and there's
nothing to prevent the, the Try Maple project from hosting DeepSeek or other
more advanced models in, in the future.
So as a proof of concept of running an LLM in a confidential compute
environment, it basically works.
Jordan Bravo (22:17):
Yeah, that's a good point.
It's a new project, so they are still, uh, they're not feature complete.
They're gonna add more in the future, but it is nice to see them doing this.
It's fully open source, and so it's a great project to keep an eye on.
The next third-party LLM provider that's more privacy focused that we
(22:39):
wanna talk about is Venice, Venice.AI.
And this one was started by Erik Voorhees, who is, uh, fairly well known
in the Bitcoin and crypto community.
And this one is, they focus on privacy, um, with, there's a, a UX trade-off.
(23:02):
And what I mean by that is they do not sync your chat history.
So if you were to open this in a browser on one computer and sign in
and, and, uh, you know, do some chats, and then you sign in on a different
browser, on a different computer, it's not gonna sync that chat history over.
(23:25):
Uh, but the upside is that your chats and your, your history is not being
sent to Venice, or at least it's not being saved in any meaningful way.
Now again, we have to trust that, that they're actually doing
what they say they're doing.
Their whole, uh, business model is to be a private LLM provider.
(23:46):
And so they're, it's more trustworthy, I would say, than one
of these other companies where they make no promises about privacy.
In fact, there, their terms of service are quite the opposite.
One of the cool things about Venice is that they have, uh, a bunch of
models that you could see, and I've actually paid for a pro membership.
(24:08):
I just wanted to try it out because you can pay with Bitcoin, and that's
another way in which you're really getting a lot of privacy, because when
you pay for ChatGPT or Claude or any of these, you have to use a credit
card or a debit card, and that's inevitably tied to your account.
You can use something like privacy.com to have a layer of,
(24:29):
of, uh, privacy between your bank account and one of these companies.
But ultimately there is a link there.
And so when you pay with Bitcoin on Venice.AI, for example, that's,
that to me is a huge selling point.
Stephen DeLorme (24:47):
It looks like they
also have image generation built
in here too, which is kind of cool.
Jordan Bravo (24:51):
They do
have image generation.
We're gonna talk about another open source way to do image generation, but
this would be competitive with something like, I'm blanking on the name now.
What, what's the big image, image generation service?
Stephen DeLorme (25:08):
Uh, well, MidJourney
is, uh, one of my faves, if
Jordan Bravo (25:12):
That's the
one I was thinking of.
MidJourney.
Stephen DeLorme (25:15):
There you go though.
That was actually a very snappy image generation.
I asked it to make a cypherpunk hackery hacker, soldering his own circuit
board because he doesn't trust Intel.
And we got this guy in a black shirt.
He's not soldering, he's really using like one of them skinny screwdrivers,
but he's kind of, uh, peering up, peering at us over the rim of his glasses.
Like, "I don't trust you. Get the fuck out of my lab."
Jordan Bravo (25:39):
That's actually
kind of a cool image.
I've played around with this before and it's decent.
If you go to the image style, you can see a bunch of cool options.
Stephen DeLorme (25:50):
Oh, this is weird.
Oh, and I get it, because it's all like, um, they're,
they're, they're called Venice.
All of the, uh, examples and the image styles are all these, like,
I don't know, Italian Renaissance-like masquerade ball masks.
Jordan Bravo (26:08):
I think it's called Venice,
for the same reason that Allen Farrington
has that, that post and then book, Bitcoin is Venice, and it's sort of about
the Renaissance era in Venice, where we are now experiencing a new renaissance
with things like Bitcoin and sovereign computing type technology advances.
Stephen DeLorme (26:31):
I was trying
to get it to give it to me in a GTA
style, but whatever, uh, I won't waste your time playing with this.
Jordan Bravo (26:39):
Yeah, no, this is fun.
I, I definitely recommend people take a look at it and try it out.
I've used it to some good effect and I, it feels good knowing that my
searches and history are more private.
But like you said, these open source models, while competitive in some
respects with the closed source ones, they're not making gazillions of
(27:00):
dollars, and so you are often seeing the features and the UX lag behind ChatGPT
and Claude and that kind of thing.
Stephen DeLorme (27:09):
I gotta say
though, this one actually
seems like a, a competitor.
I mean, definitely, uh, it's, this isn't quite MidJourney level, but, uh,
th, this one seems to be catching up.
I, I might've missed it when you said earlier, but do they offer
any insight as to how this runs?
Is this also in a confidential compute environment?
Jordan Bravo (27:28):
This is not
using confidential computing.
This is more like they are just proxying it to their own server
and they are promising that they don't log your chat history.
So this is more of a shifting of trust.
But again, they are basing their business model on "we are providing
(27:48):
privacy"; that's why you're paying us or using our service in the first place.
And so that, that's kind of what you're going on there.
And then the other thing, like I mentioned,
this is all, your whole history and, and stuff, is in local storage.
So, uh, ostensibly, your, your account is not... your chat history specifically is
not being sent or saved on the servers.
Stephen DeLorme (28:10):
Got it.
Okay.
That makes sense.
Oh yeah.
Different trust model than with Maple.
Jordan Bravo (28:15):
Yep.
The next service I wanted to mention is called Kagi.
K-A-G-I.
And Kagi is a newer search engine, that is their main product.
And they are a, they advertise themselves as a privacy-first search engine that you
actually pay for to get a premium search.
(28:37):
But they also have an AI assistant.
And so if you, you buy their pro model, their pro subscription,
you get access to the Kagi AI.
And while I have not used this one personally,
I've heard a lot of good things about it and, supposedly, it's, it's got some
good UX and a pretty powerful model.
Stephen DeLorme (28:57):
So they can use
all the, looks like all of the
Claude models, a bunch of OpenAI models, Mistral, Google, Meta,
Alibaba, and DeepSeek of course, uh.
Looks like they have a lot of different, uh, LLMs available.
And, and just on that note, I mean, one thing worth mentioning here
with Kagi, and I'm trying to see if this applies to Venice as well.
(29:21):
Um, so when it comes to something like Venice, you know, it looks
like they're running Llama, Mistral, Dolphin, DeepSeek, so they're probably
hosting all those models on their own.
We go to Kagi, you know, and we can see that they have LLMs
(29:41):
available in the assistant.
You know, some of these they might be self-hosting, like perhaps the
DeepSeek or the Llama models, but for the Anthropic and OpenAI models,
and, I don't know about Google Gemini, but maybe that one as well.
But definitely with the Anthropic and OpenAI ones, they're probably just
using the API for those services.
(30:02):
So your data is still, if you use Kagi and if you choose an Anthropic or
OpenAI model, your data is still being sent back to Anthropic and OpenAI.
Having said that, it may be that Kagi doesn't identify all of your
separate chats to those services.
(30:23):
So for example, if I were to use GPT-4o from OpenAI inside of
Kagi, and I had 10 separate chats,
OpenAI might see that as like 10 different, like, completely different users.
Or it might see that data just, like, all mixed in with, like, all of Kagi's requests.
Right?
Whereas if I went through OpenAI directly, I would of course have an account
(30:44):
that I log into, and I would have like 10 different chat histories with ChatGPT.
So it's good.
I think it's great, but just throwing that out there that, like, if you do
use this service and you do choose one of these proprietary models, you're
still sending some data back to them.
So word of caution there.
Jordan Bravo (31:03):
That's a great point.
This is a spectrum here, right?
If you could think of using OpenAI or Anthropic directly through
them, that's probably the least private, on one end of the spectrum.
And then if you use Kagi, then Kagi is acting kind of like a proxy
between you and Anthropic or OpenAI.
(31:23):
That would be a little bit more private because, like you said, it's sort of
like a VPN provider, where the VPN provider has a whole bunch
of users and all of their traffic is sort of mixed together and anonymized.
And in the same way here, you might think of all of Kagi's users going to
Anthropic, let's say, uh, they, all of that traffic is gonna look like
(31:45):
it's coming from Kagi versus you as an individual, and so all of your searches
and chat history is not gonna be tied together to your individual account.
But like Stephen was saying, your chat history, your, the things you send
to it are still gonna be on Anthropic servers, on OpenAI servers, because by
(32:06):
necessity it has to be processed there.
Stephen DeLorme (32:09):
Yeah, and if
you, and you know, depending...
and so it's like you just
have to be careful, depending on what your privacy threat model is, that you
don't dox yourself with your question.
So if you're asking something like, "What is a good JavaScript library
for doing XYZ thing?" It's like, well, that might be a little more innocuous.
You know, there's a lot of JavaScript developers out there in the world, right?
(32:31):
But if you ask for, like, something like, "What are good, uh, restaurant options
near 123 Main Street, Townsville, USA?"
It's like, well, you've just given out your address right there.
So you've kind of, potentially identified yourself.
Jordan Bravo (32:47):
Yep, exactly.
And so if we keep moving on that spectrum from least sovereign
computing to more sovereign computing,
the next one I would say after Kagi would be something like Venice, which
we, uh, we just talked about, where they are running their own open source models
on their server and they are purposely not logging your history and doing data
(33:11):
collection on it, but in the end, you are still having to send data to that server.
I would say maybe slightly more along the sovereign computing spectrum would be
Maple, that we talked about first, because they are doing the same thing where you
have to send data to them, but they're using this confidential computing hardware
(33:32):
and software in order to run it in a way that minimizes trust on their part.
Then let's talk now about moving even further along that spectrum to more
private and sovereign computing with LLMs, and that's gonna be running
models locally or self-hosting.
I kind of say those as two different things because you can run a
(33:53):
model locally where your client lives.
So for example, you could just run it on your laptop, where your laptop
is gonna have both the interface that you type into and also the model.
But then if you wanna get really advanced with it or, or really, um,
if you want to go more into an advanced setup, you could self-host
(34:15):
the model on a server and then connect to it from any client,
so your laptop or your mobile device, et cetera.
So the first thing we're gonna look at is called Ollama.
Ollama is a program that you can download on Mac, Windows, or Linux.
And once it's downloaded, it's got an interface where you can
select any open source model.
(34:37):
And Stephen, go ahead and click on the models.
Let's take a look at all the options that they have.
So they've got things like DeepSeek, Llama, Mistral, and these are all of
the well-known large language models that are open source and, um, fully
available, free and open source.
(35:00):
Once you have Ollama installed, you can search through and download
any of these models and it's gonna run completely locally.
And then, uh, I believe it's just a CLI when you first download it, but we're
gonna show in a moment here another thing you can run that will give you
a GUI interface in your browser, the same way you would expect from
(35:23):
ChatGPT or one of these other options.
Stephen DeLorme (35:27):
And I'll go ahead
and say that even the CLI was,
I thought, pretty easy to use.
Having said that, I'm a little more comfortable with
CLI than the average person.
But as far as CLIs go, this one was pretty easy, I thought.
When it comes to downloading models, there's a popular website
you can go to called Hugging Face.
(35:48):
And Hugging Face is, I don't know how to describe it.
Like
Jordan Bravo (35:53):
They're, they're kind of
like the GitHub for open source models.
Stephen DeLorme (35:57):
That's
exactly what I was gonna say.
It's like the GitHub for open source models.
The problem is, is that whenever I go and find a model that I want to use,
I have no idea how to start using it, because I'm just not one of these,
like, crack machine learning engineers.
They're always like, "Oh, here's instructions on how to use this model,"
and then they give you, like, a bunch of, I don't know, Python commands.
(36:20):
Well, this one's in Chinese because it's DeepSeek, but, but they'll,
they'll just give you, like... there's no consistent way to just install all
these models, at least that I found.
I found it very, very confusing.
I'm used to the web development world where everything is, you know, run either
through a package manager like, you know,
NPM or, back in the day, Composer.
(36:41):
I, I just don't know how to work with any of the instructions on Hugging Face.
So, Ollama to me is just like a dream, because once you install it on your
computer, you can just, like, open up your terminal on Linux or Mac and you can
just type ollama list and it'll list out all the models you have installed.
And you can say ollama install and just type in the name of
the model and it'll install it.
(37:02):
And it'll download it and run it.
You just type ollama run and the model name, and it starts up a chat, and the
chat is entirely local to your computer.
So yeah, I thought the whole thing was, just, a dream to use.
Like, uh, Ollama made it very, very quick and easy for me to get into running
models locally, where Hugging Face was an obstacle to someone of my skillset.
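For reference, here is a minimal sketch of the same local workflow from code, using Ollama's Python client rather than the CLI. It assumes the `ollama` package is installed and the Ollama service is running on the same machine; the model name is illustrative, and response field names can vary slightly between client versions, so treat this as a sketch rather than definitive usage.

```python
# Minimal local chat via the Ollama Python client (assumes `pip install ollama`
# and that the Ollama service is running locally).
import ollama

MODEL = "llama3.1:8b"  # illustrative; any model from the Ollama library works

# Download the model if it isn't already installed (CLI equivalent: `ollama pull`).
ollama.pull(MODEL)

# Chat entirely on your own machine; nothing here leaves your computer.
reply = ollama.chat(
    model=MODEL,
    messages=[{"role": "user", "content": "Give me three tips for using local LLMs."}],
)
print(reply["message"]["content"])  # field names may differ slightly by client version
```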
Jordan Bravo (37:25):
You said something
that I think is key and worth
repeating, which is: this runs completely locally on your machine.
So if you turned on airplane mode, or, or just, you know, disconnected
from the internet completely, you would still run this and get results.
This is not living on any other server and it's not reaching
out to any other computer.
Stephen DeLorme (37:45):
That's
actually a fun exercise.
Pair programming with an AI in a local, local environment, uh, on an airplane.
Jordan Bravo (37:54):
Exactly, so maybe you're
on an airplane with your laptop and
for whatever reason, they don't have internet that you can connect to.
You could just use a, a local LLM and use it like that.
Now, as Stephen mentioned, you have to interact with
this via the command line.
And for somebody who doesn't wanna do that, there's a
(38:17):
great option called OpenWebUI.
And you could think of this as the graphical counterpart to Ollama.
What you would do is you run this and it runs in a browser, and then you, it talks
to Ollama on the backend, and so in the same way it is all running locally, but
you're getting that, that web browser GUI interface that you would expect
(38:41):
from one of these other offerings.
The cool thing about this is that you can decouple the front-end from the back end.
So if you wanted to run Ollama on your server, which is always on,
you could then connect to it from, um, any device remotely, so you could
connect to it from your laptop or mobile.
And boom, now you have a fully mobile, totally private and
(39:06):
self-hosted LLM experience.
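Because Ollama exposes a plain HTTP API (on port 11434 by default), the decoupled setup Jordan describes is mostly a matter of pointing a client at that endpoint over your LAN or a VPN. A rough sketch using the `requests` library; the hostname and model name are placeholders, and the request shape follows Ollama's documented /api/chat format.

```python
# Query an Ollama back end running on another machine (e.g. an always-on home server).
# Assumes Ollama is listening on its default port 11434 and is reachable from here;
# the hostname and model name below are placeholders.
import requests

OLLAMA_URL = "http://my-home-server:11434"  # hypothetical hostname on your LAN/VPN
MODEL = "llama3.1:8b"

resp = requests.post(
    f"{OLLAMA_URL}/api/chat",
    json={
        "model": MODEL,
        "messages": [{"role": "user", "content": "Summarize what ACR on smart TVs does."}],
        "stream": False,  # one JSON reply instead of a token stream
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["message"]["content"])
```

OpenWebUI is doing essentially the same thing behind its interface: you give it the base URL of your Ollama instance and it layers chats, history, and user accounts on top of that API.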
Stephen DeLorme (39:09):
I somehow found myself
on the OpenWebUI homepage and it, I
mean, this looks like a really great UI.
And I'm seeing the section here that says top models, and, uh, you know, number
one is Based Dolphin Mixtral, Dolphin models with a special system prompt.
That's fine.
Number two is Codewriter
That's fine.
Number six, Sarah, a loving and caring girlfriend.
(39:32):
Uh, a loving and caring girlfriend.
She will do dot, dot, dot.
I'm just like, is this the end times, Jordan?
Jordan Bravo (39:38):
Oh yeah, I saw an
article that said if you have an AI
girlfriend and it's not local only, then your AI girlfriend is cheating
on you, which is kind of true.
Stephen DeLorme (39:49):
Yeah, that's so true.
But also, that's a funny joke.
But oh my God, we are the end times.
Jordan Bravo (39:56):
Yeah.
Stephen DeLorme (39:57):
Back to the topic.
Yes, this looks really cool.
I mean, you know, they have a, a fun looking little website.
Um.
They have a white paper.
Everybody's got a white paper these days.
Um, but, uh, you know, that's boring.
I'm not gonna go through that.
Um, but yeah, the, the screenshots of this look really nice.
I remember you, um, demoing this to me several months ago and it
(40:19):
looked good, and it looks like it has grown as a project since then.
It's a really simple looking interface that reminds me of ChatGPT.
It doesn't present any challenging user experience.
So, um, it looks cool, and, again, this gets into web hosting and all of that kind
of stuff, so I, I think to use this, you, um, kind of need to, you know, be a little
(40:45):
bit comfortable deploying something.
That being said, yeah, you could probably deploy this to something like Netlify
or Vercel in a, a, you know, somewhat automated way that wouldn't really
require you to manage a server directly.
But yeah, if you did, if you do have the capabilities to self-host both Ollama
and this UI, uh, either at your home or on a server, you know, you could
(41:06):
provide like a, a local chatbot for your friends and family, something like that.
Jordan Bravo (41:12):
And I know that StartOS
by Start9, they have a, a one-click
install for, it's called GPT, uh, Free GPT, I believe is what it's called.
But anyway, if you just look in the StartOS marketplace, they have a, an LLM you
can run locally on your Start9 server.
Stephen DeLorme (41:32):
Very cool.
Jordan Bravo (41:35):
Uh, the next
one I wanna talk about for running a local model is GPT4All.
And this one is similar to Ollama in that you download the app for Mac,
Windows, or Linux, and then you can choose which models you wanna run.
And I think where this one differs a little bit, I could be wrong, but this
(41:56):
one comes with a graphical interface.
Do you know if that's correct?
Stephen DeLorme (42:03):
You know.
I'm actually not familiar with this one.
I'm really not.
I have heard the name Nomic before.
It's like, the company logo that's on here, but I don't
really know much about them.
So yeah, I'm not familiar with this one.
Jordan Bravo (42:18):
Okay.
Well, for the audience, let's just say that this one is worth looking at.
Um, ah, okay.
It looks like
Stephen DeLorme (42:26):
it does
come with a quickstart page.
Mm-hmm.
It says install GPT4All for your operating system and open the application.
You can download it for Windows, for Mac, or for Linux.
And then, yeah, it's got this nice little UI.
It's, uh, light mode with some, uh, green garnishes, and there's a tab
where I can manage all my chats.
And then there's another tab where I can choose my models and all that.
(42:48):
I'm really kind of curious what the local docs thing is, um, but it looks cool.
Jordan Bravo (42:56):
Yeah, so this
is another great option.
I've been meaning to try this one out, but, uh, it looks, it looks great.
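For anyone who wants to script GPT4All rather than use the desktop app, Nomic also publishes Python bindings. A minimal sketch, assuming the `gpt4all` package is installed; the model file name is illustrative and is downloaded on first use if it isn't already on disk.

```python
# Minimal local generation with the GPT4All Python bindings (assumes `pip install gpt4all`).
from gpt4all import GPT4All

# Illustrative model file; GPT4All downloads it on first use if it isn't present.
model = GPT4All("Meta-Llama-3-8B-Instruct.Q4_0.gguf")

# chat_session keeps conversational context between prompts; everything stays local.
with model.chat_session():
    print(model.generate("Explain what a quantized model is, briefly.", max_tokens=200))
```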
The next one we're gonna talk about for local LLMs is LM
Studio, Language Model Studio.
And this one also has its own graphical interface that you can download for
(43:16):
Mac, Windows, and Linux, and you can get started right outta the box with it.
Um, Stephen, do you have any experience with this one?
Stephen DeLorme (43:24):
Yeah,
I've run this one before.
It's, uh, it's pretty handy, because it has, like, the downloading of the models and
the running of the models all bundled into one interface, which makes it pretty
accessible and easy if you want to be able to experiment with running these models.
(43:44):
This is a word of caution, not just for LM Studio, but really for Ollama and all,
all these projects, is that the easier it is to download the models, the more
you just want to experiment and run them, and then, you know, you start to max out
your hard drive space pretty quickly.
So, yeah, just be careful with that, make sure you have plenty of room on
(44:07):
your hard drive before you click download, 'cause it's easy to get trigger happy.
Some of these things will be, like, miniature models that are
like two gigabytes, and other ones will be like 70 gigabytes.
At one point I was trying to download the, like, the most hardcore uber DeepSeek
model, and it ended up being like a 70 gigabyte download, and I had enough
(44:27):
room on my hard drive, and then I was like, "Wait a minute, this computer
only has like 32GB of RAM. There's, like, I don't even think there's any
way that I can actually run this model, 'cause there's, like, there's not enough
room in RAM for this entire model." I think that's how that stuff works.
So yeah, the, these tools are really great and they make it way easier to download,
but, uh, you also gotta make sure you have the system resources to back it up.
(44:50):
But yeah, I'd vouch for LM Studio, it's a pretty cool, pretty cool project.
It's easy to use.
Jordan Bravo (44:55):
I guess that's a good time
to point out the downsides of running
your LLMs locally, which is you have tohave, uh, or, or let me say it this way.
Your performance and your userexperience is gonna be very
dependent on your hardware.
So when you are accessing it on someother companies, servers, they're
(45:17):
running absurdly expensive hardware to be able to process all of these huge,
high parameter LLMs, which is gonna give you the most accurate results.
But when you're running them locally, typically people are running
it on their laptop or desktop.
And unless you have some seriously powered, dedicated GPUs, you're
(45:39):
not gonna see great performance.
So just be aware that you are gonna see some slowdown if you try
to run these really large models.
However, as you saw when we were scrolling through the Ollama models,
there are much smaller models that you can use that are specifically
designed to run on consumer hardware.
And so you might get, let's say, 90% of the effectiveness of those larger models,
(46:02):
with much lower resource requirements.
So give those a try and play around with the right balance of model size
and accuracy and performance, and I think you'll find a good middle ground.
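As a rough rule of thumb for the hardware question, a model's weights need on the order of (parameter count) times (bytes per parameter) of memory, plus some overhead for the runtime and context. The numbers below are only a back-of-the-envelope sketch, but they show why Stephen's roughly 70-gigabyte DeepSeek download was never going to fit in 32GB of RAM, while a 4-bit 8B model runs comfortably on a laptop.

```python
# Back-of-the-envelope memory estimate for running a local model.
# Approximation only: real usage also depends on quantization format,
# context length, and runtime overhead.
def estimate_ram_gb(params_billion: float, bits_per_param: int, overhead_gb: float = 2.0) -> float:
    weights_gb = params_billion * (bits_per_param / 8)  # billions of params * bytes each = GB
    return weights_gb + overhead_gb

for params, bits in [(8, 4), (8, 16), (70, 4), (70, 16)]:
    print(f"{params}B parameters at {bits}-bit: ~{estimate_ram_gb(params, bits):.0f} GB")

# ~6 GB   -> 8B at 4-bit fits on most laptops
# ~18 GB  -> 8B at 16-bit is already tight on a 16 GB machine
# ~37 GB  -> 70B at 4-bit exceeds 32 GB of RAM
# ~142 GB -> 70B at 16-bit is server territory
```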
Stephen DeLorme (46:15):
Yep.
Jordan Bravo (46:15):
Okay, one last
thing we want to discuss.
We already covered LLMs for chat-based interfaces, but we
wanna talk about image generation.
We mentioned this briefly before, that Venice.AI can do it, and
then you have these more proprietary offerings such as, um, again, I'm
blanking on the, the main one.
(46:36):
Stephen, if you could help me out there.
Stephen DeLorme (46:39):
Oh, like
for image generation?
Yeah.
I mean, you've got like, MidJourney
Jordan Bravo (46:41):
Mm-hmm.
Stephen DeLorme (46:43):
You know, Scenario
and all those kinds of companies,
Jordan Bravo (46:47):
Yes, exactly.
But you might've heard about Stable Diffusion being an open
source model for image generation.
Well, there's a tool called Stable Diffusion Web UI that's made by the
same company that makes the Stable Diffusion models, and you can host this
and it allows you to interact with your locally hosted Stable Diffusion model in
(47:11):
the same way that we talked about with
the OpenWebUI interacting with your local LLM, and this will give you that
same experience of being able to type into a web UI, telling the LLM what
kind of image you wanna generate, and it'll spit out an image for you.
And so for those of you watching, we have a screenshot here on the screen,
(47:34):
and this is showing how you can tweak the different parameters and the different
inputs to what you want, and it'll give you different images based on that.
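For the code-minded listener, the same local image generation idea can be reproduced without any web UI using Hugging Face's diffusers library, which downloads an open Stable Diffusion checkpoint and runs it on your own GPU (or, slowly, on CPU). A minimal sketch; the model ID is illustrative, and the checkpoint you choose will change the VRAM requirements and output quality.

```python
# Minimal local text-to-image with an open Stable Diffusion checkpoint.
# Assumes `pip install diffusers transformers accelerate torch`; model ID is illustrative.
import torch
from diffusers import StableDiffusionPipeline

MODEL_ID = "runwayml/stable-diffusion-v1-5"  # illustrative open checkpoint

device = "cuda" if torch.cuda.is_available() else "cpu"
pipe = StableDiffusionPipeline.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16 if device == "cuda" else torch.float32,
)
pipe = pipe.to(device)

prompt = "a cypherpunk hacker soldering a circuit board, dramatic lighting"
image = pipe(prompt, num_inference_steps=30).images[0]
image.save("cypherpunk.png")  # generated entirely on local hardware
```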
Stephen DeLorme (47:45):
Another one
maybe worth highlighting here is ComfyUI.
We actually had a gentleman give a demo of ComfyUI at one of the
AI meetups at ATL BitLab, and
it's still going through post-production, but, uh, that should
be on the ATL BitLab, uh, YouTube.
Uh, I'd actually say you might, depending on what your needs are, you
(48:07):
might actually want to look at this.
Like if you're more of an artist, designer who's used to working in tools like
Adobe Photoshop and After Effects and Blender, Cinema 4D, all that kinda stuff,
this might be a tool you'd want to use.
Because, well, it says right here, customize your workflow with custom nodes,
so it actually has an interface that's more familiar to the graphics software
(48:29):
you might already be familiar with.
So it does let you download models and run them locally.
And then you can create nodes inside of your interface, which is somewhat similar
to how you might structure stuff in, like, Blender or DaVinci Resolve, or, you
know, one of those kinds of programs.
I think even C4D has a node editor now.
(48:50):
And, uh, so yeah, you make up different nodes for, like, you know, run it through
this model and then apply these tweaks and customizations to it and all of that.
So it's kind of trying to bridge the world of, like, the traditional
designer, compositor, effects artist software with the generative art world.
But you know, if you're more on the, the developer, you know, side of things,
(49:13):
I think the, the Stable Diffusion Web UI is also a great fit, because it
kind of exposes to you in, like, one interface, like, all of the different,
like, numeric parameters that make the art model function under the hood.
Jordan Bravo (49:30):
I don't wanna get too
far off topic, but for someone like me
who is not experienced in graphical editing and that kind of thing, what
does a node refer to in this context?
Stephen DeLorme (49:42):
So, well, it'd
probably be easiest if I just
pulled up an image real quick.
Um, that, let's see, where should I go?
Right?
Uh, how do I get to Brave Search?
I'll just pull up an example of, like, Blender, um, uh, node editor maybe.
And like, like,
(50:06):
so an example here is we have this image here that I found, oh, it just went away.
Um, so I have this image up on screen of a, a Blender shader editor.
And basically, like, what we're trying to do is, in this case, the person is trying
to make, like, this cool, like, I don't know, gold texture covered in fungus
(50:26):
or whatever to put on this statue here.
And to do that, they're, like, they're starting here with, like,
some kind of texture, and then they're running it through this
node right here, which changes the color to be more orange or gold.
And then they're running that into this, um, shader thing that
applies things like glossiness and reflectivity or, or lack thereof.
(50:51):
So it's kind of just like a way of structuring the data in this, like, step
by step, like, flow that's very repeatable.
Yeah, we're getting into complicated, like, graphic arts territory here,
but it's basically, like, just kind of a common standard that in a lot of
advanced graphics applications, they have this, like, node based editing system
for getting what you want, where you, you start with inputs on the left hand
(51:12):
side, they go through all the different nodes, and then you end up with a nice
looking output on the right hand side.
Jordan Bravo (51:18):
And for people
listening, this image looks kind
of like a diagram or a flow chart with, with different rectangles
all tied together in various ways.
All right.
Well thanks for that, Stephen.
That's all we have to cover on LLMs and hosting them locally.
If you would like to know more about this topic or anything else AI, ATL BitLab
(51:44):
actually hosts a monthly AI meetup.
Stephen, is there anything you'd like to say about that?
Stephen DeLorme (51:50):
Yeah, it's on
the second Monday of every month
as of the date of this recording.
But always check the ATL BitLab website or Meetup for the most current details.
But it's, uh, definitely one of our hottest meetups right now.
It's kind of a grab bag of different stuff every month, because there's
so much excitement about AI that so many people in the Atlanta area have
(52:11):
projects that they're hacking on, and so it ends up being really fun.
Sometimes, we'll, you know, maybe discuss a paper that just came out.
Other times we'll have people demoing projects, like one time
a guy demoed machine learning, training a pterodactyl skeleton.
Uh, he also had a car driving around on the floor, um, that was powered
(52:31):
by an LLM, and we were typing in commands and telling it to hunt
for a particular object in ATL BitLab.
Uh, another time somebody demoed, um, how to jailbreak Claude. Um, so
yeah, it's pretty, uh, fascinating and you never know what you're gonna
get, but it's always a good time.
Jordan Bravo (52:49):
They can find out
more about that at atlbitlab.com.
Remember, you can also email us at sovereign@atlbitlab.com and look
for The Sovereign Computing Show when you search for ATL BitLab in
Fountain.fm or any other podcast player.
(53:09):
Anything else you wanted to talk about,
Stephen?
Stephen DeLorme (53:12):
I think that's about it.
I might have some errata that, uh, I think of, uh, over the
weekend for, uh, next time.
But, I think I'm good for now.
Jordan Bravo (53:21):
All right.
Let us know what you think of the topic.
Do you host your own LLMs?
Do you use any of these private providers?
Are you using the non-private providers?
We would love to hear aboutit and discuss it more.
Thanks everybody, andwe'll see you next time.
Stephen DeLorme (53:38):
Catch you later!
Hey, thanks for listening.
I hope you enjoyed this episode.
If you want to learn more about anything that we discussed, you can
look for links in the show notes that should be in your podcast player, or
you can go to atlbitlab.com/podcast.
On a final note, if you found this information useful and you
want to help support us, you can always send us a tip in Bitcoin.
(54:02):
Your support really helps us so that we can keep bringing you content like this.
All right.
Catch you later.