May 27, 2025 • 38 mins

👉 Fill out the listener survey - https://services.multiplai.ai/lai-survey
👉 Learn more about the AI Business Transformation Course starting May 12 — spots are limited - http://multiplai.ai/ai-course/ 

AI-generated visuals have become table stakes. But what if you're still playing checkers while others are playing 4D chess?

In this session, we’re breaking open the real world of visual AI — the one most people don’t even know exists.

You’ll see the 3 major paths to creating powerful visuals with AI:
- the conversational path (think: ChatGPT),
- the classic tools (Midjourney, etc.),
- the open-source frontier, where precision, layering, and total control change the game.

Guiding us through this visual odyssey is Luka Tišler — a visual AI educator, workshop leader, and founder of an academy teaching professionals how to wield tools like ComfyUI, ControlNet, and more. Luka doesn’t just play with prompts — he builds pipelines. If you want consistency, control, and pro-level results, he’s your guy.

You’ll walk away understanding the trade-offs between tools, how to make the right choice for your business needs, and why there’s a whole new level of image generation out there waiting for you.

About Leveraging AI

If you’ve enjoyed or benefited from some of the insights of this episode, leave us a five-star review on your favorite podcast platform, and let us know what you learned, found helpful, or liked most about this show!


Episode Transcript

Available transcripts are automatically generated. Complete accuracy is not guaranteed.
Isar Meitis (00:00):
Hello, and welcome to another live episode of the

(00:03):
Leveraging AI Podcast, the podcast that shares practical,
ethical ways to improve efficiency, grow your business,
and advance your career.
This is Isar Meitis, your host, and we've been talking a lot on
the podcast about one of the top use cases of AI today, which is
creating visual content for business use cases.
Now, if you are creating images with AI for either fun or

(00:24):
presentations, it has actually become really easy to do.
You can do this across multiple tools, but there's still one
main issue, which is consistency.
And when you're creating images from a business perspective,
consistency is key, because you have either your brand
guidelines or your logo or the actual look of your product that
has to stay consistent.

(00:46):
Otherwise you cannot use the images that you're generating.
And so creating consistent imagery is a problem that requires
skills, knowledge, and the right tools.
Now, while mainstream tools such as ChatGPT, Midjourney,
or Gemini are good enough at the entry level, there are

(01:06):
other open source tools that actually provide a whole
universe of capabilities, addressing things that are
problematic with the available standard tools out there
and solving them in really beautiful ways.
The open source world is very much more of a geeky kind of

(01:26):
solution.
People are like, oh, I don't know how to use it.
So today we're gonna demystify a lot of the open source tools,
specifically when it comes to creating imagery, and
specifically when it comes to creating consistent,
brand-relevant, product-relevant imagery.
Our guest today, Luka Tišler, has been in the world of graphic
design and visual content for businesses for many years across

(01:48):
multiple roles in multiple companies.
But in the past two years, he's been focusing on helping,
training, and consulting businesses on
how to implement AI-based solutions for company-specific
branded content, which makes him a perfect guest to
guide us through this process.
I've personally played a little bit with open source visual

(02:10):
tools, but not a lot.
And so I'm personally very excited and curious to see what
Luka has to share with us.
And so Luka, I'm really happy to have you.
Welcome to Leveraging AI.

Luka Tišler (03:04):
Hey Isar, thank you for having me again.
Really excited to be here and talk about imagery, creation,
and open source.

Isar Meitis (03:12):
Thank you.
Yeah.
Before we get started: first of all, if you are joining us
live, either on Zoom or on LinkedIn,
thank you so much for joining us.
I'm sure all of you have other stuff that you could do on a
Thursday at noon.
If you are listening to this after the fact,
just so that you know, we do this every Thursday at noon
Eastern time, so you can come and join us.
There's always amazing people like Luka who are gonna share
use cases, and then you can chat with people in the chat and get

(03:34):
to know other people, network, as well as be able to ask
questions, which you cannot do if you're listening to the
podcast after it's been released as a recording.
Also, one last thing.
We are running the spring version of the AI Business
Transformation course already.
It's been running for two weeks.
It's been amazing.
There are 25 people in this specific cohort, and they're

(03:55):
learning things such as image generation, video generation,
content creation, data analysis,
data manipulation, writing, basic prompting, et cetera, et
cetera, as well as how to implement AI successfully
business-wide and how to come up with a business strategy for AI
implementation.
The next course, for which we just announced the dates, is
starting at the beginning of August. We run the

(04:16):
courses at least once a month, but most of them are private to
specific organizations.
So if your organization needs training, you can reach out to
me, but if you're looking for a course for yourself, you can
just sign up.
We just announced the dates for the next course.
You can go to our website or click on the link in your show
notes and get to the sign-up for the course.
But that's it from me.
Now, let's give the stage to Luka and let's talk about AI

(04:39):
image generation.
If you have any questions and you're here with us live, feel
free to ask them in the chat.
If you are listening to this after the fact and you're like,
oh my God, I wanna see this as well,
it's gonna be available on YouTube.
There's gonna be a link in the show notes for you to go see
this on YouTube.
But we will explain everything that we're doing and everything
that's on the screen so you guys can follow us, even if you're

(05:00):
driving your car, walking your dog, or running on the
treadmill while you're listening to your podcast.
So, Luka, the stage is yours.

Luka Tišler (05:09):
Thank you so much, Isar.
So, just a brief introduction: my name is Luka, of course, and
I've been in this business for almost 20 years.
I started in video production, but then I quickly moved to
video post-production.
Then I moved to compositing, digital compositing, VFX,
animation, motion design.
So everything visual, everything that's moving, it's

(05:31):
my domain.
About almost three years ago, I found this little thing
on a Discord called Midjourney, and I knew immediately that this
is going to change everything.
Half a year later, I resigned from my company.
I quit because there was not enough time for me
to learn.

(05:52):
So I spent a lot of time learning and discovering, and of
course Midjourney was not enough.
I wanted more and more and more.
So the next progression was, of course, Stable Diffusion.
Stable Diffusion is the first open source AI image creation
platform, kind of.
It's not as relevant as it used to be because there has been

(06:13):
some advancement in that area.
Yes, sorry.
And also

Isar Meitis (06:17):
serious leadership issues that almost got them
bankrupt, and a lot of other stuff.
So on the business side, they had an amazing
technological platform, but they weren't making great decisions on
the business side of things.

Luka Tišler (06:27):
Absolutely.
I agree.
I wish them well.
I hope their business is back on track.
But they did a huge thing for the open source
community, because they were the
first to open up the weights for the 1.5 and SDXL models.
An image model is something like a large language
model:
people teach them, and they teach them

(06:51):
by inputting an image and the description of that image, and
they put both as a pair into the magic box.
And they do this billions and billions of times.
And all of a sudden, now we have a machine that knows what is a
car, what is a banana, what are glasses, and can also draw
them, and it's getting better and better and better.

(07:14):
I remember the first tries were very awkward.
The pictures resembled the stuff you prompted, but it wasn't quite
right.
But I knew that this was going to progress.
So it slowly, slowly started to progress from 1.5 to 2.0, then
SDXL, and then 3 and 3.5.
Just before they released 3.5, there was a new kid on the block

(07:38):
called Flux.
People went berserk because it was truly a model that
understood natural language.
So you could start prompting as you speak.
And this was amazing.
This was incredible, because no image generator understood
natural language before.
We had to talk to it with tags, like:

(08:01):
a city, cyberpunk, dawn, a car, reflections, good quality, and so
on.
And, you know, we just had our fingers crossed to get those
images as good as possible.
But now, I mean, you can talk to all of them in natural language.
Do this, do that.
Give me this, give me that.
So the progression is amazing, and there has also been, among the

(08:24):
proprietary models, a race in quality.
So now we have Imagen, we have Ideogram;
we have many, many, many image models that are absolutely
gorgeous, and they produce amazing images.
And the best thing is that with the proprietary models, you
have control, to an extent.

(08:44):
So you can control your imagery, but mostly by prompting.
But open source has something more.
It started with Stable Diffusion, and it's called
ControlNet.
Those are differently trained models that guide what you want
the imagery to look like.
For that, you have to know how latent diffusion
(09:07):
technology works.
I'll keep it very short and simple.
It starts with noise, just nonsense, complete nonsense.
And then it starts taking that noise away.
With each step, it takes a bit of noise away and injects
some of your ideas.
That is conditioning, aka prompting.
So imagine you are lying in a field and staring into

(09:31):
the sky, watching clouds, and you can see the shape of a turtle
in a cloud, and you decide, okay, that's a turtle, right?
Latent diffusion works in a similar way.
It kind of says, okay, I can see something in there;
I'm gonna shape that into an image that I already know,
when it goes to the magic box and checks what, for example,

(09:51):
sunglasses are.
So, aha, okay, we have sunglasses; now it will shape this
noise with conditioning, aka prompting, and we will get a
nice result.
So, Stable Diffusion opened up the models, so we can
create our own models.
We can teach our own models,
we can fine-tune them, and we can create LoRAs (more about this a bit

(10:12):
later).
But now this noise can have constraints.
Now, all of a sudden, you can control it, let's say, with a
pose.
If you prompt "the man has his left leg and right hand in
the air," you probably won't get
that specific result, right?

(10:33):
Or you will get the right hand or the left hand.
Maybe the hand will have seven fingers, and so on.
But with ControlNets, we guide this noise.
We say to the model: okay, here is the image
of the man in this position, and via ControlNet we transfer
the pose to our image.
So we get the result that we actually want.

(10:56):
And pose is just one of many ControlNets.
We have lots of ControlNets, and they're different for each
use case.
Some of them are good for, like I said, poses, some of them are
good for architecture,
some of them are good for, well, a lot of use cases.
And they're getting better and better and better.
Yeah.
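For readers who want to try what Luka just described, here is a minimal, hedged sketch using the open source diffusers library. The checkpoint names are real published models, but treat the exact setup as an illustration rather than the workflow shown in the episode:

```python
# Minimal ControlNet sketch with Hugging Face diffusers (illustrative).
# Assumes a CUDA GPU and a pre-extracted OpenPose skeleton image.
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from diffusers.utils import load_image

# A ControlNet trained on OpenPose skeletons; other variants exist
# for depth maps, Canny edges, architecture lines, and so on.
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-openpose", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

# The skeleton guides the denoising, so "left leg and right hand in
# the air" comes from the reference pose, not from hoping the prompt
# is enough.
pose = load_image("openpose_skeleton.png")  # hypothetical local file
result = pipe(
    prompt="a man posing, left leg and right hand in the air, photo",
    image=pose,
    num_inference_steps=30,
).images[0]
result.save("posed_result.png")
```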

Isar Meitis (11:14):
What else?
Yeah.
So to add my two cents to what you just said, and to connect a few dots
of the things you mentioned:
one is how these models are getting trained.
The way they're getting trained is they took a gazillion images
and then noised them step by step.
So what the model is getting is the final photo, an existing image
of anything you can imagine.

(11:36):
And then it gets another level with 2% noise, and another level
with 6% noise, and another level with 8% noise, all the way to a
hundred percent noise.
So that's how they teach the model to basically then reverse
the process, to de-noise from a hundred percent noise to an
outcome, because it has seen multiple noising levels of any
image that you can imagine.

(11:57):
So this is how it works.
And ControlNets allow you to guide the process beyond the
prompt itself by giving it references that it understands,
whether about position, lighting, reflection, or
outlines of things: literally anything you can
imagine that is a part of creating an image, you can then
use. Think about it as an additional reference for the

(12:18):
denoising process, or for the creation process.
It doesn't really matter how the creation happens; you're just
giving it more references than the written reference that you
had in your prompt.

Luka Tišler (12:28):
Exactly.
That's a very important part, and I'm sorry I skipped it, but
it is a noising and denoising process.
It goes both ways.
So you are 100% right.
But it's not just open source that has this control.
Lately we've seen development in proprietary
models as well.
For example, Midjourney has this amazing thing called omni

(12:50):
reference.
You can actually put into the platform the image
you want to recreate, and it'll
recreate it very, very well.
You can input your character, so consistency is kind of
solved.
We'll probably never have 100% consistency, but I mean, it
depends what you're working with.
Most AI imagery lands on digital media,

(13:14):
and on digital media we usually have smaller screens, right?
And the human eye just... take, for example, Margot Robbie.
Margot Robbie is very, very famous in the AI community because
everybody is testing with her face.
And if you see a person similar to Margot Robbie, you will just
tag her as Margot Robbie, and that's it.

(13:34):
It doesn't have to be 100% consistent.
It's good enough for our brain, but on a larger scale, it's
just not good enough.
Then we have to use Photoshop, then we have to use inpainting.
We get about 80 to 85% done with AI, but the last 10 or
15% we have to work on manually.
And this goes for all imagery, actually.

(13:56):
If you want to bring your imagery to another level, you
will have to add the human touch in the process of creating
the image.
So Photoshop is not going anywhere.

Isar Meitis (14:08):
For now. I'll say something about the whole
Photoshop thing, and then we can dive into the actual examples.
Photoshop is not going anywhere for now;
I agree.
I think what Canva did to graphic design at the lower
levels, AI will do to Canva, because my gut feeling tells me
that these two universes are going to merge.

(14:30):
And you might still use Canva, but you will use AI in Canva
versus dragging and dropping and using templates.
You will have an idea in your head and you will request it,
and it will be generated on the fly.
And I think what's gonna happen is features that you currently
have in Canva will migrate into ChatGPT, Gemini, Midjourney,
and so on,
where they will understand layers and will understand text; they

(14:52):
will understand grabbing a component of the image and filling
out the background where you moved the thing from, in a very
intuitive way.
And there is very little doubt in my mind that's gonna happen in
the next few months, meaning this merging of design and
AI image generation into one unified environment.

(15:13):
And I do think that the only gap between that and the
professional world is some additional tools that just don't
exist in the simple tools.
And I think it's just a matter of time until they're there as
well.
So I'll add two things that are very obvious to people who are
on the professional side.
One is upscaling.
So like you said, if you wanna print a billboard that will
cover the side of a building, the resolution that Midjourney

(15:36):
gives you is just not good enough.
But there are already amazing upscalers today, and once that's
built into Midjourney, then that problem is solved.
And the other is really small, fine things:
specific textures, specific fonts versus just random fonts,
and things like this;
masking of specific aspects.
Those of you who work in this field know what I'm talking about.
And so I assume these universes will merge together, and there's

(15:59):
gonna be more professional tools with AI built into them,
like Photoshop, which already has, you know, its version of AI
built into it.
And there's gonna be the more basic tools, which are either
gonna be Canva or working in the actual tools themselves, such as
Gemini, Midjourney, et cetera.

Luka Tišler (16:16):
Absolutely.
And we can also see the rise of agents.
So you don't prompt anymore.
You talk to the machine.
Yeah.
So do this, do that, change this, change that.
And I think this is one of the biggest things that is
happening in AI right now,
because we won't be producing imagery and videos with our

(16:36):
mouse.
We will guide the model with our language, with our speech, and
all the changes will be instant.

Isar Meitis (16:44):
Yeah.
Shall we jump into examples?

Luka Tišler (16:47):
Yeah, yeah.
What do you wanna see?

Isar Meitis (16:50):
I think it would be interesting, first of all, for people
to see ControlNet, what exactly it is and how it works,
just to give examples of things.
And I think doing the same thing in Midjourney would be useful
as well,
so showing kind of like the omni reference and how that works.
And I think both these things will show people stuff that they may or
may not know exactly how it works, and maybe they haven't

(17:13):
experimented with that before.

Luka Tišler (17:15):
Absolutely.
So for example, here is Midjourney.
I just created this mascot, this image, for a language school
that I'm working with, and I used...
let me see...

Isar Meitis (17:31):
So for those of you who are not seeing, as
Luka is searching, and for those of you just listening: we're
looking at Midjourney, which is one of the better AI image
generation tools.
And we're looking at a yellow frog or lizard that is
teaching in a classroom as the mascot.
And it's wearing a blue hoodie with the logo of the

(17:52):
school.
I think what we're going to see is how it was actually created.

Luka Tišler (17:57):
I don't have this image at hand right now.
Let me just close the Lighthouse Academy website.
Okay.
Subtle, subtle.
Yeah.
But let's try to create something with Margot Robbie from
scratch.
First of all, no one prompts
from their head anymore.

(18:17):
So everybody's using LLMs.
So: give me an image prompt for a woman in a red dress
posing in front of the Eiffel Tower, and we're going to make

(18:41):
it editorial.

Isar Meitis (18:44):
Very quick.
So again, for those of you not seeing: we're writing this
prompt in ChatGPT, and it is going to give us better prompts,
with a lot more details and qualifiers than if
we wrote the prompt ourselves.
By the way, there are multiple custom GPTs already created and
available that are very good at that, built to give you

(19:06):
highly detailed prompts for image generation.
But as you'll see in a minute, even just writing what Luka just
wrote will give you four long sentences' worth of
details in a prompt that you can then paste into whatever it is
that you're using.
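If you prefer to script this step rather than type it into the ChatGPT window, a minimal sketch with the OpenAI Python client could look like the following; the model name and system message are assumptions for illustration:

```python
# Hedged sketch: expand a short idea into a detailed image prompt.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
response = client.chat.completions.create(
    model="gpt-4o",  # assumed model name; use whatever you have access to
    messages=[
        {"role": "system",
         "content": "You write detailed, single-paragraph prompts "
                    "for AI image generators."},
        {"role": "user",
         "content": "An editorial photo: a woman in a red dress "
                    "posing in front of the Eiffel Tower."},
    ],
)
print(response.choices[0].message.content)  # paste into Midjourney etc.
```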

Luka Tišler (19:22):
All right, so our prompt is... I won't read it,
because it's just too long.
But anyway, we will set the settings to, let's say, 16 by... oh,
let's go with 3 by 4.
Let's raise stylization and variety.
Everything is okay. Version...
Ooh, I experimented with Midjourney
version three.

(19:44):
Okay.
Stylization a bit higher, and we are good to go and press... oh no.
We will import our image of Margot Robbie that I took from the
internet. We didn't say anything about Margot Robbie in the prompt.
We just said a woman, a stylish woman.

(20:07):
Yeah.
And we dragged and dropped the image into omni reference,
with omni strength.
You can use it from zero to 1,000;
I will use a low value, about 300.
And now... oh, we can also insert image prompts and style
references.
So if we don't know how to describe the style, we can just

(20:29):
drag and drop an image of a style we like, or
use an image prompt so we kind of get the composition that we
want from another image.
Okay.

Isar Meitis (20:43):
So again, just to explain what these are, going
back to our initial conversation: when you are
prompting these tools, in addition to the written prompt,
you can drop in images and use them for different purposes.
So you can use an image as a reference for a person, a
reference for a style, or a reference for the composition of

(21:03):
the actual image itself.
And when we say style, this could be
very broad, like cartoonish versus realistic, but it could
go into way more detail, like the color palette that it's going
to use, and so on.
And you can do all of that, definitely in Midjourney, which now
has it broken up into different levels and tools.

Luka Tišler (21:24):
And here we are.
We get a woman that is similar to Margot Robbie, but she is not
Margot Robbie.
Yeah,
because our reference was set very low. But now we can use
everything, and we can up the omni strength, so our woman will

(21:46):
be a bit more similar to Margot Robbie.

Isar Meitis (21:51):
So again, for those of you who are not watching
this: there's a slider next to the image that you upload, and
you can move it left to right between zero and a thousand.
You can also enter it as a text parameter, but Midjourney
has now moved it to make it more user-friendly, where you can
just move the slider around, and you can control how much weight
the image that you uploaded will have in the output. And if you

(22:14):
bring it closer to a thousand, it's gonna be very similar to
the image you uploaded.
Again, just to explain to people who are not watching: the image
that we uploaded is just a face.
You can't see the entire person.
The prompt is for a woman in a red dress,
hence we see the entire woman, and it still knows how to pick
the head of the person from the other image and apply it to the

(22:36):
full body shots that we're doing right now.
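For reference, everything the sliders do can also be written as parameters at the end of the prompt. As of Midjourney V7, the omni reference flags are, to the best of our recollection, --oref (the reference image) and --ow (its weight, 0 to 1000); check the current documentation before relying on them. A full prompt might look like:

a stylish woman in a red dress posing in front of the Eiffel Tower, editorial photo --ar 3:4 --oref https://example.com/face.png --ow 300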

Luka Tišler (22:41):
So the thing with Midjourney is that after the prompt,
it sets a couple of parameters.
And you have to be very careful:
if you change the settings, you always have to delete the
parameters at the end of the prompt and use the new settings that
you are giving it.
Because right now we
created four more images, but the omni reference weight was

(23:05):
left at 300,
because parameters behind the prompt are stronger than the
preferences that we set through the website.
So each time we change our preferences, we have to delete
those from the prompt and set
everything else through the website. And now the omni reference

(23:27):
is high enough, and we will get a woman that looks like Margot
Robbie. We created two sets of images,
but I didn't set the omni weight high enough.
And while we are waiting, we can actually check out ChatGPT,

(23:50):
because ChatGPT introduced images in March, I think.
And now we can ask it to create the image.
We will use the same prompt.
So here is the prompt, and I will ask GPT: create an image based on

(24:11):
this prompt, and I need a widescreen aspect ratio.
GPT images use a different kind of architecture.
If we check out how... oh, there it is.

(24:35):
And yeah, she looks much more like Margot Robbie.

Isar Meitis (24:39):
Yep.

Luka Tišler (24:40):
So, as we said at the beginning, latent
diffusion works by adding noise.
Yeah.
Well, ChatGPT uses a different kind of structure, and I
don't know exactly how it works, but it's adding the details from
top to bottom.
Yeah.
Yep.
You can't communicate with Midjourney like communicating

(25:02):
with GPT, because Midjourney is not a large language model.
It knows images, but the images that it's creating are much,
much higher quality than the images that we're getting from GPT,
because GPT's first priority is language, or words.
They also did a good job training it to create images,

(25:22):
but it's not as good as Midjourney quality-wise.

Isar Meitis (25:29):
There's an interesting question... while we
wait for ChatGPT to generate the image,
you know what, before I jump to the question, I wanna
kind of go back to what you said as far as the differences
between ChatGPT and Midjourney,
and then we're gonna go to ControlNet and Stable Diffusion,
or maybe ComfyUI (not to scare people away too much), but pros
and cons of the different systems.
From a professional perspective, Midjourney still

(25:52):
provides better results.
Yes.
From a day-to-day usage perspective, ChatGPT generates
good-enough results in many cases, and getting to those
results is easier.
Why?
Because it understands context, and you can provide it a lot
more information, because it's a complete
conversation.
It's not just: here's the prompt to create the image, and now I

(26:14):
wanna change something in the image, so
I need to go and write
a completely new prompt that will start from scratch.
Basically, it understands the conversation, understands the
context.
You can upload your brand guidelines to ChatGPT as an
attachment, and it will know how to use them in the image,
which Midjourney
does not know how to do, because it does not know how to read PDF
documents.
And so things like that are the benefits of

(26:35):
using ChatGPT.
Another advantage of Midjourney is speed.
Midjourney generates four images in about 10 seconds;
ChatGPT creates one image in about a minute.
And so if you want to iterate a lot, doing it with Midjourney is
gonna work
a lot faster.
And as I mentioned, Midjourney has built a lot of tooling
around its image generation, with different sliders and bars and

(26:56):
controls of different things, and being able to reference
previous images, and being able to reference previous prompts,
and being able to add all these different things, where that does
not exist on the ChatGPT side.
If you're a beginner, I think starting with ChatGPT is
easier.
If you're a more advanced user, you can get better results
using Midjourney. Either way,

(27:16):
I think for the average user, for generating
images for presentations and/or basic social media stuff, both
tools are definitely adequate.

Luka Tišler (27:28):
Yeah, absolutely.
And you're spot on about ChatGPT adding different pieces together;
it knows how to create images
with those pieces. You can upload, let's say, an
image of a chair, an image of a wardrobe, an image of a bed, an
image of a picture, a painting, and you drop

(27:49):
those images in ChatGPT and say, create a room out of these
images, and it'll be spot on.
It'll be perfect.
But as you said, it has its limitations, and for professional
usage it's not good enough, because we need to create images
fast
and at very, very high quality.
Yeah.
So ChatGPT is just not good enough.

(28:10):
Also, neither ChatGPT nor Midjourney can batch render images.
Yes.
So you can't just say: create 100 images based on this prompt,
and I will select the perfect one. An open source tool can

Isar Meitis (28:25):
do that.
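As a concrete illustration of that batch point: with an open source pipeline you can loop over seeds unattended. A minimal sketch with the diffusers library (the model name is illustrative):

```python
# Hedged sketch: batch-render 100 takes of one prompt locally.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

prompt = "editorial photo, woman in a red dress, Eiffel Tower"
for seed in range(100):
    gen = torch.Generator("cuda").manual_seed(seed)  # reproducible takes
    pipe(prompt, generator=gen).images[0].save(f"take_{seed:03d}.png")
# Review the folder afterwards and keep the best one or two.
```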
So let's really jump to that.
Let's jump to maybe ComfyUI, just to show people what it is.
And I know it might scare people in the first minute, but I think
we can explain what it is and how it works, and maybe demystify
it a little bit.

Luka Tišler (28:39):
Yeah, this is my workflow.
It's a bit more advanced.
We can check out the... where is it... so again,

Isar Meitis (28:46):
for those of you who don't see the screen, what
ComfyUI is: think about a flow chart of a process, where at every
step of the flow chart you have multiple tools that you can
connect that
impact how the image will be generated.
So it's not post-processing; it's actually how the image will
be created.
And those of you who can see, you can see there's, I

(29:08):
don't know, 20 different boxes, each one with different
components, and lots of lines combining them,
so that it looks like a web of different things.
But each and every one of these boxes adds another layer of
control over the
process of how the image is gonna get generated.
And by combining them together, you can be a lot more specific

(29:29):
about how the output will look.
Going back to consistency: when you can control the lighting,
the angle, the pose, the graphics, the style, the
entry images, the size, literally every aspect of how
the model actually works, at the model level, you will get a lot

(29:50):
more consistent outputs.
Which, going back to what Luka said: if you now want to
mimic a photo shoot where you're gonna bring a model,
let's take your example of a girl in a red dress in Paris, in
front of the Eiffel Tower.
You're not gonna take two pictures if you're doing an
actual photo shoot.
You're gonna take 250 pictures, and then pick the two
best ones, and then work on them in Photoshop, and then pick the

(30:10):
final one based on that.
And you can do that in ComfyUI with no problem, with a much higher
level of consistency and with a lot more control over exactly how
it's gonna look.
So think about a professional photographer.
You don't just take random pictures; you set up the camera
the right way, with the right aperture value, with the
right lighting, with the right camera, with the right

(30:32):
ISO parameters. All the stuff that you want to control
in the image, you will set up in your camera.
You don't just randomly shoot and click.
And this is kind of what you can do in the digital world
with ComfyUI and open source tools.

Luka Tišler (30:47):
That was so beautifully said.
I can give another comparison.
If you just want to use your computer, you will probably buy a
Mac.
Yeah.
But if you want to know every nitty-gritty detail, you will
buy the computer in parts and you will assemble it
yourself.
Yes, because you have control over each part:

(31:08):
what part, what its compatibility is, where you will
put it.
And you know this thing by heart.
Not just opening a computer and starting to work, but knowing
everything that's happening inside, every process.
So ComfyUI is the same.
For example, here is a node where you define the checkpoint,
or the model, you will be using.

(31:29):
This is the LoRA node, for specially trained small
models that you can train yourself.
We have the CLIP loader, we have VAEs, we have resolutions.
We have so many things that you can toy with to see what the
results are.
So I would say that ComfyUI is for people who want to know more

(31:49):
and,
most of all, are very, very curious, because if you don't
have curiosity in your genes, under your skin, then this will
be just too overwhelming.
I understand people who don't want to use ComfyUI.
They just want to produce. But, well, I need a bit more.
I need control.
I need to know how I will
construct the image, and I can change the schedulers.

(32:12):
I can change the samplers.
I can change anything, and I can influence my image based on all
these parameters.
Now, in Midjourney you have a couple of parameters; here
you have thousands, and you can combine them, and you can do,
well, a lot of things.
But what we see right now
is a fairly complex, not too complex, but fairly complex

(32:36):
workflow.
So maybe it would be better if I show you one that's a bit
simpler.
Exactly.
Yeah.
So for example, let's take a ControlNet workflow. And I have
a couple of missing nodes.
That's okay...
not nodes, but... yeah, so let's just switch a couple of things.

(32:58):
VAE...
not pruned, but yeah, that's okay.

Isar Meitis (33:11):
So again, for those of you who are not watching: now
we're looking at a process that has four or five different nodes,
versus the
30 that were on the screen before. Nodes are basically building
blocks.
So think about them like Legos.
You can build a simple Lego set with a bunch of parts, or you
can build a really complex Lego set with many more parts.
And the more you know what the parts do, the more you can use

(33:34):
more Legos to build more sophisticated stuff.
So you don't have to start with the 30-step process.
You can start with a four-step
process and build something that will be a lot easier to
construct, but will still provide you a lot more control
than doing it in, let's say, Midjourney, and definitely more than
in ChatGPT.
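For the technically curious: a ComfyUI workflow is just a JSON graph of nodes, and a local install exposes a small HTTP API for queueing it. Below is a hedged sketch of a minimal text-to-image graph; the node class names are ComfyUI built-ins, while the checkpoint filename and server address are assumptions about a default local setup:

```python
# Hedged sketch: queue a minimal ComfyUI text-to-image graph via its API.
import json
import urllib.request

# Each key is a node id; links are ["source_node_id", output_index].
workflow = {
    "1": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": "sd_xl_base_1.0.safetensors"}},
    "2": {"class_type": "CLIPTextEncode",  # positive prompt
          "inputs": {"clip": ["1", 1], "text": "a woman in a red dress, Paris"}},
    "3": {"class_type": "CLIPTextEncode",  # negative prompt
          "inputs": {"clip": ["1", 1], "text": "blurry, low quality"}},
    "4": {"class_type": "EmptyLatentImage",
          "inputs": {"width": 1024, "height": 1024, "batch_size": 1}},
    "5": {"class_type": "KSampler",  # the sampler/scheduler Luka mentions
          "inputs": {"model": ["1", 0], "positive": ["2", 0],
                     "negative": ["3", 0], "latent_image": ["4", 0],
                     "seed": 42, "steps": 25, "cfg": 7.0,
                     "sampler_name": "euler", "scheduler": "normal",
                     "denoise": 1.0}},
    "6": {"class_type": "VAEDecode",
          "inputs": {"samples": ["5", 0], "vae": ["1", 2]}},
    "7": {"class_type": "SaveImage",
          "inputs": {"images": ["6", 0], "filename_prefix": "demo"}},
}
req = urllib.request.Request(
    "http://127.0.0.1:8188/prompt",  # default local ComfyUI address
    data=json.dumps({"prompt": workflow}).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
print(urllib.request.urlopen(req).read().decode())  # job id on success
```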

Luka Tišler (33:51):
Correct.
So what we did now is: okay, everything is ready, we can run
it.
I inserted the image reference,
and I expect my new image will have a person that stands
exactly the same way as the woman in the photo. But I have to give

(34:13):
it.
So,

Isar Meitis (34:14):
So again, this is a tool that is not for somebody
who just wants to play around and create images.
This is a professional tool: if you do this for a living, you
can use this.
Or, like Luka said, if you're just really curious about image
generation and you want to experiment with
doing things that are beyond what the image generation tools
can do, you can do that as well.
And again, the cool thing here is that you literally control

(34:35):
every single thing.
And one of the cool things that we talked about several times in
this episode already is ControlNets, right?
So as an example, you can use a
model of a person or a cat or anything, and then have the
output resemble it in whatever way you want.
So either resemble it in the texture, or resemble it in the

(34:55):
fur that it has, or resemble it in the pose that it's standing
or walking or jumping in, and so on.
And it will know how to do that, because there's one component
that you control, which is
the pose of the person or the texture of the image and so on,
where it will follow what the ControlNet tells it to do.

Luka tisler (35:15):
Right now, ControlNet is not working.
I don't know why.

Isar Meitis (35:19):
Yeah.
The beauty of live demos.
So let's do a quick summary of everything we talked about.
Sure.
And then we'll see if you have any final thing to add.
Yeah.
There are really three main channels today, right?
One is the more
professional yet readily available proprietary models,
like Midjourney.
Then there is the open source universe, which has tools like

(35:42):
Stable Diffusion and Flux, which have multiple ways to use them,
including through ComfyUI, which adds another layer
of control over just using the model itself.
But you have to have an open source model to do that, because
what ComfyUI does is literally control
a huge number of parameters within the model itself.

(36:02):
That's literally what it allows you to do.
And then there,

Luka Tišler (36:05):
There are, if I may interrupt you, there are two things.
Yeah, yeah, for sure.
First of all, you need a powerful computer.
Yes.
You need a graphics card that has a lot of VRAM, so your basic
graphics card is not good enough.
It has to have at least eight gigabytes of VRAM, or more.
The more, the better.
And the second one is a steep learning curve.
So it's not something that you just toy with.

(36:27):
You have to invest time to learn how to use those tools.
But when you start using them, you become
invincible.
You can create anything you want.
If, of course, things are working.
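If you're wondering whether your machine clears that eight-gigabyte bar, here is a quick check, assuming an NVIDIA GPU and PyTorch installed:

```python
# Quick VRAM check before investing time in local image models.
import torch

if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    vram_gb = props.total_memory / 1024**3
    print(f"{props.name}: {vram_gb:.1f} GB VRAM")
    print("Meets the ~8 GB guideline" if vram_gb >= 8
          else "Below the ~8 GB guideline")
else:
    print("No CUDA GPU detected; local diffusion will be slow or impossible.")
```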

Isar Meitis (36:40):
Yes.
Well, I think they're working.
It's exactly the same thing as any complex machine, right?
There's a chance of getting one thing wrong, and then the output is
not exactly what you wanted.
The more complex the machine, the more chances you're gonna
get something that is not what you meant.
And then really the other channel is going to a tool
like ChatGPT or Gemini, which both can create
good-enough images for many day-to-day things, but with a

(37:01):
lot less control, a lot fewer parameters, and a lot less
capability to know exactly what's gonna happen.
For many daily use cases, like creating presentations or an
image for a social media post, that is the easiest way to go.
If you are a professional designer, or if you need to
create stuff that is more consistent and so on, going the

(37:21):
open source path is just gonna give you a lot more
capabilities.
Luka, if people wanna follow you, learn more
about what you do, how you do it, or hire you, what are the best
ways to do that?

Luka Tišler (37:35):
Well, on social media I'm not spread around much.
I mainly use LinkedIn, and Instagram here and there.
So you can find me on LinkedIn: send me a message, connect
with me, and if you have any questions, I will be happy to
answer them.

Isar Meitis (37:49):
Awesome.
Luka, thank you so much.
I think this was very educational and valuable to
people.
I think most people,
at least people that I know, and I know a lot of people who are
playing with AI, do not know a lot about the visual side in
general, and definitely not about the open source models and
how they differ and what benefits they provide.
So I'm sure this was very helpful to people.

(38:10):
Thank you so much.
To everybody else who has joined us:
I appreciate you being here,
I appreciate you spending the time with us and being active in
the chat.
And until next time, have an awesome rest of your week.