All Episodes

May 6, 2025 54 mins

Learn how to turn a simple text prompt into a finished song. From Suno to Udio to Riffusion, we look at the AI tools that are changing music creation, making it easier than ever to produce unique tracks with no musical experience needed. Generate custom songs for any occasion in seconds, and explore the best AI music creators and text-to-music platforms.


Get Phlo's resources

https://aietalk.com/music


Get your free curated AI report from Anetic Daily Intelligence

Sign up at: https://www.anetic.co


Get FREE AI tools

pip install tool-use-ai


Connect with us

https://x.com/ToolUseAI

https://x.com/MikeBirdTech

https://x.com/YoungPhlo_


00:00:00 - Intro

00:00:46 - Phlo's Journey into AI Music & Python Automation

00:03:12 - The Evolution of AI Music Generation: ChatGPT Moment for Music?

00:08:24 - Comparing AI Music Tools: Suno vs. Udio vs. Riffusion

00:12:47 - Creative Control & Unique Features in AI Music Generation Tools

00:20:12 - Live Demo: Crafting an AI Song with Suno & Prompting Techniques

00:45:45 - Phlo's AI Music Resources, Workshop Materials & Further Learning


Subscribe for more insights on AI tools, productivity, and AI music.


Tool Use is a weekly conversation with AI experts brought to you by Anetic.


Episode Transcript

Available transcripts are automatically generated. Complete accuracy is not guaranteed.
(00:00):
AI might co-produce the perfect soundtrack for your life.
We've always believed that music was the most human form of expression, so why does AI-generated music move us to tears?
Technology once seen as mere imitation is now evolving into a creative partner, amplifying musical ideas for everyone.
Forget needing years of training or million-dollar studios; AI is eliminating both barriers overnight. These tools are letting anyone

(00:21):
translate their vision, their mood, their craziest ideas into sound. What happens when the gap between what you can imagine musically and what you can create shrinks to almost nothing?
Welcome to episode 38 of Tool Use, the weekly conversation about AI tools and strategies to empower forward-thinking minds, brought to you by Anetic.
I'm Mike Bird, and this week we're joined by Phlo, a musician, tinkerer and staple in the AI engineer community. Phlo, welcome to Tool Use.

(00:43):
Thanks, man. Thanks for having me on. I appreciate it.
Absolutely. So do you want to tell me about your background, how you got into AI?
Yeah. It mostly comes from me tinkering around with Python while licensing music online. A huge part of that process is very tedious: you're doing a lot of the same tasks, uploading artwork and music to multiple websites, and then

(01:04):
for me it was also launching advertising around that music. So much of that process, like, it was taking me longer to upload music than it was to actually create the music.
So I started tinkering around with Python. I asked around; I had a couple of technical friends, and they told me Python would be good for scripting stuff like that. So luckily, the language of AI is mostly Python.
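That upload grind is exactly the kind of thing a short script can queue up. Here is a minimal sketch of the idea, assuming a folder of MP3s with same-named cover art; the function name and the site names are hypothetical placeholders, not any real distributor's API:

```python
from pathlib import Path

def build_upload_jobs(folder, sites):
    """Pair each MP3 with same-named cover art and queue one job per site.

    Illustrative only: a real script would hand each job dict to
    whatever upload client the target site actually provides.
    """
    folder = Path(folder)
    jobs = []
    for audio in sorted(folder.glob("*.mp3")):
        art = audio.with_suffix(".png")  # cover art with the same stem
        for site in sites:
            jobs.append({
                "site": site,
                "audio": audio.name,
                "artwork": art.name if art.exists() else None,
            })
    return jobs
```

From there, each job would be passed to the per-site upload step, which is where the tedium he describes actually lived.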

(01:26):
So I had a little bit of a head start. But essentially, I actually had something prepared for this: I wrote a tweet. I'm almost coming up on my second anniversary of joining the Latent Space podcast Paper Club. That's really where I cut my teeth and learned how to decipher a

(01:48):
lot of the fluff and hype from true machine learning models, the stuff that really matters, I guess is what I'm trying to say.
So it was that, and then I also joined a community called Nous Research before it was even called Nous Research, and learned a lot there too, like fine-tuning models. That's kind of where I got my start. And this was around August of

(02:09):
2023. I was paying attention to the AI space before that; I was really into TTS models and voice cloning. They didn't really have music generation yet, but they did have tools where you could generate drum patterns, or loops if you will. And I had been paying attention

(02:31):
to that since early 2023, along with ChatGPT as it released. So I had been paying attention to the space, but I had not really dove in until around this time almost two years ago. That's kind of how I really got into the AI space and started creating content and, I guess, being known as an AI engineer.
Love that, man.

(02:53):
Yeah. And you mentioned two of my favorite online communities. The Latent Space Discord has a lot of alpha in there every Wednesday with the Paper Club, and Nous is just one of my favorite organizations pushing the boundary in open source and really trying to make sure it's disseminated to everyone, letting us all benefit from it.
So you mentioned the evolution of paying attention to text-to-speech and other aspects of music generation. How has music generation evolved

(03:15):
with AI? Has it been one of those things like the ChatGPT moment, where it just exploded out of nowhere, or have you noticed a gradual progression?
Yeah, I definitely think Suno and Udio coming online and making their services available was the ChatGPT moment for music. I think Stable Audio Open, the model also being openly available so that you can fine-tune

(03:37):
it and download the weights, was another moment where I saw a lot of people tinkering who otherwise would not have. I think that was a special moment for me, because I got to tinker a lot myself at that time.
But in terms of mainstream people really paying attention to the space, those were definitely the big moments. And I would also say, I think I was a bit early, but one of the moments that really piqued my

(03:58):
interest was a guy named Roberto Nickson, who showed a demo of how he was cloning Kanye West's voice at the time. And he made his own cover song, really an original song, using Kanye West's voice. And he showed the entire process: the model that he used, where he downloaded it from, how he trained the voice by getting audio clips of Kanye West.

(04:20):
It was insane. That's what really piqued my interest a couple of years ago.
But I would definitely say the ChatGPT moment of music was Suno. Suno and Udio competing back and forth, lowering their prices, making sure they offer different features, those were definitely huge moments.
I will also say, because I thought about this recently, I think the music space

(04:45):
probably is not as profitable as regular text models or multimodal models, and I think the progress there is a lot slower. I also think there's all kinds of, I don't want to say moral dilemmas, but I guess the word I'm trying to think of is,

(05:05):
there's a lot of uncertainty around what it means to train these models and whether they're using copyrighted music or not. There are some lawsuits going on right now. So I think the progress is just a lot slower, because people are tiptoeing around, for lack of a better word.
But in terms of progress, initially it was like an explosion, because this is a new technology,

(05:27):
and I would say they're now more so settling into: how do we integrate this AI with the tools that already exist to create music? That's kind of where I think we're headed. But as far as progress goes, that was definitely a huge moment for me.
Yeah. Getting everything integrated is going to be massively important, because it's easy for people to go use one tool and get one thing. That's how I kind of use Suno. But being able to enter into a

(05:48):
workflow, say you want to... actually, we'll touch more on that later. What I am curious about is the current state of AI-generated music, because we've talked about Suno making a complete new song from scratch. But I also remember the first time I heard, I think it was Frank Sinatra singing Lil Jon, and I was just like, this is incredible. It was one of those moments where, all of a sudden, every barrier just got broken down, and we can bring any song that we want into existence.

(06:09):
So what are the different avenues people can play with to actually explore AI-generated music?
I would say the most popular tools right now... I think Suno is actually more popular than Udio, but those are the main tools you want to go to, and they're accessible to anyone. If you have a Gmail account, an Apple account, an e-mail account, period,

(06:30):
you can go sign up for these services for free and generate a couple of songs every single day. So those are the services I think are most accessible.
If you are, let's say, already a producer or an artist, a musician, and you want to be able to feed some of your own creations into those services, you can. But I think where the real control comes in, if you're already used to working with a digital audio workstation like Logic Pro or GarageBand or

(06:53):
FL Studio or Ableton Live, I think you also could benefit from really playing around with some of the open source models. So the open source models I have listed here are Stable Audio Open, and we have YuE. The foundation, or research lab, that created this

(07:13):
model says that it's pronounced "yeah," even though it's spelled YUE. This is another model that you can play around with on your own hardware: you can download the weights, play around with it, and fine-tune it at home. DiffRhythm is another model.
So I think those are the outlets if you're hardcore, you're really into this space, and you do have good

(07:35):
hardware. When I say good hardware, I mean typically an NVIDIA GPU, or even... some of these, like Stable Audio Open, are also compatible with MLX. So if you have a really beefy Mac, a MacBook Pro with a lot of unified RAM, you also can run these models at

(07:55):
home. But to refer back to the initial answer, the most accessible are going to be, I would say, Suno and Udio. Stable Audio actually has an online version, but it's mostly for producers, truth be told. And then Riffusion is a newer model, or service, that stepped onto the scene and is really, really useful.

(08:18):
So I think those are the places where you can get into these things.
Nice. I've only used Suno personally; Riffusion I'm hearing about for the first time from you. So would you mind going through the differences between them? Are they just the same thing repackaged? Do they have different strengths or weaknesses? What would drive someone to choose one over the other?
The way that I think about Udio versus Suno specifically, because those are the top two, is that Udio's more like a songwriting partner in my

(08:40):
opinion. Just to answer the question initially, they're pretty much all the same thing: you type in text or you deliver audio, and then you get back audio, is essentially how it goes. And sometimes you can combine the two. You can combine lyrics with a guitar lick that you played, or a piano riff that you created on your own, or you can provide just the lyrics, or you

(09:01):
can provide literally a simple text prompt like, hey, create a birthday song for my daughter. All of them at their core are the same. The mechanics behind them and the UIs are a little bit different, so that's where you get into personal preference, and that's what I'll explain here. But for the most part, they produce the same deliverables, if that makes

(09:21):
sense. But the models behind them are different. And that's the thing too, with these proprietary services and models like Udio, Suno and Riffusion: they do not make public, and I think for fear of litigation, to be honest with you, exactly what their software stack is, especially when it comes to the models. But there's a lot of speculation around

(09:43):
them. Suno in particular was created by the same team that created Bark, as well as Chirp, I think, is the name of their other model. So you can do a little bit of background research and see where these things come from. But what they are today is not publicly available right now, at least in terms of the models themselves.

(10:05):
But this is how I like to think about them. I break them down as: Udio is more like a songwriting partner, in my opinion. Especially if you're on the free tier, if you're not paying for an Udio account, it generates 32-second chunks of music. I think they actually have an experimental model that's available now even if you don't have a paid account. But anyways, that's pretty much the core of Udio: you kind of create and craft the

(10:30):
music as you go, in tune with the model, if that makes sense.
Whereas I think of Suno more like an in-house music producer. If you are a songwriter, or someone who doesn't have a lot of experience with crafting music, or maybe you don't know the technical terms to describe BPMs, the key the song should be in, or chords, or how to put those together, Suno pretty much takes care of

(10:52):
that for you. And the way that I hear the music that comes out of Suno, especially these days, is very Top-40-ish and polished. I think you have to go above and beyond, out of your way, to get it to create anything that's very grungy or unpolished, if that makes sense. And recently they've improved the song structure and some of the timing of

(11:15):
Suno. But anyways, that's how I think about Suno in terms of use cases when I'm looking to create music.
And Riffusion I think of as an AI sound designer. The UI is actually very interesting, but I believe the model behind it creates a spectrogram of the music as it's generating the audio.

(11:38):
Whereas Suno and Udio are either transformer or diffusion based. I think Riffusion is also diffusion based, but they just go about creating the music a little bit differently.
For all intents and purposes, though, you log into these services, you have a little box where you type in some text or some lyrics, and maybe a mood for the song and a genre, and they spit out music.

(12:00):
So they are essentially the same in that sense, from a user-experience standpoint.
Nice. And yeah, that aligns with my experience with Suno, because I have very little musical background and I kind of just hop on and say, hey, tonight there's a hockey game, the Canadiens are playing, and for my dad, you know: make a trash-talking song, rip on Montreal. And then I just make it rock, and then I send it to him and it's done. And it's something that's

(12:20):
enjoyable and fun. It has no real longevity to it, but it's just a fun little piece.
There was also a Suno song I heard, I think on one of their Creator Spotlight libraries or something, and it was a mix of blues, ska and reggae. And there was a portion of it that was like a live performance, and the audience actually sang a line. And that blew my mind, because it was just a completely different paradigm from having a single singer, or maybe even a

(12:43):
duet, go through. And this was just something completely outside what I expected.
What other unique features or control do people have over the music they create through this more introductory level of music generation?
Yeah. Suno has greatly improved this. I remember saying on stage, when I did the workshop last year, that creators are going to want more,

(13:07):
even people without the music experience, who maybe don't have a music background, are going to want more control than the models and the user interfaces give right now. And I think Suno's done a great job of building toward that. I don't think it's a full digital audio workstation in the sense of an FL Studio or Ableton Live, but they're getting there. They're getting much closer to where you generate the song

(13:29):
start to finish, and let's say there's one piece that you don't like: a lyric you want to change, or you want the key to change on a certain part of the song, or whatever it may be, it just doesn't sound right to you as you listen. You now can go into Suno (we'll go over this a little bit later), highlight a section, and say: change this specifically. It's still hit or miss, but they've gotten so much farther

(13:52):
in the last year in terms of being able to edit, and that fine-tuned control is what I'm getting at here.
I have a slightly different approach, because I basically take the music offline and then use offline tools like Ultimate Vocal Remover, or UVR5, to give myself stems, and then you really can get creative in digital audio workstations with that

(14:12):
stuff, you know; you can EQ, and there's just all kinds of techniques you can use for editing the music once you take it offline. But I think most people are not going to want to download the song and spend two hours tweaking it like most musicians will, you know what I mean? But it's just becoming so much easier within those tools to get that

(14:32):
process done, like making those changes.
Stable Audio Open is actually really, really great for this too, because you're not really generating full songs with Stable Audio; it's more so pieces of sound that you can then layer yourself. So that's another approach you can take to editing. But yeah, for the most part, for beginners, or people who just want a light, easy method for being able to

(14:53):
edit, Suno has built this in. I don't have anywhere near as much experience with this in Udio, but Udio is also building that out. And Riffusion was really surprising to me, because the last couple of days was the first time I'd used Riffusion, and they also have basically caught up to where Suno is in terms of editing within the service itself. So they're all making it easier to get

(15:16):
that done.
Nice. And I know with custom mode you can put in your own lyrics and whatnot. Do you know, are there keywords? For example, could I trigger it being a bridge, so things slightly change? Or could I say, you know, insert a guitar solo here? Does it have that level of control?
Yes. And there are people from the Suno and Udio communities, like in their Discords, that I call prompt masters. I actually have a good example

(15:37):
of this. One guy, I think his name is pronounced Elgarrath, if I'm not mistaken, is incredible in terms of being able to tell the model to craft music a certain way. With almost anything he produces on his page, if you go look at how he actually prompted the lyrics section, it's so informative. I love going to

(15:58):
look at his creations, the generations. Without playing the song... it makes more sense if we can listen to it and you can hear it at the same time, but his prompting, the way he crafts the sections of the song via the lyrics, is insane to me. This is one example, and the other example I have... this one was really cool too, because in the lyrics you

(16:20):
typically just put song sections, like you say the bridge, the hook, the chorus, but he put "woman on the phone" and then "man on the phone," and the model actually generated that, like them saying those pieces.
There's another Udio song that does this really well, in terms of an intro. I don't know if you're a fan of Bruno Mars, but it seems like he does this at the beginning of his songs a lot, and it was like the perfect Bruno Mars intro to

(16:43):
a song. Like, they did it perfectly, and this is model generated. You know what I mean, they just type in the lyrics section and the model is able to do that. But these are the best examples I can think of, of people making the most of the lyrics section when it comes to directing the model. There are a couple of other people like this in the community, but man, guys like this go above and

(17:04):
beyond in terms of what's possible with the lyrics section.
Yeah. I've actually never seen people stack tags to really customize a section like that. This is super cool.
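For readers who haven't seen the convention being discussed: these services accept bracketed section tags mixed into the lyrics box, and the "prompt master" trick is stacking custom tags beyond the standard verse/chorus set. A hypothetical example (the exact tags honored vary by service and model version, so treat this as illustrative, not a spec):

```
[Intro: soft piano]

[Verse 1]
Woke up with a melody stuck in my head

[Woman on the phone]
Hey, did you hear the new track yet?

[Guitar Solo]

[Chorus: crowd sings along]
Turn it up, let the whole room know

[Outro: fade out]
```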
Yeah. I think that's why I need to go compile that list. This has kind of been me just being a geek and going to look at different people's profiles; like, you see a person talking a lot in Discord and you're like, let me click on his profile and

(17:25):
go see what he's generated. But there's definitely some prompt-master-level stuff that I never would have thought of. And I've been writing music my entire life; it's been 20 years since I wrote my first song, and I would have never thought to craft it this way, you know what I mean? When speaking to a model, I've done a lot of cool things with lyrics and generations, but nothing like this, until I saw stuff like

(17:46):
this from people in the community. So there's definitely a lot of cool stuff you can do with the lyrics section.
Do you ever use a model to help you with the lyrics? Me not being a music writer, I'll sometimes just take whatever's generated in Suno right away. Sometimes I'll go to Claude, try to tweak it a bit, and then shove that in there. But it's because I don't have the background, right? So I'm just like, I need something that covers this point, this point, this point; make it funny or make it sad.

(18:08):
But yeah. Do you use anything? Have you noticed which models are better?
Yes, I think Claude models in general are just better at being in touch with human speech, just in general, for code, for everything. So I do think they're better at creating lyrics. When these models first came out, or when I was crafting the first workshop, the models just weren't great at creating lyrics.

(18:29):
Let me say this first: I also have a bias, because I've been writing lyrics for a long time, and I think there's a certain human touch, or personal message, that I'm always trying to put in lyrics. So I'm probably not the best resource for which models are best, because I have a bad habit of just doing that part myself, especially when I'm crafting my own music. But I do think Claude,

(18:53):
especially 3.7, is overzealous, with code and with lyrics, but it's really, really good at helping you bring home points you're trying to get across. Like, if you have an overall message, or some subliminal themes that you want to put in the music, it's good at weaving them into an already existing song, if that makes sense.
The one thing I can also say you have to be careful with

(19:13):
is that the models, and this has not changed, because they count by tokens, are not great at counting syllables, and that's what really, really matters for making sure your lyrics are in time with the music. So the models are just not great at that, unfortunately. But I have seen

(19:37):
other people in the community trying to create datasets to fine-tune models so they recognize syllables instead. But I haven't seen anything from any major research labs that's super useful in that area. So you just kind of have to hope Claude gets it right, and then, you know, take out a syllable or two and make the adjustments.
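The syllable problem he describes comes from tokenizers splitting text into subword chunks rather than syllables, so checking the count yourself is the reliable fallback. A rough vowel-group heuristic in Python (my own sketch, not a tool mentioned in the episode); it miscounts some English words, but it is close enough to flag a line that is obviously too long:

```python
import re

def count_syllables(word):
    """Approximate syllables as runs of consecutive vowels,
    discounting a trailing silent 'e' (e.g. 'time' -> 1)."""
    w = word.lower()
    groups = re.findall(r"[aeiouy]+", w)
    n = len(groups)
    if w.endswith("e") and not w.endswith(("le", "ee")) and n > 1:
        n -= 1  # silent final 'e'
    return max(1, n)

def line_syllables(line):
    """Total syllable estimate for one lyric line."""
    return sum(count_syllables(w) for w in re.findall(r"[a-zA-Z']+", line))
```

Running every generated line through a check like this, then asking the model to shorten the outliers, is one way to "take out a syllable or two" systematically.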

(19:58):
But they are good for weaving in messaging.
I can't say I've noticed that much; for the joke songs you're sending to a friend, they do a pretty good job, even if it's the incorrect number of syllables, of kind of making it work, or dragging something out of it. It just depends on what level of perfection you're looking for. Would you mind giving us a little demo, walking through how you actually go through the process?
Yeah, yeah, absolutely. So I put up all of the services that are easily

(20:21):
accessible to folks here. Suno is where we got started, or how I typically get started. Oh, the one thing I was actually going to do, let me make sure I have that put up as well: I had in my slides a couple of use cases.
Nice. And I wanted you to help me, like, pick. I thought it would be cool to make it a Tool Use song.

(20:42):
I do like that. I think that would be pretty cool, but you can also use them for education. I think of AI-generated music like this: these are songs that otherwise would not exist. So why not go crazy? This is stuff that people just wouldn't take the time to make manually as a human. So, you know, the

(21:02):
floor is open, whatever direction you think we should take, and then I'll show you my process for creating.
Nice. Yeah, a quick note on that too. I definitely see people's concern around the ethics behind AI-generated music versus real artists, but you're absolutely right: the counterfactual isn't "I will choose a human or an AI to make it." It's "I'm not going to make this song a real thing" versus, with AI, now there's a thing that exists. I love the idea of a Tool Use

(21:24):
song. So let's do that, something about, you know, teaching people about AI tools they can use.
OK, awesome. So: help me write a song about Tool Use. So while Claude's doing that, we'll go ahead and use our regular old normal human prompts, and then see what Claude's able to come up

(21:45):
with. Oh, actually, while I'm at it, let's put a couple of things to work for us. Perplexity is another service I like to use, because I think models are unfortunately better at making prompts than humans are, right? And I think they're only going to get better; the models maybe know how to talk to models better. So this is another fun thing,

(22:09):
but those two prompts should have us on our way. And then I'll compare that. One thing I love to do is take the same exact prompt and put it in multiple services to see which song comes out the best.
Totally.
So I will also take that prompt and say, I'm going to edit this down, because I think it's fairly verbose, and actually the core of

(22:32):
it is here: it should be fun. So I'm thinking like 90 BPM first. Let's see. Starting with the hook first is not something I typically do, but I'll try it.
OK, nice. And even like the BPM, just as this generation goes through, music theory and whatnot: where do you think

(22:53):
people, if they want to get a little more creative, should explore first? Is the beats per minute something people should really start paying attention to, and hear how it affects songs, or is there anything else like that?
Yeah. So I've had to do this as a human, and I think my understanding of music increased a lot when I started to. And I've been doing this since I was a kid, but I didn't know what I was actually doing. Now, as an adult, I can

(23:14):
almost explain it. But yeah, tapping the BPM: there's a service I love to use where, when you hear a song, you can tap and it'll tell you the BPM. There are all kinds of services online that will analyze a song and tell you the BPM, but with this you can get it done in like five seconds, right? You can kind of get

(23:35):
the general idea of what a BPM is. And if you pay attention to the BPM, like, the core music that I love is typically around 60 BPM.
Nice.
And I think you start to recognize patterns as you listen to the music you love, and you're like, hey, I wonder what BPM that is. You can go Google it if it's a popular song, but sometimes, since I listen to a lot of underground music,

(23:56):
So I'll have to go tap the BPM manually. And I've just recognized so many patterns over the years of what I really, really like, or the mood I'm trying to be in, and what BPM correlates with that.
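The tap-tempo trick he describes is just arithmetic over the gaps between taps: BPM is 60 divided by the average interval in seconds. A minimal sketch (the function name is mine, not one of the services mentioned):

```python
def bpm_from_taps(tap_times):
    """Estimate tempo from tap timestamps in seconds:
    60 divided by the average gap between consecutive taps."""
    if len(tap_times) < 2:
        raise ValueError("need at least two taps")
    gaps = [b - a for a, b in zip(tap_times, tap_times[1:])]
    return 60.0 / (sum(gaps) / len(gaps))
```

Tapping on every beat at 0.0, 0.5, 1.0 and 1.5 seconds, for example, works out to 120 BPM.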
So yeah, I think that's something that's really, really important to pay attention to. I think key is another thing that's also important, and chords, but

(24:17):
it's not that they're more technical, it's just a lot harder to analyze those things compared to BPM. But BPM will get you very, very far, in my opinion, in terms of music understanding and prompting.
Totally. And even when I listen to certain electronic music, I'm like, how would I describe that? What type of drum is actually going on in the background? It's trying to analyze the music I'm listening

(24:38):
to more, because it's something I've just never been very into; music, in regards to analyzing and performing, has just been a change of emotional state for me. So I'm like, if it gets me fired up, that's what I want, check. But actually thinking about how it's doing this has been a whole new way to look at music.
Yes, yeah, absolutely. I think that's the cool thing about this whole process too. Maybe most musicians don't look at it this

(24:59):
way, but I think it forces you to understand the underlying mechanics of music, or at least think about them; maybe even if you don't understand them deeply, you think about them in a way you haven't before. And then that creeps into the prompts, it creeps into your creation. So to me it's just lowering the barrier to creating music, and it is a gateway where you start to think, I

(25:20):
wonder more about this, let me go look. And now we have the tools, the Perplexities and the Claudes, that can break down these complex subjects in a way that you understand, you know what I mean? So I know it gets a lot of hate, and I think it'll continue to get a lot of hate, but I think it is an amazing way for you to find what you really, really like about music, and it forces you to identify those things and get better at bringing them about.

(25:44):
Yep. So I'm going to make sure my volume's down so we don't get our ears blown out, and then I'm going to play this first generation from Suno.
"Get in and let it flow, the power you need, it starts to grow."
So I'll cut it there,

(26:08):
because I kind of wanted to see what the other version of it was. I have two thoughts on this.
I think it's very interesting that the model chose to put two verses next to each other. I think I would have done chorus, verse, chorus, verse, as opposed to putting another verse, and then you typically do chorus into a bridge. So that's another thing.
I think they make interesting choices when it

(26:31):
comes to the lyrics a lot of times, but that's something you can edit, and the editor on here is actually really, really cool. We should probably cover the editor. But is it alright if I cut them off early, like don't play the full song? We just kind of get the gist of it and see what the other services have created.
Yeah, 100%. I think with Suno you can link a song, so if you want to listen to more, we can always put it in the

(26:53):
description.
OK. Awesome, awesome, awesome. That was funny.

(27:31):
OK, before it gets into the verse: I like that one way better than the first one. I like that one way better.
How I would typically approach this is make some changes if I see fit. And because it's a song that I think you could potentially use for like a Tool Use intro or

(27:54):
whatever, I also would probably want to shorten it, so we can use this crop function.
It takes a while to get the song loaded in, but yes.
So essentially, replace section is where you can highlight. I think they lowered the minimum recently to maybe 3 seconds or so, but you can highlight any, let's just

(28:17):
say 10-second section and say what you want to change about that section of music, whether it be a change of lyrics, or, let's say, I don't want the piano there, I want the guitar there. You can instruct the model to make changes on your behalf, which is really, really cool.
It's very hit or miss. Sometimes you get exactly what you had in your head, and then other times you'll just

(28:37):
burn through credits generating section after section after
section. So that's one thing to be
careful about. But the replace section is probably the coolest tool in here.
Everything else is almost like standard digital audio workstation stuff. But the extend feature is also cool. That's also new, because the Suno models, the V4 model

(29:00):
specifically (the 3.5 model was like this as well), but the V4 model is already creating full, you know, 3 to sometimes 5 minute songs, 4 minute songs.
I don't use the extend section a lot unless I'm trying to extend, like, let's say I wanted the chorus to be longer, or the verse to be longer. But typically you want to crop more than anything, so I don't use this section as much.

(29:23):
And the remove, I think, is very specifically about instruments, if I'm not mistaken. And fade out is a new feature that they added recently. The model already does a good job at this; I don't think this fade out section is all that useful either. But the fade out is for, let's say, the intro.

(29:44):
The chorus and verse 1 are perfect for a Tool Use intro, for a Tool Use theme song. You can now take verse 2 and fade it out, and instead of having this abrupt cut, the song would naturally fade out.
So it's useful for certain use cases, if you want to make the

(30:07):
song shorter, or the song that you've already generated ends abruptly and you want to fade it out. Again, the models have gotten a lot better at this recently, so the fade out section is just not all that useful, in my opinion. But yeah, this is how I think about the tools anyways.
Can we, OK, replace is not loaded. So let's say, like I was actually

(30:28):
gonna re-prompt for this, but let's say insert.
Do you have the ability to edit the lyrics at this point, or is it just the segments of the song?
Yeah, replace section is where you can edit the lyrics. Like if I go in and say generate replacements, these are technically entirely new generations.

(30:48):
So number one, it uses that amount of credits, and number two, it kind of also crowds up the library as you're creating music in Suno. One thing that they've also implemented within the last few weeks is a workspace section. We didn't go over that because we haven't gone to look at my library quite yet, or gotten into that part of it. But essentially, actually, while

(31:09):
it's generating, it's already done. But just so it makes a little bit more sense here: you can create workspaces now, because typically everything was in the same workspace, and if you generate a lot of music, this can get a little bit chaotic, if that makes sense. So you can make a workspace specifically for, let's say, Tool Use Live.

(31:32):
And then you can put in not only the original generations, but your edits: the replacements that you're doing, the fade outs, the swaps of one section with another. All of those brand new generations can now go into the same workspace.

(31:59):
Yeah. So as you can see there, I finally replaced the word that we were trying to replace. So you can definitely edit lyrics.
And let's say that you didn't like the drums in that section and you wanted the drums to do a breakdown. You also can, let's see, talk about the styles that you want to include or exclude. So again, hit or miss, but

(32:20):
I'm going to attempt to, how about this, let's add cymbals. It may not fit, it'll probably sound out of place, but I'm going to try and add some cymbals in this chorus section.
What's really cool about the creativity here is that you can try anything, and you never know what's going to stick. And I also like that the replacements are generated separately; then you can choose to apply one. So you can generate, you know, 10 variations and be like, oh,

(32:41):
that's the one. Pull it into the original song without having to redo the whole song 10 times.
I wonder, as far as timing goes, I have where the crashes would go in my head, like if I had a digital audio workstation open. I was like, OK, let me put the crashes here. But let's see.
Plug it in and let it flow. The power you need, it starts

(33:04):
to grow. Speak your mind, let curiosity
fly. Teaching the world to AI.
OK, that was a crash. OK, I can't complain.
It got the job done. I feel like I don't like the
drums as much now, so I'm probably not going to apply that

(33:25):
replacement. I feel like the original drums are much better, but this is just an example of how you can use this tool and how creative you can be with it.
Love that. Absolutely.
It's come a long way since last year, because the editor wasn't even in, you know, last year when I was doing the workshop. It has its kinks to work out and lots of bugs from time to time, but it's a very

(33:47):
cool tool for being able to edit generations. I can't say that enough.
Yeah. And you had mentioned burning through credits. I've seen your account, you've got 10K. Seems like plenty. What would be a typical monthly usage for you, or for someone who takes it a little bit less intensely?
Yes.
So they have a couple of different plans for Suno, and I would say the $10 a month plan is probably enough. Like,

(34:09):
I've used the $10 a month plan for definitely more than a year, and even on that plan I've never run out of credits. So it's more than enough.
And is the only difference credits, or do you get access to premier models or anything like that if you go up?
Yeah.
So for the premier plan, you're typically, I think, considered a beta user, and you do get access to some of the

(34:29):
tools. I think you can provide feedback to the team as well. There's also, don't quote me on this, but it's either Suno or Udio that gives you a badge in Discord, so you get different channels or whatever.
Yeah, but that's the gist of my creation with Suno, as you can kind of see. We come up with some general ideas and use a model to kind of

(34:49):
flesh out the prompts. I think you just have so much more control when you add the lyrics. But it does take a little bit more; it's a little more intensive than, you know, I want to make a song for my dad trash talking the opposing team. That's quick and easy.
You can throw in a prompt, like you could prompt it in the

(35:11):
span of a couple minutes. You could probably prompt it 10 times and finally get close enough to something that's easy to send off. As you get a little bit more granular, and again, you want that fine tuning, that control, I think you really are just better off putting your own lyrics in, kind of having that control from the beginning when generating new music.
So that's my opinion on Suno, and really for Udio and

(35:32):
Riffusion as well. I see it's much the same: starting with lyrics and the BPM. And again, it's a little bit more nerdy when it comes to the music stuff, knowing what technical terms to use. But starting with those things gives you a lot more control when it comes to, you know, generating songs.
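To make that lyrics-first approach concrete, here is a rough sketch of what such a prompt could look like. The layout and field names are assumptions for illustration only, not an official format for Suno, Udio, or Riffusion; the lyric lines are the ones generated earlier in this episode:

```text
Style: upbeat electronic pop, 120 BPM, driving drums, male lead vocal

[Verse 1]
Plug it in and let it flow
The power you need, it starts to grow

[Chorus]
Speak your mind, let curiosity fly
Teaching the world to AI
```

Starting from a sheet like this, rather than a one-line description, tends to give the finer control discussed here, because the tempo, style terms, and section tags all constrain what the model invents.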
Yeah. And before we move on to the next one: as soon as I see styles and whatnot,

(35:53):
I'm curious if you've personally done any weird crossovers, any blends that you thought were just experimental that actually worked. I tend to really stick to the core music that I know well, because I feel like I'm just not a great.
I have tried some weird stuff, I'll be honest, but it's not something that I would listen to on a regular basis. It's not something

(36:14):
that struck a chord with me personally, you know what I mean? But definitely, a certain part of my family is from Mexico; my wife is Mexican. So typically, if I'm going to play around, like if I want to get on her nerves, I'll combine my stuff with Mexican music, bachata, the different styles of Mexican music, and that

(36:36):
stuff comes out really, really interesting.
But yeah, it's, it's definitely experimental.
And I don't know, I guess it depends. Music is so subjective, right? Maybe there's an audience out there for it, it's just not me personally, you know what I mean?
But I have tried. I will say, for probably my craziest, I'd have to go back. I do so much more of this stuff

(36:57):
in Udio than Suno, just because I like the songwriting process in Udio. But my craziest crossover so far is probably when I tried to do movie music. I got into sync licensing a while back, which is essentially licensing your music for movies and TV. And I tried to do a combination of, I think it was movie

(37:21):
theater scores and reggae music, and that turned out super interesting.
It is a cool community. I will say the explore tabs can also give you ideas in terms of what other people are generating as well, so that's something to check out; that's how I found the prompt master people that I was talking about. But I will say that I don't

(37:43):
think it's quite as tight-knit of a community as, let's say, Midjourney, where they had everything kind of going in Discord. But there's definitely a community of Suno generators as well as Udio generators; they're much one and the same. But yeah, there's some wild ones here.
Yes: Bengali American Primitive. Nirbachsora Sonali.

(38:22):
But this one I was kind of talking about, with the, I don't know what you call it, fusions, crossovers, hybrids, where anything that you can think of creatively you can put into here and see what comes out the other side.
Yeah, definitely. I would say this is a much better resource than me for attempting crossovers, because some of the

(38:42):
Mandarin acid trance is crazy. The crime surf rock shell
sentimento. Yeah, some of these are crazy.
Saxophone barbershop. I can't imagine.

(39:02):
In a world of fading, shaping style, silver tongue saxophones jive and beguile.
OK, I'll take it. Hindi acoustic girl. Yeah, that's

(39:48):
super interesting. One cool thing, so one of my favorite parts about Suno: it was the first tool that I showed my dad, the first AI tool that just clicked with him. He took a lot of the songs he wrote back in the day, put them in, and was able to generate them. And now every family occasion comes with a song for him, which is really funny.
And we have a cousin out in Montreal who just recently had a kid,

(40:11):
the first one of that new generation. But their last name, Lammeroux, being a French name with English lyrics, Suno had a bit of trouble with it, and he just had to get creative with how you spell it and how you tweak it. So it's really interesting how all you have to do is go through trial and error, run a few different things, and tweak it to what you want to hear, and then you get cool things out of it.
That's actually the last slide, is like, I think you have to try this stuff. There's no

(40:32):
amount of tutorials that you could watch. It's a great introduction, don't get me wrong, but until you actually start to come in and tweak and play and see what's possible, it's very hard to explain.
I have a couple of friends; obviously I'm a musician, I have a bunch of musician friends, artists, producers on both sides. And almost all of them had these notions about what AI

(40:57):
music was until I had to force them to sit down and open the browser, literally sharing my screen like I am with you now, and I'm like, let's make something, let's see what this process is. And almost all of them had their minds blown when they saw what it actually was, compared to, oh, you know, what you hear in the news about AI music, that it's going to destroy the industry and it's this and that.

(41:18):
And it's like, yeah, I think it's much different than what people make it out to be if they have not tried these things extensively, you know what I mean?
So, yeah, very similar stories with my friends. It kind of clicked as soon as they start to generate stuff; they're like, wow. And then they're like, I would never take the time that it takes us as humans to create a full song about this topic, for this event, for this

(41:39):
company or whatever it may be. But now, if I can type it in and get it done in 15 to 30 seconds or whatever it may be, sure, I'll throw a couple prompts in there, you know what I mean, and see how it turns out. So I think that's what's really cool about these services anyways.
Yeah, I mean, it's one of those things. I use it to, say, generate group chat songs, something to just make the boys laugh, and it's great.

(42:00):
One use case I haven't actually seen used yet, but I guarantee we'll get soon: when you walk through the department store, they have that, you know, Top 40 radio music you kind of hear. I wouldn't be surprised if very soon they're going to start subliminally putting ads in there, where it'll be a custom song for Walmart or whatever, and it'll subtly mention the sales or things you've got to check out, and it feels as if you're going along not even paying

(42:22):
attention, and it's just kind of floating in. So for better or worse, I think there's going to be a lot more customization.
But also, music originally had centralized creation and centralized distribution, right? Like recording studio, and then record store or radio. Then with things like Spotify or Apple Music, all of a sudden you get decentralized creation with pretty much centralized

(42:44):
distribution. But now anyone can just make a song and send it to whoever. And it's been this unique path of things becoming more and more custom: it used to be that people generally liked the same music, and then people liked their little subgenres of music. Now it'll be songs that I like that no one else in the world has even heard, except for maybe my group of friends. It's just such an interesting progression, how we're getting more and more accustomed to whatever it is that we like.

(43:04):
Yep, absolutely. And I think we touched on that earlier. This is music that just would not exist; half of these combinations, probably more than half, have maybe not even been attempted by a human, you know what I mean? So it's just exploration, and again, possibilities that didn't exist before.
So yeah, I think it's net positive. I think maybe the way that these models are trained, and

(43:26):
some of the music that's being used to create the models, is where it gets into, you know, a moral gray area. But I try to reserve my opinions on that, my comments on that, because I just want to let it play out in the court system and see how it does. And then we can kind of see, you know, from there. But I think net positive

(43:49):
though, in my opinion.
Yeah, absolutely. Ben, was there anything else you wanted to show us? I mean, how do you feel about trying out some of the other services? Do you think you have time for it, or what?
Yeah. Let's just do a quick run
through, just to make sure people are familiar with how they can use the tools. Like you said, they're kind of better for shorter things, so maybe instead of a full song, we just generate a bit of output.
I did kind of want to break it down. I mean, one thing about it: I did the workshop last year at

(44:10):
the AI Engineer event. For more of a history or background, if you want to see live examples of how these things came about, or more concrete examples of all the different types and styles of AI music, I would say to refer back to that video, because foundationally I covered way more ground there than we did in this

(44:32):
interview, which is OK. I love the direction we took in this interview. But if you do want to see some more examples, it's a great resource: aietalk.com/music will get you not only the video, but also my original slides and the transcript, if you'd like to feed it to a model and ask questions. And the slides are a PDF on this page, and it'll take

(44:55):
you back here. But that's a nice resource, basically, that you can find. I'll also say the notes are basically all of my cited sources, Mike, as I kind of compiled all of the notes for the workshop; I put them all in this one location. I've also updated it: there were a couple of broken links, and I've also put in links to any of these services that we

(45:16):
talked about today, as well as the open source models. If you have the machine, if you have the hardware and you want to download them, this is also a great resource. So anything that we don't cover today, or that I maybe mentioned and you were wondering about, I just ask that you refer to the notes, and if there's something specific, please put it in the comments. That is my one call to action. We kind of veered away from the slides

(45:38):
and the notes, but all this stuff is here if you want it as a resource to go find.
I will also say, just because we mentioned some of the legality and the morality of AI generated music: I have all kinds of links here in terms of the policy that the government is kind of forming. I actually have some stuff to update here as well, because as recently as last month, the US Copyright Office came out and

(46:01):
made it a little bit more clear what their stance is on whether AI generated music can be copyrighted. We didn't get into the weeds of this stuff, but it's all here. I've kind of tried to compile a nice archive or repository of information if you are more interested in that side of things. That being said, I think we can go into Udio, unless you had any other comments about it.
Phenomenal

(46:22):
resource. We'll link it down below to make it real easy to find, because having a resource like that is super valuable.
Awesome, awesome, awesome. Udio also has an editing process. In my experience, it didn't, it doesn't quite. We'll see when the generation finishes; I'll show you some of the editing stuff that they have in here. I can say overall, much like Suno, it's under

(46:43):
construction, hit or miss whether or not you get the results you have in your head, but they're still cool creative tools that you can use.
Like you were alluding to earlier, there's just less development in the space, so it's a little slow to catch up. But there is one of those aspects where, when people can feel it and feel a connection to it, it just seems to resonate so much stronger. So I hope there's more innovation in the space, because

(47:05):
it's just something that I think people will get a lot of value out of.
Yeah, absolutely. I kind of want to, I mean, they let me do multiple generations at a time, so let me see if I can put the rock version in here.
Running on fumes, these gears

(47:29):
grind to a halt. In the chaos we find our way. We rise, shadows fading under darkened skies, tools in our hands to carve our own fate.
Yeah, even there, they're like skipping sections. We can try

(47:53):
again with lyrics, just to see if it turns out any better. And I'm going to try and format it how Udio likes it, minor edits here, and see what comes out.
One question I should ask the model about. I think maybe Suno

(48:14):
benefits from having the BPM as a part of the prompt.
That makes sense. It kind of went off on a tangent.
And then Riffusion is much the same, I guess. That's one thing I can say about all the tools: they all have their little quirks between them. But for the most part, it's this process.

(48:35):
It's typing in the general style or genre and sending that off, or general style and genre as well as lyrics, and seeing how it goes.
Artificial intelligence

(48:55):
revolution. Sitting in the dark with a blinking cursor, error messages flashing, couldn't get much worser. Then I found this podcast that changed my life.

(49:22):
Hey. That's, I'll take it, man.
Yeah. And just even alternating from the band to just the singer. Love it.
Yes, yeah, yeah. And I think that's more the,

(49:43):
like I said, I'll show you the prompt master guy. You can get really detailed. But as you put stuff in, whatever you put in brackets, I think, really steers the model. So you can have, like, a chorus, and then put, you know, choir, or backup singer, or whatever it may be, to kind of force that out of the generation, out of the model.
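As a sketch of the bracket convention described here, this hypothetical helper (not an official Suno or Udio API; the tag vocabulary is illustrative) assembles section-tagged lyrics before you paste them into a generator's custom-lyrics box:

```python
# Hypothetical helper for assembling bracket-tagged lyrics.
# The tags ([Verse 1], [Chorus: Choir], ...) are illustrative; each
# service has its own loosely documented tag vocabulary.

def tag_lyrics(sections):
    """sections: list of (tag, lines) pairs -> one tagged lyrics string."""
    blocks = []
    for tag, lines in sections:
        # Each section becomes "[Tag]" followed by its lyric lines.
        blocks.append(f"[{tag}]\n" + "\n".join(lines))
    # A blank line between sections keeps the structure easy to read.
    return "\n\n".join(blocks)

song = tag_lyrics([
    ("Verse 1", ["Plug it in and let it flow",
                 "The power you need, it starts to grow"]),
    ("Chorus: Choir", ["Speak your mind, let curiosity fly",
                       "Teaching the world to AI"]),
])
print(song)
```

Cues appended to a tag, like the choir or backup singer mentioned above, ride along inside the brackets, which is one way to nudge the model toward a particular delivery.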
But it's cool how sometimes it will like generate that stuff on

(50:04):
its own. I'll make sure I send you the links to all these, so you can have the editor, and, you know, for the audience, if they want to take a listen.
I think the one thing that comes to mind the most on this is, I would probably coax Claude to write the lyrics a little bit shorter, so they could fit this tempo better.

(50:26):
And then I would try a heavier integration of the words "tool use", like, we want that in the chorus or whatever. So this is me, you know, I really want to make a Tool Use theme song; those are the changes I would make. You know what, maybe offline I'll really put in a bunch of effort into making a better Tool Use song, and that's it. And then send that over, you know, a couple different

(50:47):
versions of it and see what you think.
Not going to say no. We'd love to put it up.
Yeah, I guess, again, my main message to the audience: for anyone that has their reservations about AI generated music, the one thing I have told my friends is that it's coming one way or another. I think it's just best to embrace it.

(51:08):
You know what I mean? The change is coming one way or another; AI is here to stay. Whether or not you think it's moral is one thing, but I would say embrace it. Come join us in the water, is what I would say. If you haven't yet and you love music, I would say to try some of the tools out. If you hate it, so be it. But I think it's

(51:29):
important to try it out at the very least.
Now, I would also say that I would not have gotten into music as much as I am now if it weren't for the barriers being lowered 20 years ago. I lived in Germany at the time, and I was able to go to our base exchange and buy an $8 microphone.

(51:49):
And I messed up my mom's computer trying to jam a card into the PCIe slot, you know what I mean? If it wasn't for those things being affordable, I just wouldn't have the access to music that I have now. So I want to pay it forward, and there might be other people that love music but may not have the time or the energy to put in the effort to learn an

(52:12):
instrument or learn how to write music from scratch, but they still can create music that they enjoy and they love. So that's kind of my overall message to everyone.
And for your audience specifically, man, I know you guys have a bunch of smart people that watch this podcast. I'd watched a few other episodes myself. And I think my call to action there is: when you are

(52:33):
creating music with AI, what difficulties are you running into? What do you have the most trouble with? Because, again, that just helps me. If you could leave that in the comments, I would love that as some feedback, so I can further tailor these talks and workshops. So that's, that's all, really.
Yeah, I mean, absolutely.

(52:55):
I encourage everyone to do the same. I loved your point about just diving in. I encourage everyone: yeah, make a song. Just find something that you're learning about right now, or something that you're interested in, and just make a song, you know? Then afterwards, go see some live music and support the artists that way. That way you can kind of balance both ways.
Dude, this was a lot of fun. Really appreciate you coming on. Before I let you go, how can the audience keep up

(53:16):
with you? Any other message for them?
I am on Twitter under YoungPhlo, so you can send me music there. It's just YOUNGPHLO underscore. And then look for the resources themselves: if you just go to aietalk.com/music, that's where you can find all of the extra information that we didn't touch on here in the interview.
Absolutely.

(53:37):
And in case everyone's curious, all the music in this episode is
AI generated. So if there's anything you like,
let us know. Yes, absolutely.