Episode Transcript
Available transcripts are automatically generated. Complete accuracy is not guaranteed.
Speaker 1 (00:00):
This is Advice from a Call Center Geek, a weekly podcast with a focus on all things call center. We'll cover it all, from call center operations, hiring, culture, technology and education. We're here to give you actionable items to improve the quality of your and your customers' experience. This is an evolving industry with creative minds and ambitious people like this guy.
(00:21):
Not only is his passion call center operations, but he's our host.
He's the CEO of Expivia Interaction Marketing Group and the call center geek himself, Tom Laird.
Speaker 2 (00:30):
On the desktop version, right. And I said the desktop version just because that's where most everybody's at right now. If you're a developer, we have API connections to ChatGPT, even to Claude, and all that, and that's for another day, and I'll
(00:50):
do another podcast so we can get way more in depth. But this is for, you know, maybe that person who has 10 contact center agents, or wants to see if they can develop some type of QA platform internally using the resources of the $20 version, or even if you have ChatGPT 3.5, to use that as well.
I am open for questions, right? I really think that this will be much, much better, and much more fun, if any of you guys have questions on what I'm talking about. If there's something specific you want me to talk about, I
(01:12):
might be able to talk more than I'll show, but I will definitely talk through it.
I have all my prompts. I have everything right here, so if there's anything specific that you want, I am more than happy to help. The other thing is my microphone stinks, so I apologize for the sound. My Yeti mic won't plug into this new computer and I just didn't have time to play around with it. So there are some sound quality issues; things sound like I'm in a pool or underwater, all right.
(01:34):
So let's start at the beginning.
When we started AutoQA and we started experimenting in R&D, at the very beginning we were just using the desktop, and we would take a call transcript, upload that transcript into it, and then we just started asking questions. That's where we started. Then we developed, you know, kind of started learning more about prompting, right? I mean, prompting has only been around for about a year, year
(01:55):
and a half, right.
So we went to school on that. What are some of the things that help to make things more consistent and make things better, right? So the first thing is how we have this set up: we created a system prompt, which is basically the prompt that tells ChatGPT what it is and how we want the outputs.
(02:15):
You know, it kind of gives the instructions, right. And then we got into asking questions and prompting for those questions, right? So you know, there's really four main pieces to this, right: a system prompt, that we'll talk about; there's your form, or the questions that you're asking, with what we call the context, but it's basically the prompt for that question.
(02:36):
You have the full call transcript, and then, I guess it's kind of part of the system prompt, but the output format that we're looking for.
So the first thing is you pull up your ChatGPT. You know, and I can kind of give you this, I think this is pretty basic: you are the head of quality assurance for a contact center. You oversee answering questions for quality compliance, using
(02:57):
the provided scoring system to assign scores. You will examine the attached transcript to answer, right? Everybody can figure that out. That's not proprietary or anything.
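Written out as something you could paste straight into the chat window, that opening might look roughly like this (a sketch of the idea, not the exact AutoQA wording):

```
You are the head of quality assurance for a contact center.
You oversee answering questions for quality and compliance, using the
scoring system provided below to assign scores. You will examine the
attached call transcript to answer each question.
```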
But then you start to think about how you want this thing to look, right? So we're using, you know, different outputs. We are looking in our system prompt for things like the call type, which I don't think you can fully do on ChatGPT, but if you're using the
(03:22):
desktop version, maybe you can, you know, upload a couple different calls. One's a sales call, one's a retention call. But if you can kind of name those call types or explain on the desktop what a retention call is, it will pick that up when you hit enter on the transcript. So that's one output that you could play around with. Sentiment scores, right: so you can have your kind of "you're the head of quality assurance" and then you could
(03:43):
tell it, hey, I want to do customer and agent sentiment. And kind of some of the things that we're doing in our prompt for this, in our system prompt: this involves a nuanced examination of language, tone and the context presented in the dialogue; pay close attention to key indicators such as word choice, intensity of expressions,
(04:04):
any shift in mood over the course of the conversation, and there's much more to that. We define what is positive, we define what is negative, and we just kind of define sentiment. That's another thing that you can have right on your desktop, right? So again, you're the head of quality assurance; the next prompt is we want to do a sentiment score for this call, that we're doing this one call kind of thing. Then you can see how you can do this.
(04:25):
We want to do sentiment, all right, and there's a ton of prompts out there for sentiment and how to get it from positive, negative, neutral, using a transcript. I think I even have it on our blog post. We have a JSON output that we tell it to do as well, which you can do on the desktop, right, to kind of say, hey, this is what
(04:46):
customer sentiment is, this is what agent sentiment is. We also tell it, which I think is really cool, what we call the rationale. So we say things like, hey, you will explain the rationale for all the questions, regardless of the answer, right? So we tell ChatGPT to basically tell us why it scored, tell
(05:09):
Otto why it scored, how it scored. And you can make those things as robust or as small as you want, right? You can put, like, under 40 words. You can say, hey, pull the exact part of the transcript where this was said for your question, right? So you're layering this, right, for what is important in your contact center. And again, I'm doing this for somebody who has very little IT expertise.
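To make that JSON output and rationale idea concrete, here is a hedged illustration of the kind of shape you could ask for in the system prompt; the field names and values are hypothetical, not the actual AutoQA schema:

```json
{
  "call_type": "retention",
  "customer_sentiment": "negative",
  "agent_sentiment": "positive",
  "rationale": "Customer was frustrated by a billing error; the agent stayed calm, apologized, and resolved the charge on the call.",
  "transcript_evidence": "I'm sorry about that charge, let me fix it for you right now."
}
```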
But let's keep going back to this, right, building this kind of hamburger. Right, you have that:
(05:30):
you're the head of quality assurance, or the head of QA in a contact center. Let's talk about sentiment. Let's build a sentiment prompt. Let's talk about the rationale. We want to know why. You know, when ChatGPT scores something, what is the reason? Why did it say yes? Why did it say no? I think that's really, really important stuff that we have found to be really important in the output.
The other thing that I think is really important too is, again,
(05:55):
if you don't have any programming expertise, there are just certain things that you can ask for, right, for ChatGPT to give you back. You want to know the four things that the agent did well on the call, four things that the agent could have done better on the call, right? I think those are the core main things that we're kind of looking for, right?
So, right in our system prompt for AutoQA, which is this big,
(06:16):
giant, long thing. It doesn't have to be for you. You're just, again, you're asking, you're telling it who it is. Let's do a system sentiment prompt. Let's talk about the four things that the agent did well, four things that the agent struggled with. And then everybody's paying more for auto-summarization, right? Ask it to give you the full call summary there as well.
(06:39):
So on your kind of ChatGPT, your internal system prompt that you're utilizing, and this is something that you can use a personal GPT for, use that as your system prompt if you want to kind of look to scale this a little bit. It's going to be a crazy manual process and you're not going to have great outputs, but it's going to give you the general gist of what's happening on calls. We're looking for sentiment scores, the overall score of the
(07:05):
call, four things that the agent did well, four things that they kind of struggled with, the full call summary, and then we get to the actual questions.
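Before getting to the questions, the output section of a system prompt along those lines might be sketched something like this (illustrative wording only):

```
For every call transcript you are given, return:
1. The call type (sales, retention, support, etc.).
2. Customer sentiment and agent sentiment (positive, neutral, or negative), with a short rationale.
3. Four things the agent did well on the call.
4. Four things the agent could have done better.
5. A full call summary.
6. The scored evaluation form, with a rationale of under 40 words for every question.
```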
So how we found the best way to break this up is to tell ChatGPT, even on your desktop, um, in the system prompt, if you have one form, that the questions will be broken up into three sections.
(07:25):
The sections are greeting, um, etiquette, and closing information or sales information, whatever it is to you guys. You will tell it how you want it to score. Right, you will score a yes... I'm sorry, answer, we screwed that up. You'll answer: if a question is answered with a yes, score five points, or whatever it is. If a question is answered with a no, score zero
(07:48):
points. And you can score NA, which is no points at all, right? So kind of set your criteria: five points, zero points, or, for an NA, zero points but zero out of zero instead of zero out of five. So now you just told it how to score, right? So you don't really have to go through each question and tell it how many points it is, unless you have some, you know, this is worth 10 points, that's worth five points.
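A hedged example of how those scoring rules might be written into the system prompt (the point values are just the ones from this example):

```
Scoring rules:
- If a question is answered Yes, score 5 out of 5 points.
- If a question is answered No, score 0 out of 5 points.
- If a question is NA, score 0 out of 0 points so it does not count against the total.
- The overall score is points earned divided by points possible, excluding NA questions.
```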
(08:08):
And then this is where the fun comes in. This is where, literally, you know, that system prompt took us five to six months to really figure out: what we wanted to do from an output standpoint, how consistent we wanted it to be, getting the real outputs, how we wanted it to even look on a form and an evaluation.
And then we get to the questions, right, and that's where
(08:28):
the rest of the, you know, seven, eight months of work has come in. That's why the lawyers don't want me to totally give away all that stuff.
But some of the stuff is very basic, right? From a question like, to start: did the agent use the proper scripted greeting on the call opening? Right, that's question number one. And then say prompt, or put a P, put prompt, right, for question
(08:50):
number one, and then it is: the agent must say, thank you for calling XYZ Bank, how was your day, we appreciate your call. Whatever that is, right. And you can do things like it has to be exactly this, or it has to be somewhat like this. It will definitely pick up the nuances of that.
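So a single question and its prompt might be laid out something like this, using the XYZ Bank greeting from above as the example:

```
Section: Greeting
Question 1: Did the agent use the proper scripted greeting on the call opening? (Yes / No / NA)
Prompt: The agent must say something substantially similar to: "Thank you for calling
XYZ Bank. How is your day? We appreciate your call." It does not have to be word for
word, but the thank-you, the company name, and the appreciation statement should all
be present.
```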
So you just kind of do that, and you trial and error each of
(09:11):
your questions, right, on the desktop version. Right: question one, question two, question three. All right, this is the next section, because you already told it this is the etiquette section. All right, this is what empathy is. If you go on to my blog post, that is on the autoqa.com blog, there is an ultimate guide to auto QA, and I have all my prompts for all the basic call control, empathy, openings,
(09:34):
greetings, like all that stuff is there. So I don't really want to go over that again, but take those prompts, just copy and paste them right into your chat instance, right, and then hit enter. See how your output comes out.
If your output isn't in the format that you like, then go tweak that prompt. But once you get it to where you like it, again, I think that's where the cool personal GPTs can really come into play for some
(09:55):
basic stuff, where you can basically upload, especially, your system prompt, and then maybe just have your questions that you have already done, and then you're just kind of uploading calls one at a time. And you've got to wait for it, but you can still do that and you can still score the calls, and you're going to get some really good, really good outputs with
(10:17):
that as well.
So that's how we started it, right. We said, can we build a cool system prompt? Can we build the questions? We built all this stuff out first, and the answer was yes. So that was our proof of concept. So that's what you're kind of doing right now. And then we just took that and took it to the enterprise level, with all the APIs that we were doing for getting a transcript
(10:40):
and then getting ChatGPT and Claude 3 through the APIs, to connect to those guys. So it's all the same stuff, it's just one's quicker than the other.
There's a little bit more time and effort that went into what we needed to do from a prompting standpoint, but you can definitely figure it out and keep trial-and-erroring a lot of the stuff.
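If you do want to move from the desktop to the API side mentioned here, a minimal sketch in Python could look like the snippet below. It assumes the official openai package, an OPENAI_API_KEY environment variable, and local system_prompt.txt and transcript.txt files; it is an illustration of the idea, not the actual AutoQA pipeline.

```python
# Minimal sketch: score one call transcript against a QA system prompt via the OpenAI API.
# Assumes: `pip install openai`, OPENAI_API_KEY set in the environment,
# and system_prompt.txt / transcript.txt saved next to this script.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

with open("system_prompt.txt") as f:
    system_prompt = f.read()

with open("transcript.txt") as f:
    transcript = f.read()

response = client.chat.completions.create(
    model="gpt-4o",  # or whichever model you have access to
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": "Score the following call transcript:\n\n" + transcript},
    ],
)

# The scored evaluation: sentiment, rationale, summary, question scores, etc.
print(response.choices[0].message.content)
```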
Some tips that I have for you on prompting. Um, we have found
(11:02):
that "imperative" is a very strong word. So if you say it is imperative that you score this this way, it is imperative that you look for this in the transcript, um, AutoQA will do that. Another tip is to tell it to think. If you have something a little bit more complicated that it needs to take its time on, tell it to think, um, in your prompt, before you actually get to answering that
(11:22):
question.
Um, we've found, again, the structure of it is important, right? So for smaller-type prompts, having a paragraph is fine, right. But if you get to a longer prompt, having it in order, having it almost in a checklist, definitely gave us more consistent and much better results.
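Putting those tips together (the word imperative, telling it to think, and a checklist structure), a longer question prompt might be sketched roughly like this:

```
Question: Did the agent show empathy when the customer described a problem? (Yes / No / NA)
Prompt: It is imperative that you score this question using the checklist below.
Think through each step before you answer.
1. Find every point in the transcript where the customer expressed a struggle or pain point.
2. For each one, check whether the agent acknowledged it and responded in a way that
   improved the call.
3. Answer Yes only if the agent did this for every pain point; otherwise answer No.
4. Quote the exact part of the transcript that supports your answer, in under 40 words.
```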
The other thing is the models do a great job with keywords and
(11:47):
finding specific things, but don't think that way. Think much bigger. If you look at the empathy prompt that I have on the blog post, it's basically saying: look for any instance where there was a struggle with the customer, where they had a pain point, and then make sure that the agent responded to that in a way that made the call better, made it better for the agent or the customer.
So those are some things that I think you can really play with.
(12:09):
And this is a great exercise to not only get better on the prompting aspect of AI, but to start to really look deeper into your calls. Right, and it's not that hard, it's not that difficult. Doing it at scale and having reporting and doing a lot of different things is a little bit more difficult, right. But at its core, right, anybody who's really using a large language model pretty
(12:32):
much is building something on top of it and using that as the brain so that a human being doesn't have to do things.
So when I hear, you know, all AutoQA is doing is writing prompts, that's true, but good luck trying to figure it out, to get really consistent, better-than-human data. I just talked to Mark Bernstein from Balto. They have, I believe, he said, 96 or 97.5% accuracy as
(12:56):
what their standard is, compared to a human, which is two, two and a half percent better than we can score. So to get to that level, and I think we are right there with Otto as well, it takes a lot of work to figure out what the consistencies of this thing are. But at its core, it's just: how good at prompting are you?
(13:17):
How well do you really know the data that you have?
Here's the other thing too, think about this, right: you can start to use this thing to not just QA your agents but to start to ask questions of the data. Upload five transcripts and start questioning it, start asking, hey, what percentage of these customers are...? You can upload more than five. Let's say you upload 15 chat transcripts, call transcripts,
(13:37):
and start querying things off of them. What percentage of the customers who called were irritated? How many specific customers thought the product was too expensive? And you can start to kind of build that, see how many you can get, and start to learn more off of your data as well.
You can do a ton of that stuff with your personal GPTs, uploading
(13:59):
a ton of different data and then querying off that data.
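The queries themselves are just plain prompts once the transcripts are uploaded; for example, something along these lines:

```
Across the 15 call transcripts I uploaded:
- What percentage of the customers were irritated at any point in the call?
- How many customers said the product was too expensive? List which calls.
- What were the three most common reasons customers gave for calling?
```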
So there's so much you can do internally to build some things out yourself. Even a very basic team can do some really cool things with the technology of today. But again, this is if you're that person that just wants to kind of play around and try to figure out some really cool ways to just start to score calls, and again I think that this bridges to
(14:20):
other opportunities within the contact center as well, because a lot of the things kind of match up.
But I think for the core of this, you know, let's start to build your system prompt. Find out what things you want to know, right? Do you want to do sentiment? Do you want to do, um, things that the agent could have done better, things that the agent struggled with? Do you want to summarize the call? Um, you know, making sure that we're giving the answers back of why it scored,
(14:42):
and have your questions. Have your, uh, your personal prompts for each question, and start playing with it, and I think you'll get some really good results.
And then take it to the next step: get your personal GPTs involved. Do it a little bit more at scale, um, and then, you know, if you want to see a demo of what we're doing at totally at scale, with API connectivity built to get transcripts from an audio standpoint and also to multiple
(15:07):
large language models, I'd be more than happy to kind of show you that and how that works.
And then, you know, using specific outputs to create forms and then reporting on top of that, right, it's a process to do. But I think at its core, you know, anybody can start to play with this and start to get some results, at least on a smaller scale, for calls. So if there are no questions on that, you know, as you guys are
(15:29):
listening to this, please, please DM me. If you have anything that you want specific answers to, I'd be more than happy to help. But I hope that was kind of cool and kind of gives you a little bit of insight into how this works.