Understanding the role of artificial intelligence in conversation analytics

Matt Dixon, Ted McKenna

September 8, 2021

Here at Tethr, we did the original research around effort reduction, artificial intelligence, and conversation analytics nearly ten years ago. A lot has changed since then, and the discussion around the intersection of effort and AI has evolved well beyond the wildest dreams of anyone involved in those original studies. Today, we’ll take a more technical look at the role of artificial intelligence in conversation analytics and how we use AI to derive insights from your customer conversations. Then we’ll go one step further and share some roadblocks we’ve encountered and overcome.

So if you’ve ever wondered how, exactly, our AI-powered machine works—this one’s for you.

The art of decoding recorded conversations

The big question when it comes to deriving insights from customer conversations is this: How on earth do you make sense of a recorded conversation? Speech processing technologies such as transcription, automatic speech recognition, and natural language processing are all proposed answers to that difficult question. How do we take a wealth of unstructured data and actually make sense of it? Many folks have been recording their calls for years and have no way to use that information or get insights from it, because it turns out to be really hard to automate the decoding of a recorded conversation. So the first challenge for artificial intelligence in conversation analytics is applying structure to unstructured data.

Using artificial intelligence in conversation analytics to structure your unstructured data

At Tethr, we’ve arrived at an elegant solution for structuring that data with AI. Let’s break down how it works.

Let’s assume we’re starting with a phone call. (We can apply this technology to other mediums, but let’s assume it’s a phone call recording.)

Step 1: Recording accuracy. First, we take the recording – unstructured audio – and run a series of AI and machine learning processes against that audio recording. This cleans up the recording to make it more accurate and allows us to redact sensitive information where necessary.
Step 2: Transcription. Next, we run the cleaned-up call through a transcription engine. This allows us to turn that unstructured audio into unstructured text.
Step 3: Mine for insights. Finally, now that we have an accurate transcript of that call, we can apply many types of technology to that unstructured text to derive insights and meaning from it.

Once we reach step three and have that accurate transcript, we can start to see a lot of what is happening in the conversation. We can see which text belongs to the agent, which belongs to the customer, and so on. Determining what that information means is the next big question. And that’s where the magic really happens.
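To make the three steps above a little more concrete, here is a minimal Python sketch of the flow from audio to speaker-attributed text. Everything in it is an assumption made for illustration: the function names (clean_audio, transcribe, mine_insights, analyze_call), the Turn type, and the hard-coded transcript are hypothetical stand-ins, not Tethr's actual implementation or API.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Turn:
    speaker: str  # "agent" or "customer"
    text: str


def clean_audio(raw_audio: bytes) -> bytes:
    """Step 1 placeholder: audio cleanup and redaction would happen here."""
    return raw_audio


def transcribe(audio: bytes) -> list[Turn]:
    """Step 2 placeholder: a real system would call a transcription engine."""
    # Hard-coded turns stand in for engine output so the sketch runs.
    return [
        Turn("agent", "Thanks for calling, how can I help?"),
        Turn("customer", "I was charged twice this month."),
    ]


def mine_insights(turns: list[Turn]) -> dict[str, list[str]]:
    """Step 3 placeholder: group the text by speaker for later analysis."""
    by_speaker: dict[str, list[str]] = {}
    for turn in turns:
        by_speaker.setdefault(turn.speaker, []).append(turn.text)
    return by_speaker


def analyze_call(raw_audio: bytes) -> dict[str, list[str]]:
    """Chain the three steps: clean, transcribe, then mine."""
    return mine_insights(transcribe(clean_audio(raw_audio)))
```

Calling analyze_call(b"") returns a dictionary keyed by speaker, which is the "what text belongs to the agent, what to the customer" view described above. The interesting work, of course, lives inside the placeholders.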

Classifying that audio-based information

So we classify these bits of information into insights that make sense. We have a list of the concepts and events that we're looking for in a conversation, as well as our scores, like the Tethr Effort Index (TEI) and AIS. These scores allow us to assign meaning to those events. When a certain event or behavior happens in conjunction with a customer expressing a certain sentiment, we can draw a conclusion about what those two things really mean in combination, and what that means for the big outcomes we want to drive.
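As a rough illustration of pairing an event with a sentiment to reach a conclusion, consider the toy lookup below. It is only a sketch under stated assumptions: the labels, the Signal type, and the INTERPRETATIONS table are all hypothetical, and this is not how TEI or AIS is calculated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    event: str      # hypothetical event label, e.g. "transfer_offered"
    sentiment: str  # hypothetical sentiment label, e.g. "negative"


# Hypothetical lookup: (event, sentiment) -> interpretation.
INTERPRETATIONS = {
    ("transfer_offered", "negative"): "customer is unhappy about being transferred",
    ("transfer_offered", "neutral"): "routine handoff",
}


def interpret(signal: Signal) -> str:
    """Return the meaning of an event/sentiment pair, if one is defined."""
    return INTERPRETATIONS.get((signal.event, signal.sentiment), "no conclusion")
```

The point of the sketch is simply that the same event can mean different things depending on the sentiment that accompanies it, so neither signal is interpreted alone.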

That's a big part of what we do here at Tethr. We take that imperfect speech that still contains great insight and turn it into something you can take action on in a really meaningful and predictable way.

Now that we’ve covered the basics of how artificial intelligence in conversation analytics can be applied and how we use it, let’s talk about some common challenges. We’ll share some roadblocks many folks experience with this type of technology and how we’ve adapted to overcome them.

Challenge #1: Flawed transcriptions

A common issue when it comes to conversation analytics is flawed transcriptions. We've all seen phonetic transcripts, which used to be very prevalent in this industry, and they look like dictionaries. Those transcripts are full of partial words and don’t reflect the way people actually speak. You can get the gist of what’s happening, but the result is nearly indecipherable as a conversation. The best you can do is keyword spotting, which doesn’t help with context, so it does little to help you understand the broad contours of what's going right and wrong in that customer experience.

Solution: ASR technology

So, when faced with the issue of flawed transcription technologies, the market has adapted. These days, automatic speech recognition (ASR) is the gold standard for transcription, and for good reason. This technology delivers the entire conversation in full words, using AI to fill in the gaps that phonetic transcription leaves. Not only does this let us pinpoint the issue in that conversation; it also lends itself to more big-picture insights.

Challenge #2: Dealing with tone

Another common analytic method in our industry is tone analysis. Many tools use artificial intelligence to track the customer’s tone through modulation in the audio. They try to gauge whether the customer sounds upset. This is useful in certain circumstances, and there are definitely elements of tone that help us understand how the customer perceives effort. (We even use similar features as a part of TEI!) But ultimately, tone isn’t very specific. It doesn’t provide context; it just tells you how someone feels in that moment.

Solution: Utilizing context to drive insight

The issue with relying on tone is that it doesn’t give context. You only know that someone is upset, not why they’re upset. That's why our models aim at detecting context hidden in the underlying emotions articulated throughout the conversation. This is something that we’re able to do with Tethr. We can place those tone insights in a greater context, alongside the who, what, how, and why. We can tell you that the customer is upset, why they’re upset, and what the agent is doing (or not doing) to fix it. When it comes to making real business decisions, that kind of insight is absolutely critical.

Challenge #3: False positives and negatives

Another struggle in artificial intelligence and conversation analytics is the issue of false positives and false negatives. When you rely on keyword spotting, these errors can do a lot of damage to your results. Take the word “transfer” as an example. An agent could tell the customer, “Hey, I'm about to transfer you,” and the word “transfer” would hit. If you’re a contact center trying to understand the abilities of your team, you might count that as a negative occurrence of the word. But what if the call is with a bank, and “transfer” really refers to an account transfer? You want your machine to recognize those differences.
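Here is a small Python sketch of why plain keyword spotting falls into this trap. The spot_keyword helper and the bank utterance are made up for illustration; the sketch uses only Python's standard re module.

```python
import re


def spot_keyword(utterance: str, keyword: str) -> bool:
    """Return True if the keyword appears as a whole word, ignoring case."""
    pattern = rf"\b{re.escape(keyword)}\b"
    return re.search(pattern, utterance, re.IGNORECASE) is not None


# An agent handing the caller off to someone else.
call_center = "Hey, I'm about to transfer you."
# A hypothetical banking customer asking about moving money.
bank = "I'd like to make a transfer to my savings account."

# Keyword spotting cannot tell the two meanings apart: both utterances hit.
hits = [spot_keyword(u, "transfer") for u in (call_center, bank)]
```

Both entries in hits are True, even though only the first utterance describes a call transfer. A keyword alone carries no information about which meaning was intended.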

That’s just an example of a false positive. False negatives are worse, since they often indicate that you’re missing critical information. For example, let's say you're trying to get a sense of customer frustration. Customers articulate that in thousands of different ways. A customer might say, “You guys are the absolute worst.” Nowhere in that phrase do the words “frustration” or “upset” appear, but that's clearly a frustrated customer. Now, multiply that out at scale, and you realize how much you could be missing if you’re not catching false negatives. What's the damage done when there are all these potentially upset customers that you don't even know about?

Solution: Machine learning-based categories

We address this issue with constantly evolving machine learning-based categories. A category is a bucket of phrases and utterances that all express the same concept. And we’re constantly adding to those buckets. So if we treat “You guys are the worst” as one of 1000+ different ways a customer could express frustration, we would put that phrase in a bucket with dozens of others like these:

You are the worst.
You guys are awful.
I'm super angry.
I can't stand this.

All of those are examples of frustration, so we would call this the Frustration category. And we spend a lot of time auditing those categories to make sure they eliminate false positives and false negatives, so that they hit with a very high level of accuracy from company to company. At this point, we have over 1000 of these categories, and that number is growing every day.
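The bucket idea can be sketched in a few lines of Python. This is a deliberately simplified stand-in that uses normalized phrase matching; it is not the machine learning behind Tethr's categories, and the names FRUSTRATION, normalize, and matches_category are hypothetical.

```python
from __future__ import annotations

import re

# Phrases taken from the list above; a real category would hold many more.
FRUSTRATION = {
    "you are the worst",
    "you guys are awful",
    "i'm super angry",
    "i can't stand this",
}


def normalize(text: str) -> str:
    """Lowercase, straighten apostrophes, and drop other punctuation."""
    text = text.lower().replace("\u2019", "'")
    text = re.sub(r"[^a-z' ]+", " ", text)
    return " ".join(text.split())


def matches_category(utterance: str, phrases: set[str]) -> bool:
    """Return True if any phrase in the bucket occurs in the utterance."""
    normalized = normalize(utterance)
    return any(phrase in normalized for phrase in phrases)
```

With this sketch, matches_category("Honestly, you guys are awful.", FRUSTRATION) is True even though the word "frustration" never appears. A wording that is not yet in the bucket, such as "You guys are the absolute worst.", still slips through as a false negative, which is exactly why the buckets need constant additions and auditing.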

Artificial intelligence in conversation analytics is the future

This is just the beginning of the role of artificial intelligence in the conversation analytics space. As technology advances and research deepens, we know there will always be more on the horizon. We can’t wait to go deeper and gain greater insights into the voice of the customer.

So stay tuned: There’s definitely more to come.

To learn more about the Tethr Effort Index (TEI) and how we use AI to measure effort, be sure to check out our resource library. If you enjoyed this breakdown on artificial intelligence and conversation analytics, make sure to check out our recent podcast, Customer Effort: Through an AI Lens. And finally, if you’d like to see our conversation intelligence platform in action, be sure to request a demo to get a closer look.