The limitations of using text analytics to analyze conversations

Ted McKenna

We are often asked about the difference between traditional text analytics tools and the type of conversation analytics Tethr provides. The thinking goes: if audio recordings can be transcribed into text, can’t we just throw that text into the same tools we’ve successfully used in the past to analyze survey verbatims or customer reviews? Won’t that be sufficient for understanding what is happening in our service and support channels?

As shown in the table below, there are some obvious functional reasons this often doesn’t work well. First, the data sets involved in customer conversations tend to be massive, and most text analytics engines (even enterprise-grade tools capable of true scale in aggregate) are built, at the unit level, to understand smaller pieces of text (think a tweet vs. a long-form essay). Second, most text analytics engines are built for one-sided feedback channels (what we sometimes refer to as “mono”), whereas conversation analytics is purpose-built for deriving insight from two-sided exchanges between a customer and the agent or rep (i.e., “stereo” data).

Text Analytics vs Conversation Analytics

But on a more fundamental level, the main difference lies in how data is classified and organized. To illustrate, let’s use an example with which we’re all familiar: forests and oceans.

See the ocean for the trees

Imagine for a moment you are a park ranger in charge of managing a forest. It’s your job to keep the forest healthy and all of the trees well maintained. How do you know you’re doing well? You probably start by counting all of the trees, grouping them by species. Then you track over time how many trees are living, how many new trees are planted, and maybe the rate of growth.

It’s not an easy job, and there are plenty of complicating factors like climate, weather patterns, and a host of animals to consider. But it is all easily observed and feels reasonably natural. After all, you can see and count each tree (even in large forests, though that is more time consuming), and there are well-established and identifiable taxonomies you can rely on: an oak tree has oak leaves, an elm has specific seeds, etc.

Text analytics involves counting keywords

More simply, counting is a relatively straightforward analytic approach. And many traditional text analytics tools aim to bring structure to a qualitative data set by essentially the same process: counting word frequency. The best tools will still add a lot of value in organizing words into hierarchies and topics, serving up trends and patterns. But it’s still primarily about letters and words.
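To make the counting idea concrete, here is a minimal Python sketch of word-frequency counting. It is an illustration only, not how any particular text analytics product works; the `word_counts` helper and the sample verbatims are invented for this example.

```python
from collections import Counter
import re

def word_counts(texts):
    """Count how often each lowercase word appears across a list of texts."""
    counts = Counter()
    for text in texts:
        # Split on anything that is not a letter or apostrophe.
        counts.update(re.findall(r"[a-z']+", text.lower()))
    return counts

# Hypothetical survey verbatims, invented for this example.
reviews = [
    "Billing was confusing and the billing page was slow",
    "Great support, but billing took too long",
]

counts = word_counts(reviews)
print(counts.most_common(3))
```

On short, one-sided feedback like this, the most frequent word ("billing") is a reasonable hint at the topic, which is why counting works well for survey verbatims and reviews.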

Now let’s change your job. Instead of managing a forest, you’re now an oceanographer tasked with keeping all the plants in a particular sea healthy. The sea is so big you couldn’t possibly observe it all. Some parts involve very intricate ecosystems of their own, like those surrounding coral reefs. In fact, oceans are so unexplored (I’ve seen estimates that as much as 80% remains unexplored) that each observation is much harder to classify, precisely because it’s new: it is just as likely to be a new species, or a new version of an existing species, that you’ve never encountered before.

As such, counting, as an analytic method, can only get you so far. Rather than bucketing all information into predictable and known groups, oceans require processing new information in ways that allow you to understand the relationship it has with what is already known. How does this fit; where does it fit; how should I interpret it; does this new information change any previous interpretations?

Oceans of unstructured data require processing new information in ways that allow you to understand the relationship it has with what is already known

Human speech is complex and complicates analytics

Analyzing the ocean of unstructured data found in conversation requires a similar approach. Human speech is complex and often … unnatural … interspersed with any number of conversational shortcuts (resulting, in effect, in situations like “you get my point” but without explicit articulation). With at least two parties involved, interpretation can change based on a given context. On top of that, converting audio to text often results in mistranscriptions (even from the most accurate ASRs in the world), and asynchronous messaging such as chat and email similarly produces misspelled words and shorthand descriptions that make interpretation harder.

Most text analytics tools rely on natural language processing techniques that focus primarily on detecting specific words or, in some cases, basic word groups. This makes sense when counting works. But when aimed at human interactions, you end up with a word cloud full of “and”s, “uh”s, and “what?”s. Knowing a lot of customers said the word “what?” might be interesting, but what the heck does it mean? Why did they say it more this month than last? Moreover, analyzing information using word counts often fails to capture the broader context of utterances within conversations and, as a result, can be limited to relatively few concepts.
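The filler-word problem is easy to reproduce. The sketch below applies plain word counting to a small two-sided transcript; the transcript, the `(speaker, utterance)` representation, and the `counts_by_speaker` helper are all assumptions made up for illustration.

```python
from collections import Counter
import re

# Hypothetical two-sided ("stereo") transcript as (speaker, utterance) turns.
transcript = [
    ("agent", "Thanks for calling, uh, how can I help?"),
    ("customer", "Uh, what? I, uh, already called about this"),
    ("agent", "I see, and what was the issue?"),
    ("customer", "What? And I have to explain it again?"),
]

def counts_by_speaker(turns):
    """Return one word-frequency Counter per speaker."""
    by_speaker = {}
    for speaker, utterance in turns:
        words = re.findall(r"[a-z']+", utterance.lower())
        by_speaker.setdefault(speaker, Counter()).update(words)
    return by_speaker

by_speaker = counts_by_speaker(transcript)
print(by_speaker["customer"].most_common(3))
```

The customer's most frequent words here are "uh", "what", and "i". The signal that actually matters (the customer has called before and is being asked to repeat themselves) is spread across two turns and shows up in no single word count.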

Beyond counting – scoring concepts for meaning

Tethr’s approach to conversation analytics captures more abstract concepts that are expressed in more complex utterances and combinations of utterances within a conversation. It’s about numbers and concepts. The result is like the difference between detecting a specific sentence and detecting the idea that sentence conveys in every possible way it can be said, in any language.


The most powerful way to classify concepts, rather than counting words, is to use scores. Scoring full interactions pairs concepts with meaning, building an understanding of relationships to larger business goals such as customer loyalty or agent performance.

Tethr + text analytics

At Tethr, we score data on three levels:

  1. Score concepts (categories) in conversation data: This is mapping unstructured data to concepts. The Tethr library categories reflect years of training a machine to recognize whole concepts in syntax, as represented by compilations of phrases and utterances. We measure the strength of the relationship between phrase and concept using scores that range from 0 to 1 (with 1 being high) on two parameters: a) precision, which reflects possible false positives; and b) recall, which reflects possible false negatives. Give Tethr any new phrase (any given utterance) and we can quantify within seconds the strength of its relationship to any one of hundreds of known concepts.
  2. Score the relationship between an interaction and individual concepts/categories: This is mapping concepts to interactions. Tethr measures the degree to which each concept is detected as present across each individual interaction. Some concepts, like Customer Frustration, can be articulated in thousands of different ways. But new articulations do occur, some very specific to a company or industry, and leaders require accurate classification to make confident decisions. Tethr users can easily add and modify phrases and utterances (captured within calls/messages), at which point the machine learning automatically serves up new similar-but-not-exact matches to any new phrases. Again, on a numerical scale, one can view the similarity of meaning among a large number of phrases. This allows us to accurately detect entirely new utterances that represent a given category, without ever having to create a rule. As a result, our system can generalize from examples of a category, rather than simply identify what it has been trained to recognize.
  3. Score entire interactions to reflect the relationship to structured outcomes: Scoring the entire interaction allows Tethr to unearth context hidden in combinations of sequences occurring across the course of a conversation, and scores also reflect the relative intensity of given events. It’s common for an individual variable to have a positive effect when paired with certain other variables and a negative effect when paired with others. This type of nuanced understanding isn’t about counting; it’s about insight packed into combinations and the strength of different relationships. The Tethr Effort Index (TEI) and Agent Impact Score (AIS), applied to all interactions flowing through Tethr, are built on a scale from 0 to 10 and predict how customers would have responded to the Customer Effort Score (CES) survey, based on interaction events and, where applicable, several audio features. Tethr customers use TEI and AIS to easily find the best and worst interactions, using such filters to compare and contrast the category frequency associated with each end of the spectrum. With this type of wide-scale, predictive, and accurate understanding of behavior, automated actions can be set up to create important alerts, drive workflows, improve escalation processes, and enact targeted and tailored customer feedback loops.
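For readers less familiar with the two parameters named in the first level, the sketch below computes precision and recall using their standard textbook definitions. It is not a description of Tethr's internal scoring; the `precision_recall` helper and the utterance IDs are hypothetical.

```python
def precision_recall(flagged, relevant):
    """Standard precision and recall for one concept.

    flagged:  set of utterance ids the system tagged with the concept
    relevant: set of utterance ids that truly express the concept
    """
    true_pos = len(flagged & relevant)    # correctly tagged
    false_pos = len(flagged - relevant)   # tagged but wrong
    false_neg = len(relevant - flagged)   # missed
    precision = true_pos / (true_pos + false_pos) if flagged else 0.0
    recall = true_pos / (true_pos + false_neg) if relevant else 0.0
    return precision, recall

# Hypothetical utterance ids for a concept such as customer frustration.
flagged = {1, 2, 3, 4}
relevant = {2, 3, 4, 5, 6}

precision, recall = precision_recall(flagged, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Both values fall between 0 and 1. With these made-up IDs, one of the four flagged utterances is a false positive (precision 0.75) and two of the five relevant utterances are missed (recall 0.60).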

Want to learn more about how Tethr analyzes conversations? Schedule a demo.