Moving from Natural Language Understanding (NLU) to Customer-Facing AI Assistants

There can no longer be any doubt that large language models and generative AI more broadly are going to have a real impact on many industries. Though we’re still in the preliminary stages of working out the implications, the evidence so far suggests that this is already happening.

Language models in contact centers are helping more junior workers be more productive, and reducing employee turnover in the process. They’re also being used to automate huge swathes of content creation, assist in data augmentation tasks, and plenty else besides.

Part of the task we’ve set ourselves here at Quiq is explaining how these models are trained and how they’ll make their way into the workflows of the future. To that end, we’ve written extensively about how large language models are trained, how researchers are pushing them into uncharted territories, and which models are appropriate for any given task.

This post is another step in that endeavor. Specifically, we’re going to discuss natural language understanding, how it works, and how it’s distinct from related terms (like “natural language generation”). With that done, we’ll talk about how natural language understanding is a foundational first step and takes us closer to creating robust customer-facing AI assistants.

What is Natural Language Understanding?

Language is a tool of remarkable power and flexibility – so much so that it wouldn’t be much of an exaggeration to say that it’s at the root of everything else the human race has accomplished. From towering works of philosophy to engineering specs to instructions for setting up a remote, language is a force multiplier that makes each of us vastly more effective than we otherwise would be.

Evidence of this claim comes from the fact that, even when we’re alone, many of us think in words or even talk to ourselves as we work through something difficult. Certain kinds of thoughts are all but impossible to have without the scaffolding provided by language.

For all these reasons, creating machines able to parse natural language has long been a goal of AI researchers and computer scientists. The field devoted to this task is known as natural language understanding.

There’s a rather deep philosophical question here where the word “understanding” is concerned. As the famous story of the Tower of Babel demonstrates, it isn’t enough for the members of a group to be making sounds to accomplish great things; it’s also necessary for the people involved to understand what everyone is saying. This means that when you say a word like “chicken” there’s a response in my nervous system such that the “chicken” concept is activated, along with other contextually relevant knowledge, such as the location of the chicken feed. If you said “курица” (to someone who doesn’t know Russian) or “鸡” (to someone who doesn’t know Mandarin), the same process wouldn’t have occurred, no understanding would’ve happened, and language wouldn’t have helped at all.

Whether and how a machine can understand language in a fully human way is too big a topic to address here, but we can make some broad comments. As is often the case, researchers in the field of natural language understanding have opted to break the problem down into much more tractable units. Two of the biggest such units of natural language understanding are intent recognition (what a sentence is intended to accomplish) and entity recognition (who or what the sentence is referring to).

This should make a certain intuitive sense. Though you may not be consciously going through a mental checklist when someone says something to you, on some level, you’re trying to figure out what their goal is and who or what they’re talking about. The intent behind the sentence “John has an apple”, for example, is to inform you of a fact about the world, and the main entities are “John” and “apple”. If you know John, a little image of him holding an apple would probably pop into your head.

This has many obvious applications to the work done in contact centers. If you’re building an automated ticket classification system, for instance, it would help to be able to tell whether the intent behind the ticket is to file a complaint, reach a representative, or perform a task like resetting a password. It would also help to be able to categorize the entities, like one of a dozen products your center supports, that are being referred to.
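To make the ticket classification idea concrete, here is a minimal sketch of intent and entity recognition based on keyword matching. The intent names, keyword lists, and product catalogue are all hypothetical, and real systems typically use trained models rather than hand-written rules.

```python
import re

# Hypothetical intents and trigger keywords; a production system would learn these from data.
INTENT_KEYWORDS = {
    "file_complaint": ["complaint", "unacceptable", "disappointed"],
    "reach_representative": ["representative", "agent", "speak to", "talk to"],
    "reset_password": ["password", "locked out", "reset"],
}

# Hypothetical product catalogue used as the entity vocabulary.
PRODUCTS = ["router", "modem", "tv box"]


def recognize_intent(ticket: str) -> str:
    """Return the intent whose keywords appear most often, or 'unknown'."""
    text = ticket.lower()
    scores = {
        intent: sum(text.count(keyword) for keyword in keywords)
        for intent, keywords in INTENT_KEYWORDS.items()
    }
    best_intent = max(scores, key=scores.get)
    return best_intent if scores[best_intent] > 0 else "unknown"


def recognize_entities(ticket: str) -> list:
    """Return the known products mentioned in the ticket, in catalogue order."""
    text = ticket.lower()
    return [p for p in PRODUCTS if re.search(rf"\b{re.escape(p)}\b", text)]


ticket = "I'm locked out of my router admin page and need to reset my password."
print(recognize_intent(ticket), recognize_entities(ticket))
```

Keyword rules like these are brittle (they miss paraphrases and misspellings), which is one reason modern systems lean on language models for the same two tasks.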

Natural Language Understanding vs. Natural Language Processing

Natural language understanding is its own field, and it’s easy to confuse it with other, related fields, like natural language processing.

Most of the sources we consulted consider natural language understanding to be a subdomain of natural language processing (NLP). Whereas the former is concerned with parsing natural language into a format that machines can work with, the latter subsumes this task, along with others like machine translation and natural language generation.

Natural Language Understanding vs. Natural Language Generation

Speaking of natural language generation, many people also confuse natural language understanding and natural language generation. Natural language generation is more or less what it sounds like: using computers to generate human-sounding text or speech.

Natural language understanding can be an important part of getting natural language generation right, but they’re not the same thing.

Customer-Facing AI Assistants

Now that we’ve discussed natural language understanding, let’s talk about how it can be utilized in the attempt to create high-quality customer-facing AI assistants.

How Can Natural Language Understanding Be Used to Make Customer-Facing Assistants?

Natural language understanding refers to a constellation of different approaches to decomposing language into pieces that a machine can work with. This allows an algorithm to discover the intent in a message, tag parts of speech (nouns, verbs, etc.), or pull out the entities referenced.

All of this is an important part of building effective customer-facing AI assistants. At Quiq, we’ve built LLM-powered knowledge assistants able to answer common questions across your reference documentation, data assistants that can use CRM and order management systems to provide actionable insights, and other kinds of conversational AI systems. Though we draw on many technologies and research areas, none of this would be possible without natural language understanding.

What are the Benefits of Customer-Facing AI Assistants?

The reason people have been working so long to create powerful customer-facing AI assistants is that there are so many benefits involved.

At a contact center, agents spend most of their day answering questions, resolving issues, and otherwise making sure a customer base can use a set of product offerings as intended.

As with any job, some of these tasks are higher-value than others. All of the work is important, but there will always be subtle and thorny issues that only a skilled human can work through, while others are quotidian and can be farmed out to a machine.

This is a long way of saying that one of the major benefits of customer-facing AI assistants is that they free up your agents to specialize in handling the most pressing requests, with password resets or something similar handled by a capable product like the Quiq platform.

A related benefit is improved customer experience. When agents can focus their efforts, they can spend more time with customers who need it. And, when you have properly fine-tuned language models interacting with customers, you can expect them to stay polite and helpful because they’ll never become annoyed after a long shift the way a human being might.

Robust Customer-Facing AI Assistants with Quiq

Just as understanding has been such a crucial part of the success of our species, it’ll be an equally crucial part of the success of advanced AI tooling.

One way you can make use of bleeding-edge natural language understanding techniques is by building your own language models. This would require you to hire teams of extremely smart engineers, and that would be expensive; besides their hefty salaries, you’d also have to budget to keep the fridge stocked with the sugar-free Red Bulls such engineers require to function.

Or, you could utilize the division of labor. Just as contact center agents can outsource certain tasks to machines, so too can you outsource the task of building an AI-based CX platform to Quiq. Set up a demo today to see what our advanced AI technology and team can do for your contact center!

Reinforcement Learning from Human Feedback

ChatGPT – and other large language models like it – are already transforming education, healthcare, software engineering, and the work being done in contact centers.

We’ve written extensively about how self-supervised learning is used to train these models, but one thing we haven’t spent much time on is reinforcement learning from human feedback (RLHF).

Today, we’re rectifying that. We’re going to dive into what reinforcement learning from human feedback is, why it’s important, and how it works.

With that done, you’ll have received a thorough education in this world-changing technology.

What is Reinforcement Learning from Human Feedback?

As you no doubt surmised from its name, reinforcement learning from human feedback involves two components: reinforcement learning and human feedback. Though the technical specifics are (as usual) very involved, the basic idea is simple: you have models produce output, humans rate the output that they prefer (based on its friendliness, completeness, accuracy, etc.), and then the model is updated accordingly.

It’ll help if we begin by talking about what reinforcement learning is. This background will prove useful in understanding the unfolding of the broader process.

What is Reinforcement Learning?

There are four widespread approaches to getting intelligent behavior from machines: supervised learning, unsupervised learning, semi-supervised learning, and reinforcement learning.

With supervised learning, you feed a statistical algorithm a bunch of examples of correctly-labeled data in the hope that it will generalize to further examples it hasn’t seen before. Regression and supervised classification models are standard applications of supervised learning.

Unsupervised learning is a similar idea, but you forgo the labels. It’s used for certain kinds of clustering tasks, and for applications like dimensionality reduction.

Semi-supervised learning is a combination of these two approaches. Suppose you have a gigantic body of photographs, and you want to develop an automated system to tag them. If some of them are tagged then your system can use those tags to learn a pattern, which can then be applied to the rest of the untagged images.

Finally, there’s reinforcement learning (RL). Reinforcement learning is entirely different. With reinforcement learning, you’re usually setting up an environment (like a video game), and putting an agent in the environment with a reward structure that tells it which actions are good and which are bad. If the agent successfully flies a spaceship through a series of rings, for example, that might be worth +10 points each, completing an entire level might be worth +100, crashing might be worth -1,000, and so on.

The idea is that, over time, the reinforcement learning agent will learn to execute a strategy that maximizes its long-term reward. It’ll realize that rings are worth a few points and so it should fly through them, it’ll learn that it should try to complete a level because that’s a huge reward bonus, it’ll learn that crashing is bad, etc.
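As a rough illustration of this learning process, the sketch below trains a tabular Q-learning agent on a toy version of the spaceship game. The environment, the number of rings, the discount factor, and the "ring"/"skip"/"crash" action set are all assumptions made for the example; only the reward values come from the description above.

```python
import random

# Rewards loosely follow the text: +10 per ring, +100 for finishing, -1000 for crashing.
ACTIONS = ["ring", "skip", "crash"]
REWARD = {"ring": 10, "skip": 0, "crash": -1000}
NUM_RINGS = 3        # hypothetical level length
LEVEL_BONUS = 100
GAMMA = 0.9          # discount factor (assumed)


def step(state, action):
    """Deterministic toy environment: returns (next_state, reward, done)."""
    if action == "crash":
        return state, REWARD[action], True
    next_state = state + 1
    reward = REWARD[action]
    done = next_state == NUM_RINGS
    if done:
        reward += LEVEL_BONUS
    return next_state, reward, done


def train(episodes=2000, seed=0):
    """Learn action values by playing many episodes with random exploration."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(NUM_RINGS) for a in ACTIONS}
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = rng.choice(ACTIONS)  # explore uniformly at random (off-policy)
            next_state, reward, done = step(state, action)
            future = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            # A learning rate of 1 is acceptable here because the environment is deterministic.
            q[(state, action)] = reward + GAMMA * future
            state = next_state
    return q


q_table = train()
# The greedy policy picks the highest-valued action at each ring.
policy = [max(ACTIONS, key=lambda a: q_table[(s, a)]) for s in range(NUM_RINGS)]
print(policy)
```

After training, the greedy policy should prefer flying through every ring and never crashing, which mirrors the behavior described above.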

Reinforcement learning can be remarkably powerful; when done correctly, it can lead to agents able to play the stock market, run procedures in a factory, and do a staggering variety of other tasks.

What are the Steps of Reinforcement Learning from Human Feedback?

Now that we know a little bit about reinforcement learning, let’s turn to a discussion of reinforcement learning from human feedback.

As we just described, reinforcement learning agents have to be trained like any other machine learning system. Under normal circumstances, this doesn’t involve any human feedback. A programmer will update the code, environment, or reward structure between training runs, but they don’t usually provide feedback directly to the agent.

Except, that is, in the case of reinforcement learning from human feedback, in which case that’s exactly what happens. A model will produce a set of outputs, and humans will rank them. Over time the model will adjust to making more and more appropriate responses, as judged by the human raters providing it with feedback.

Sometimes, this feedback can be for something relatively prosaic. It’s been used, for example, to get RL agents to execute backflips in simulated environments. The raters will look at short videos of two movements and select the one that looks like it’s getting closer to a backflip; with enough time, this gets the agent to actually do one.

Or, it can be used for something more nuanced, such as getting a large language model to produce more conversational dialogue. This is part of how ChatGPT was trained.
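One common way to turn pairwise human judgments into a training signal is to fit a reward score to each output so that preferred outputs score higher. The sketch below does this with a Bradley-Terry style model, where the probability that a rater prefers one output over another is the logistic function of the difference in their scores. The choice of this model, the candidate replies, and the preference data are assumptions for illustration, not a description of how any particular system was trained.

```python
import math

# Hypothetical candidate replies and human judgements: (preferred, rejected).
replies = ["curt", "neutral", "friendly"]
preferences = [
    ("friendly", "curt"),
    ("friendly", "neutral"),
    ("neutral", "curt"),
    ("friendly", "curt"),
]


def fit_rewards(replies, preferences, learning_rate=0.1, steps=500):
    """Fit one scalar reward per reply under a Bradley-Terry preference model."""
    reward = {r: 0.0 for r in replies}
    for _ in range(steps):
        grad = {r: 0.0 for r in replies}
        for winner, loser in preferences:
            # Modelled probability that the rater prefers `winner` over `loser`.
            p_win = 1.0 / (1.0 + math.exp(-(reward[winner] - reward[loser])))
            # Gradient of the log-likelihood with respect to each reward.
            grad[winner] += 1.0 - p_win
            grad[loser] -= 1.0 - p_win
        # Gradient ascent step on the log-likelihood.
        for r in replies:
            reward[r] += learning_rate * grad[r]
    return reward


rewards = fit_rewards(replies, preferences)
ranking = sorted(replies, key=rewards.get, reverse=True)
print(ranking)
```

In a full pipeline the per-reply table would be replaced by a learned reward model that can score arbitrary text, and the language model would then be updated with reinforcement learning to produce outputs that score well.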

Why is Reinforcement Learning from Human Feedback Necessary?

ChatGPT is already being used to great effect in contact centers and the customer service arena more broadly. Here are some example applications:

  • Question answering: ChatGPT is exceptionally good at answering questions. What’s more, some companies have begun fine-tuning it on their own internal and external documentation, so that people can directly ask it questions about how a product works or how to solve an issue. This obviates the need to go hunting around inside the docs.
  • Summarization: Similarly, ChatGPT can be used to summarize video transcripts, email threads, and lengthy articles so that agents (or customers) can get through the material at a much greater clip. This can, for example, help agents stay abreast of what’s going on in other parts of the company without burdening them with the need to read constantly. Quiq has custom-built tools for performing exactly this function.
  • Onboarding new hires: Together, question-answering and summarization are helping new contact center agents get up to speed much more quickly when they start their jobs.
  • Sentiment analysis: Sentiment analysis refers to classifying a text according to its sentiment, i.e. whether it’s “positive”, “negative”, or “neutral”. Sentiment analysis comes in several different flavors, including granular and aspect-based, and ChatGPT can help with all of them. Being able to automatically tag a customer issue comes in handy when you’re trying to sort and prioritize them.
  • Real-time language translation: If your product or service has an international audience, then you might need to avail yourself of translation services so that agents and customers are speaking the same language. There are many such services available, but ChatGPT has proven to be at least as good as almost all of them.

In aggregate, these and other use cases of large language models are making contact center agents much more productive. But contact center agents have to interact with customers in a certain way – they have to be polite, helpful, etc.

And out of the box, most large language models do not behave that way. We’ve already had several high-profile incidents in which a language model, for example, asked a reporter to end his marriage or falsely accused a law school professor of sexual harassment.

Reinforcement learning from human feedback is currently the most promising approach for tuning this toxic and harmful behavior out of large language models. The only reason they’re able to help contact center agents so much is that they’ve been fine-tuned with such an approach; otherwise, agents would be spending an inordinate amount of time rephrasing and tinkering with a model’s output to get it to be appropriately friendly.

This is why reinforcement learning from human feedback is important for the managers of contact centers to understand – it’s a major part of why large language models are so useful in the first place.

Applications of Reinforcement Learning from Human Feedback

To round out our picture, we’re going to discuss a few ways in which reinforcement learning from human feedback is actually used in the wild. We’ve already discussed how it is fine-tuning models to be more helpful in the context of a contact center, and we’ll now talk a bit about how it’s used in gaming and robotics.

Using Reinforcement Learning from Human Feedback in Games

Gaming has long been one of the ideal testing grounds for new approaches to artificial intelligence. As you might expect, it’s also a place where reinforcement learning from human feedback has been successfully applied.

OpenAI used it to achieve superhuman performance on a classic Atari game, Enduro. Enduro is an old-school racing game, and like all racing games, the point is to gradually pass the other cars without hitting them or going out of bounds in the game.

It’s exceptionally difficult for an agent to learn to play Enduro well using only standard reinforcement learning approaches. But when human feedback is added, the results shift dramatically.

Using Reinforcement Learning from Human Feedback in Robotics

Because robotics almost always involves an agent interacting with the physical world, it’s especially well-suited to reinforcement learning from human feedback.

Often, it can be difficult to get a robot to execute a long series of steps that achieves a valuable reward, especially when the intermediate steps aren’t themselves very valuable. What’s more, it can be especially difficult to build a reward structure that correctly incentivizes the agent to execute the intermediate steps in the right order.

It’s much simpler instead to have humans look at sequences of actions and judge for themselves which will get the agent closer to its ultimate goal.

RLHF For The Contact Center Manager

Having made it this far, you should be in a much better position to understand how reinforcement learning from human feedback works, and how it contributes to the functioning of your contact centers.

If you’ve been thinking about leveraging AI to make yourself or your agents more effective, set up a demo with the Quiq team to see how we can put our cutting-edge models to work for you. We offer both customer-facing and agent-facing tools, all of them designed to help you make customers happier while reducing agent burnout and turnover.

What are the Biggest Questions About AI?

The term “artificial intelligence” was coined at the famous Dartmouth Conference in 1956, put on by luminaries like John McCarthy, Marvin Minsky, and Claude Shannon, among others.

These organizers wanted to create machines that “use language, form abstractions and concepts, solve kinds of problems now reserved for humans, and improve themselves.” They went on to claim that “…a significant advance can be made in one or more of these problems if a carefully selected group of scientists work on it together for a summer.”

More than half a century later, it’s fair to say that this has not come to pass; brilliant as they were, it would seem as though McCarthy et al. underestimated how difficult it would be to scale the heights of the human intellect.

Nevertheless, remarkable advances have been made over the past decade, so much so that they’ve ignited a firestorm of controversy around this technology. People are questioning the ways in which it can be used negatively, and whether it might ultimately pose an extinction risk to humanity; they’re probing fundamental issues around whether machines can be conscious, exercise free will, and think in the way a living organism does; they’re rethinking the basis of intelligence, concept formation, and what it means to be human.

These are deep waters to be sure, and we’re not going to swim them all today. But as contact center managers and others begin the process of thinking about using AI, it’s worth being at least aware of what this broader conversation is about. It will likely come up in meetings, in the press, or in Slack channels in exchanges between employees.

And that’s the subject of our piece today. We’re going to start by asking what artificial intelligence is and how it’s being used, before turning to address some of the concerns about its long-term potential. Our goal is not to answer all these concerns, but to make you aware of what people are thinking and saying.

What is Artificial Intelligence?

Artificial intelligence is famous for having had many, many definitions. There are those, for example, who believe that in order to be intelligent computers must think like humans, and those who reply that we didn’t make airplanes by designing them to fly like birds.

For our part, we prefer to sidestep the question somewhat by utilizing the approach taken in one of the leading textbooks in the field, Stuart Russell and Peter Norvig’s “Artificial Intelligence: A Modern Approach”.

They propose a multi-part system for thinking about different approaches to AI. One set of approaches is human-centric and focuses on designing machines that either think like humans – i.e., engage in analogous cognitive and perceptual processes – or act like humans – i.e. by behaving in a way that’s indistinguishable from a human, regardless of what’s happening under the hood (think: the Turing Test).

The other set of approaches is ideal-centric and focuses on designing machines that either think in a totally rational way – conformant with the rules of Bayesian epistemology, for example – or behave in a totally rational way – utilizing logic and probability, but also acting instinctively to remove themselves from danger, without going through any lengthy calculations.

What we have here, in other words, is a framework. Using the framework not only gives us a way to think about almost every AI project in existence, it also saves us from needing to spend all weekend coming up with a clever new definition of AI.

Joking aside, we think this is a productive lens through which to view the whole debate, and we offer it here for your information.

What is Artificial Intelligence Good For?

Given all the hype around ChatGPT, this might seem like a quaint question. But not that long ago, many people were asking it in earnest. The basic insights upon which large language models like ChatGPT are built go back to the 1960s, but it wasn’t until 1) vast quantities of data became available, and 2) compute cycles became extremely cheap that much of their potential was realized.

Today, large language models are changing (or poised to change) many different fields. Our audience is focused on contact centers, so that’s what we’ll focus on as well.

There are a number of ways that generative AI is changing contact centers. Because of its remarkable abilities with natural language, it’s able to dramatically speed up agents in their work by answering questions and formatting replies. These same abilities allow it to handle other important tasks, like summarizing articles and documentation and parsing the sentiment in customer messages to enable semi-automated prioritization of their requests.

Though we’re still in the early days, the evidence so far suggests that large language models, and tools built on them like Quiq’s conversational CX platform, will do a lot to increase the efficiency of contact center agents.

Will AI be Dangerous?

One thing that’s burst into public imagination recently has been the debate around the risks of artificial intelligence, which fall into two broad categories.

The first category is what we’ll call “social and political risks”. These are the risks that large language models will make it dramatically easier to manufacture propaganda at scale, and perhaps tailor it to specific audiences or even individuals. When combined with the astonishing progress in deepfakes, it’s not hard to see how there could be real issues in the future. Most people (including us) are poorly equipped to figure out when a video is fake, and if the underlying technology gets much better, there may come a day when it’s simply not possible to tell.

Political operatives are already quite skilled at cherry-picking quotes and stitching together soundbites into a damning portrait of a candidate – imagine what’ll be possible when they don’t even need to bother.

But the bigger (and more speculative) danger is around really advanced artificial intelligence. Because this case is harder to understand, it’s what we’ll spend the rest of this section on.

Artificial Superintelligence and Existential Risk

As we understand it, the basic case for existential risk from artificial intelligence goes something like this:

“Someday soon, humanity will build or grow an artificial general intelligence (AGI). It’s going to want things, which means that it’ll be steering the world in the direction of achieving its ambitions. Because it’s smart, it’ll do this quite well, and because it’s a very alien sort of mind, it’ll be making moves that are hard for us to predict or understand. Unless we solve some major technological problems around how to design reward structures and goal architectures in advanced agentive systems, what it wants will almost certainly conflict in subtle ways with what we want. If all this happens, we’ll find ourselves in conflict with an opponent unlike any we’ve faced in the history of our species, and it’s not at all clear we’ll prevail.”

This is heady stuff, so let’s unpack it bit by bit. The opening sentence, “…humanity will build or grow an artificial general intelligence”, was chosen carefully. If you understand how LLMs and deep learning systems are trained, the process is more akin to growing an enormous structure than it is to building one.

This has a few implications. First, their internal workings remain almost completely inscrutable. Though researchers in fields like mechanistic interpretability are going a long way toward unpacking how neural networks function, the truth is, we’ve still got a long way to go.

What this means is that we’ve built one of the most powerful artifacts in the history of Earth, and no one is really sure how it works.

Another implication is that no one has any good theoretical or empirical reason to bound the capabilities and behavior of future systems. The leap from GPT-2 to GPT-3.5 was astonishing, as was the leap from GPT-3.5 to GPT-4. The basic approach so far has been to throw more data and more compute at the training algorithms; it’s possible that this paradigm will begin to level off soon, but it’s also possible that it won’t. If the gap between GPT-4 and GPT-5 is as big as the gap between GPT-3 and GPT-4, and if the gap between GPT-5 and GPT-6 is just as big, it’s not hard to see that the consequences could be staggering.

As things stand, it’s anyone’s guess how this will play out. But that’s not necessarily a comforting thought.

Next, let’s talk about pointing a system at a task. Does ChatGPT want anything? The short answer is: as far as we can tell, it doesn’t. ChatGPT isn’t an agent, in the sense of trying to achieve something in the world, but work on agentive systems is ongoing. Remember that 10 years ago most neural networks were basically toys, and today we have ChatGPT. If breakthroughs in agency follow a similar pace (and they very well may not), then we could have systems able to pursue open-ended courses of action in the real world in relatively short order.

Another sobering possibility is that this capacity will simply emerge from the training of huge deep learning systems. This is, after all, the way human agency emerged in the first place. Through the relentless grind of natural selection, our ancestors went from chipping flint arrowheads to industrialization, quantum computing, and synthetic biology.

To be clear, this is far from a foregone conclusion, as the algorithms used to train large language models are quite different from natural selection. Still, we want to relay this line of argumentation, because it comes up a lot in these discussions.

Finally, we’ll address one more important claim, “…what it wants will almost certainly conflict in subtle ways with what we want.” Why think this is true? Aren’t these systems that we design and, if so, can’t we just tell them what we want them to go after?

Unfortunately, it’s not so simple. Whether you’re talking about reinforcement learning or something more exotic like evolutionary programming, the simple fact is that our algorithms often find remarkable mechanisms by which to maximize their reward in ways we didn’t intend.

There are many examples of this (ask any reinforcement-learning engineer you know), but a famous one comes from the boat-racing video game CoastRunners. The engineers who built the system tried to set up the algorithm’s rewards so that it would try to race a boat as well as it could. What it actually did, however, was maximize its reward by spinning in a circle to hit a set of green blocks over and over again.
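The gap between the reward that was coded and the behavior that was wanted can be sketched in a few lines. The point values, episode length, and event names below are hypothetical and chosen only to show how a proxy reward can rank the "wrong" strategy higher.

```python
# Hypothetical point values for the proxy reward the designers coded.
POINTS = {"target": 5, "finish": 50}


def proxy_reward(events):
    """Score an episode the way the (mis-specified) reward function sees it."""
    return sum(POINTS.get(event, 0) for event in events)


def finished_race(events):
    """What the designers actually cared about."""
    return "finish" in events


# Two hypothetical 20-step episodes.
race_properly = ["move"] * 15 + ["target"] * 4 + ["finish"]
spin_in_circle = ["target"] * 20  # loops over respawning targets, never finishes

for name, events in [("race", race_properly), ("spin", spin_in_circle)]:
    print(name, proxy_reward(events), finished_race(events))
```

An agent that only sees `proxy_reward` has no reason to prefer the episode that finishes the race, since the spinning episode scores higher.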

Now, this may seem almost silly – do we really have anything to fear from an algorithm too stupid to understand the concept of a “race”?

But this would be missing the thrust of the argument. If you had access to a superintelligent AI and asked it to maximize human happiness, what happened next would depend almost entirely on what it understood “happiness” to mean.

If it were properly designed, it would work in tandem with us to usher in a utopia. But if it understood it to mean “maximize the number of smiles”, it would be incentivized to start paying people to get plastic surgery to fix their faces into permanent smiles (or something similarly unintuitive).

Does AI Pose an Existential Risk?

Above, we’ve briefly outlined the case that sufficiently advanced AI could pose a serious risk to humanity by being powerful, unpredictable, and prone to pursuing goals that weren’t quite what we meant.

So, does this hold water? Honestly, it’s too early to tell. The argument has hundreds of moving parts, some well-established and others much more speculative. Our purpose here isn’t to come down on one side of this debate or the other, but to let you know (in broad strokes) what people are saying.

At any rate, we are confident that the current version of ChatGPT doesn’t pose any existential risks. On the contrary, it could end up being one of the greatest advancements in productivity ever seen in contact centers. And that’s what we’d like to discuss in the next section.

Will AI Take All the Jobs?

The concern that someday a new technology will render human labor obsolete is hardly new. It was heard when mechanized weaving machines were created, when computers emerged, when the internet emerged, and when ChatGPT came onto the scene.

We’re not economists and we’re not qualified to take a definitive stand, but early evidence suggests that large language models are not only not resulting in layoffs, they’re making agents much more productive.

Erik Brynjolfsson, Danielle Li, and Lindsey R. Raymond, three economists at Stanford and MIT, looked at the ways in which generative AI was being used in a large contact center. They found that it was actually doing a good job of internalizing the ways in which senior agents were doing their jobs, which allowed more junior agents to climb the learning curve more quickly and perform at a much higher level. This had the knock-on effect of making them feel less stressed about their work, thus reducing turnover.

Now, this doesn’t rule out the possibility that GPT-10 will be the big job killer. But so far, large language models are shaping up to be like every prior technological advance, i.e., increasing employment rather than reducing it.

What is the Future of AI?

The rise of AI is raising stock valuations, raising deep philosophical questions, and raising expectations and fears about the future. We don’t know for sure how all this will play out, but we do know contact centers, and we know that they stand to benefit greatly from the current iteration of large language models.

These tools are helping agents answer more queries per hour, do so more thoroughly, and make for a better customer experience in the process.

If you want to get in on the action, set up a demo of our technology today.

What is Sentiment Analysis? – Ultimate Guide

A person only reaches out to a contact center when they’re having an issue. They can’t get a product to work the way they need it to, for example, or they’ve been locked out of their account.

The chances are high that they’re frustrated, angry, or otherwise in an emotionally-fraught state, and this is something contact center agents must understand and contend with.

The term “sentiment analysis” refers to the field of machine learning which focuses on developing algorithmic ways of detecting emotions in natural-language text, such as the messages exchanged between a customer and a contact center agent.

Making it easier to detect, classify, and prioritize messages on the basis of their sentiment is just one of many ways that technology is revolutionizing contact centers, and it’s the subject we’ll be addressing today.

Let’s get started!

What is Sentiment Analysis?

Sentiment analysis involves using various approaches to natural language processing to identify the overall “sentiment” of a piece of text.

Take these three examples:

  1. “This restaurant is amazing. The wait staff were friendly, the food was top-notch, and we had a magnificent view of the famous New York skyline. Highly recommended.”
  2. “Root canals are never fun, but it certainly doesn’t help when you have to deal with a dentist as unprofessional and rude as Dr. Thomas.”
  3. “Toronto’s forecast for today is a high of 75 and a low of 61 degrees.”

Humans excel at detecting emotions, and it’s probably not hard for you to see that the first example is positive, the second is negative, and the third is neutral (depending on how you like your weather).

There’s a greater challenge, however, in getting machines to make accurate classifications of this kind of data. How exactly that’s accomplished is the subject of the next section, but before we get to that, let’s talk about a few flavors of sentiment analysis.

What Types of Sentiment Analysis Are There?

It’s worth understanding the different approaches to sentiment analysis if you’re considering using it in your contact center.

Above, we provided an example of positive, negative, and neutral text. What we’re doing there is detecting the polarity of the text, and as you may have guessed, it’s possible to make much more fine-grained delineations of textual data.

Rather than simply detecting whether text is positive or negative, for example, we might instead use these categories: very positive, positive, neutral, negative, and very negative.

This would give us a better understanding of the message we’re looking at, and how it should be handled.
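
As a minimal sketch of that five-way scheme, suppose some upstream model hands us a sentiment score between -1.0 and 1.0. The function name `polarity_label` and the cut-off values are our own assumptions for illustration; a real system would tune its thresholds on labeled data.

```python
def polarity_label(score: float) -> str:
    """Map a sentiment score in [-1.0, 1.0] to one of five polarity labels."""
    if not -1.0 <= score <= 1.0:
        raise ValueError("score must be between -1.0 and 1.0")
    # The cut-offs below are arbitrary choices for illustration only.
    if score <= -0.6:
        return "very negative"
    if score <= -0.2:
        return "negative"
    if score < 0.2:
        return "neutral"
    if score < 0.6:
        return "positive"
    return "very positive"


if __name__ == "__main__":
    for example_score in (-0.8, -0.3, 0.0, 0.4, 0.9):
        print(example_score, polarity_label(example_score))
```

A contact center could then route “very negative” messages to a senior agent first, for instance.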

Instead of classifying text by its polarity, we might also use sentiment analysis to detect the emotions being communicated – rather than classifying a sentence as being “positive” or “negative”, in other words, we’d identify emotions like “anger” or “joy” contained in our textual data.

This is called “emotion detection” (appropriately enough), and it can be handled with long short-term memory (LSTM) or convolutional neural network (CNN) models.

Another, more granular approach to sentiment analysis is known as aspect-based sentiment analysis. It involves two basic steps: identifying “aspects” of a piece of text, then identifying the sentiment attached to each aspect.

Take the sentence “I love the zoo, but I hate the lines and the monkeys make fun of me.” It’s hard to assign an overall sentiment to the sentence – it’s generally positive, but there’s kind of a lot going on.

If we break out the “zoo”, “lines”, and “monkeys” aspects, however, we can see that there’s positive sentiment attached to the zoo, and negative sentiment attached to the lines and the abusive monkeys.
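
To make the two steps concrete, here is a toy sketch that splits the zoo sentence into rough clauses, looks for known aspects in each clause, and attaches that clause’s sentiment to them. The word lists and the function name `aspect_sentiments` are hypothetical and hand-picked for this one sentence; production systems typically learn both aspects and sentiment from data.

```python
import re

# Hypothetical, hand-picked word lists; a real system would learn these.
ASPECTS = {"zoo", "lines", "monkeys"}
POSITIVE = {"love", "enjoy"}
NEGATIVE = {"hate", "make fun"}


def aspect_sentiments(sentence: str) -> dict[str, str]:
    """Assign a sentiment to each known aspect, one clause at a time."""
    results = {}
    # Split into rough clauses on commas and the words "but" / "and".
    clauses = re.split(r",|\bbut\b|\band\b", sentence.lower())
    for clause in clauses:
        words = set(re.findall(r"[a-z']+", clause))
        # Step 1: identify which aspects this clause mentions.
        found = ASPECTS & words
        if not found:
            continue
        # Step 2: identify the clause's sentiment (simple substring matching).
        if any(term in clause for term in NEGATIVE):
            label = "negative"
        elif any(term in clause for term in POSITIVE):
            label = "positive"
        else:
            label = "neutral"
        for aspect in found:
            results[aspect] = label
    return results


if __name__ == "__main__":
    text = "I love the zoo, but I hate the lines and the monkeys make fun of me."
    print(aspect_sentiments(text))
```

Clause splitting this crude will break on many real sentences, which is part of why learned models are usually preferred.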

Why is Sentiment Analysis Important?

It’s easy to see how aspect-based sentiment analysis would inform marketing efforts. With a good enough model, you’d be able to see precisely which parts of your offering your clients appreciate, and which parts they don’t. This would give you valuable information in crafting a strategy going forward.

This is true of sentiment analysis more broadly, and of emotion detection too. You need to know what people are thinking, saying, and feeling about you and your company if you’re going to meet their needs well enough to make a profit.

Once upon a time, the only way to get these data was with focus groups and surveys. Those are still utilized, of course. But in the social media era, people are also not shy about sharing their opinions online, in forums and similar outlets.

These oceans of words form an invaluable resource if you know how to mine them. When done correctly, sentiment analysis offers just the right set of tools for doing this at scale.

Challenges with Sentiment Analysis

Sentiment analysis confers many advantages, but it is not without its challenges. Most of these issues boil down to handling subtleties or ambiguities in language.

Consider a sentence like “This is a remarkable product, but still not worth it at that price.” Calling a product “remarkable” is a glowing endorsement, tempered somewhat by the claim that its price is set too high. Most basic sentiment classifiers would probably call this “positive”, but as you can see, there are important nuances.

Another issue is sarcasm.

Suppose we showed you a sentence like “This movie was just great, I loved spending three hours of my Sunday afternoon following a story that could’ve been told in twenty minutes.”

A sentiment analysis algorithm is likely going to pick up on “great” and “loved” when calling this sentence positive.

But, as humans, we know that these are backhanded compliments meant to communicate precisely the opposite message.

Machine-learning systems will also tend to struggle with idioms that we all find easy to parse, such as “Setting up my home security system was a piece of cake.” This is positive because “piece of cake” means something like “couldn’t have been easier”, but an algorithm may or may not pick up on that.

Finally, we’ll mention the fact that much of the text in product reviews will contain useful information that doesn’t fit easily into a “sentiment” bucket. Take a sentence like “The new iPhone is smaller than the new Android.” This is just a bare statement of physical facts, and whether it counts as positive or negative depends a lot on what a given customer is looking for.

There are various ways of trying to ameliorate these issues, most of which are outside the scope of this article. For now, we’ll just note that sentiment analysis needs to be approached carefully if you want to glean an accurate picture of how people feel about your offering from their textual reviews. So long as you’re diligent about inspecting the data you show the system and are cautious in how you interpret the results, you’ll probably be fine.

How Does Sentiment Analysis Work?

Now that we’ve laid out a definition of sentiment analysis, talked through a few examples, and made it clear why it’s so important, let’s discuss the nuts and bolts of how it works.

Sentiment analysis begins where all data science and machine learning projects begin: with data. Because sentiment analysis is based on textual data, you’ll need to utilize various techniques for preprocessing NLP data. Specifically, you’ll need to:

  • Tokenize the data by breaking sentences up into individual units an algorithm can process;
  • Use either stemming or lemmatization to reduce words to a root form, e.g., lemmatization turns “ran” into “run”;
  • Filter out stop words like “the” or “as”, because they don’t add much to the text data.
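
The three steps above can be sketched in a few lines of plain Python. This is only an illustration: the stop-word set and the `LEMMAS` lookup table are tiny hand-written stand-ins (our assumption) for the much larger resources that NLP libraries provide, and `preprocess` is a name we made up.

```python
import re

# Tiny hand-written resources for illustration only.
STOP_WORDS = {"the", "as", "a", "an", "and", "to", "of", "was", "is"}
LEMMAS = {"ran": "run", "running": "run", "ringing": "ring"}


def preprocess(text: str) -> list[str]:
    """Tokenize, lemmatize, and remove stop words from a piece of text."""
    # 1. Tokenize: lowercase the text and pull out runs of letters.
    tokens = re.findall(r"[a-z']+", text.lower())
    # 2. Lemmatize: swap each token for its root form when we know one.
    lemmas = [LEMMAS.get(token, token) for token in tokens]
    # 3. Filter out stop words, which carry little sentiment information.
    return [token for token in lemmas if token not in STOP_WORDS]


if __name__ == "__main__":
    print(preprocess("The agent ran to the phone as the line was ringing."))
```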

Once that’s done, there are two basic approaches to sentiment analysis. The first is known as “rule-based” analysis. It involves taking your preprocessed textual data and comparing it against a pre-defined lexicon of words that have been tagged for sentiment.

If the word “happy” appears in your text it’ll be labeled “positive”, for example, and if the word “difficult” appears in your text it’ll be labeled “negative.”

(Rule-based sentiment analysis is more nuanced than what we’ve indicated here, but this is the basic idea.)
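
Here is a minimal sketch of the lexicon idea, assuming a tiny made-up lexicon in which each word carries a score of +1 or -1. The names `LEXICON` and `rule_based_sentiment` are ours, and real lexicons contain thousands of entries along with rules for negation and intensity.

```python
import re

# A hypothetical, tiny sentiment lexicon: word -> score.
LEXICON = {"happy": 1, "great": 1, "loved": 1, "difficult": -1, "rude": -1}


def rule_based_sentiment(text: str) -> str:
    """Sum the lexicon scores of the words in the text and label the result."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = sum(LEXICON.get(token, 0) for token in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    print(rule_based_sentiment("I am happy with the service."))
    print(rule_based_sentiment("Setup was difficult."))
    # The sarcastic movie review from earlier is (wrongly) labeled positive.
    print(rule_based_sentiment(
        "This movie was just great, I loved spending three hours of my Sunday afternoon."
    ))
```

Notice that the sarcastic movie review from the challenges section comes out “positive” here, because the function only sees “great” and “loved”.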

The second approach is based on machine learning. A sentiment analysis algorithm will be shown many examples of labeled sentiment data, from which it will learn a pattern that can be applied to new data the algorithm has never seen before.
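
To illustrate learning from labeled examples, the sketch below implements a small Naive Bayes classifier, which is one classic choice among many and not necessarily what any given product uses. The class name, the four training sentences, and their labels are all invented for this example; a usable model would need far more data.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesSentiment:
    """Multinomial Naive Bayes with add-one smoothing."""

    def fit(self, texts, labels):
        # Count documents per label and word occurrences per label.
        self.label_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.label_counts.items():
            # Start from log P(label), then add log P(word | label) per word.
            score = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                if word not in self.vocab:
                    continue  # Words never seen in training are ignored.
                count = self.word_counts[label][word] + 1
                score += math.log(count / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


if __name__ == "__main__":
    train_texts = [
        "the food was amazing and the staff were friendly",
        "highly recommended, great view",
        "the dentist was rude and unprofessional",
        "terrible wait, never again",
    ]
    train_labels = ["positive", "positive", "negative", "negative"]
    model = NaiveBayesSentiment().fit(train_texts, train_labels)
    print(model.predict("the staff were amazing"))
    print(model.predict("rude and unprofessional dentist"))
```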

Of course, there are tradeoffs to both approaches. The rule-based approach is relatively straightforward, but is unlikely to be able to handle the sorts of subtleties that a really good machine-learning system can parse.

Machine learning is more powerful, but it’ll only be as good as the training data it has been given; what’s more, if you’ve built some monstrous deep neural network, it might fail in mysterious ways or otherwise be hard to understand.

Supercharge Your Contact Center with Generative AI

Like used car salesmen or college history teachers, contact center managers need to understand the ways in which technology will change their business.

Machine learning is one such profoundly-impactful technology, and it can be used to automatically sort incoming messages by sentiment or priority and generally make your agents more effective.

Realizing this potential could be as difficult as hiring a team of expensive engineers and doing everything in-house, or as easy as getting in touch with us to see how we can integrate the Quiq conversational AI platform into your company.

If you want to get started quickly without spending a fortune, you won’t find a better option than Quiq.

How Large Language Models Have Evolved

In late 2022, large language models (LLMs) exploded into public awareness almost overnight. But like most overnight sensations, the history of large language models is long, fascinating, and informative.

In this piece, we’ll trace the deep evolution of language models and use this as a lens into how they can change your contact center today – and in the future.

Let’s get started!

A Brief History of Artificial Intelligence Development

The human fascination with building artificial beings capable of thought and action goes back a long way. Writing in roughly the 8th century BCE, Homer recounted tales of the Greek god Hephaestus outsourcing repetitive manual tasks to automated bellows and working alongside robot-like “attendants” that were “…golden, and in appearance like living young women.”

Some 500 years later, mathematicians in Alexandria would produce treatises on creating mechanical servants and various kinds of automata. Heron wrote a technical manual for producing a mechanical shrine and an automated theater whose figurines could stage a full tragic play.

Nor is it only ancient Greece that tells similar tales. Jewish legends speak of the Golem, a being made of clay and imbued with life and agency through language. The word “abracadabra”, in fact, comes from the Aramaic phrase “avra k’davra,” which translates to “I create as I speak.”

Through the ages, these old ideas have found new expression in stories such as “The Sorcerer’s Apprentice,” Mary Shelley’s “Frankenstein,” and Karel Čapek’s “R.U.R.,” a science fiction play that features the first recorded use of the word “robot.”

From Science Fiction to Science Fact

But these ideas remained purely fiction until the mid-20th century – a pivotal moment in the history of LLMs – when advances in the theory of computation and the development of primitive computers began to offer a path to building intelligent systems.

Arguably, this really began in earnest with the 1950 publication of Alan Turing’s “Computing Machinery and Intelligence” – in which he proposed the famous “Turing test” – and with the 1956 Dartmouth conference on AI, organized by luminaries John McCarthy and Marvin Minsky.

People began taking AI seriously. Over the next ~50 years in the evolution of large language models, there were numerous periods of hype and exuberance in which major advances were made, and long “AI winters” in which funding dried up and little was accomplished.

Three advances acted to really bring LLMs into their own: the development of neural networks, the deep learning revolution, and the rise of big data. These are important for understanding the history of large language models, so it’s to these that we now turn.

Neural Networks and the Deep Learning Revolution

Walter Pitts and Warren McCulloch laid the groundwork for the eventual evolution of language models in the early 1940s. Inspired by the burgeoning study of the human brain, they wondered if it would be possible to build an artificial neuron with some of the same basic properties as a biological one.

They were successful, though several other breakthroughs would be required before artificial neurons could be arranged into systems capable of doing useful work. One such breakthrough was the discovery of backpropagation in 1960, the basic algorithm still used to train deep learning systems.

It wasn’t until 1985, however, that David Rumelhart, Ronald Williams, and Geoff Hinton used backpropagation in neural networks; in 1989, this allowed Yann LeCun to train such a network to recognize handwritten digits.

Ultimately, it would be these deep neural networks (DNNs) that would emerge from the history of LLMs as the dominant paradigm, but for completeness, we should briefly mention some of the methods that they replaced.

One was known as “rule-based approaches,” and it was exactly what it sounded like. Early AI assistants would be programmed directly with grammatical rules, which were used to parse text and craft responses. This was just as limiting as you’d imagine, and the approach is rarely seen today except in the most straightforward of cases.

Then, there were statistical language models, which bear at least a passing resemblance to the behemoth LLMs that came later. These models try to predict the probability of word n given the n-1 words that came before. If you read our deep dive on LLMs, this will sound familiar, though these earlier models were nowhere near as powerful and flexible as what’s available today.
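
As a rough sketch of the idea, here is the simplest version of such a model: a “bigram” model that conditions on only the single previous word rather than all n-1 of them. The function names and the three-sentence corpus are our own inventions for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_model(corpus: list[str]) -> dict[str, Counter]:
    """Count how often each word follows each other word."""
    follow_counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for prev_word, next_word in zip(words, words[1:]):
            follow_counts[prev_word][next_word] += 1
    return follow_counts


def next_word_probability(model, prev_word: str, next_word: str) -> float:
    """Estimate P(next_word | prev_word) from raw counts, with no smoothing."""
    total = sum(model[prev_word].values())
    return model[prev_word][next_word] / total if total else 0.0


if __name__ == "__main__":
    corpus = [
        "the agent answered the call",
        "the agent closed the ticket",
        "the customer called the agent",
    ]
    model = train_bigram_model(corpus)
    print(next_word_probability(model, "the", "agent"))
    print(next_word_probability(model, "the", "ticket"))
```

In this toy corpus, “agent” follows “the” in three of the six places where “the” is followed by anything, so its estimated probability is one half.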

There were others that are beyond the scope of this treatment, but the key takeaway is that gargantuan neural networks ended up winning the day.

To close this section out, we’ll mention a handful of architectural improvements that came out of this period and would play a crucial role in the evolution of language models. We’ll focus on two in particular: transformers and word vector embeddings.

If you’ve investigated how LLMs work, you’ve probably heard both terms. Transformers are famously intricate, but the basic idea is that they creatively combined elements of predecessor architectures to ameliorate the problems those approaches faced. Specifically, they can use self-attention to selectively attend to key pieces of information in text, allowing them to render higher-fidelity translations and higher-quality text generations.
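
For the curious, the core arithmetic of self-attention fits in a short sketch. This is a heavily simplified version under stated assumptions: each word’s vector serves as its own query, key, and value, whereas real transformers apply learned projection matrices, use many attention heads, and stack many layers. The function names and the toy vectors are ours.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


def self_attention(vectors):
    """Scaled dot-product self-attention without learned projections."""
    dim = len(vectors[0])
    outputs = []
    for query in vectors:
        # Score every vector against the query (scaled dot product).
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        weights = softmax(scores)
        # The output is a weighted average of all the input vectors.
        mixed = [
            sum(weight * vec[i] for weight, vec in zip(weights, vectors))
            for i in range(dim)
        ]
        outputs.append(mixed)
    return outputs


if __name__ == "__main__":
    toy_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
    for row in self_attention(toy_vectors):
        print([round(value, 3) for value in row])
```

Each output row leans toward the inputs most similar to its query, which is the sense in which the model “attends” to some pieces of text more than others.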

Word vector embeddings are numerical representations of words that capture underlying semantic information. When interacting with ChatGPT, it can be easy to forget that computers don’t actually understand language; they understand numbers. A word vector embedding is an array of numbers generated with one of several different algorithms, with similar words having similar embeddings. LLMs can process these embeddings to learn enormous statistical patterns in unstructured linguistic data, then use those patterns to generate their own outputs.
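
One common way to measure whether two embeddings are “similar” is cosine similarity. The sketch below uses made-up three-dimensional vectors purely for illustration; real embeddings are learned from data and typically have hundreds or thousands of dimensions.

```python
import math

# Made-up vectors for illustration; these are not from any real model.
EMBEDDINGS = {
    "refund": [0.9, 0.1, 0.0],
    "reimbursement": [0.8, 0.2, 0.1],
    "banana": [0.0, 0.1, 0.9],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    refund = EMBEDDINGS["refund"]
    print(round(cosine_similarity(refund, EMBEDDINGS["reimbursement"]), 3))
    print(round(cosine_similarity(refund, EMBEDDINGS["banana"]), 3))
```

With these toy numbers, “refund” and “reimbursement” score close to 1, while “refund” and “banana” score close to 0.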

All of this research went into making the productive neural networks that are currently changing the nature of work in places like contact centers. The last missing piece was data, which we’ll cover in the next section.

The Big Data Era

Neural networks and deep-learning applications tend to be extremely data-hungry, and access to quality training data has always been a major bottleneck. In 2009 Stanford’s Fei-Fei Li sought to change this by releasing ImageNet, a database of over 14 million labeled images that could be used for free by researchers. The increase in available data, together with substantial improvements in computer hardware like graphics processing units (GPUs), meant that at long last the promise of deep learning could begin to be fulfilled.

And it was. In 2011, IBM’s Watson system beat two Jeopardy! champions in a real game, and Apple launched Siri. In 2012, a convolutional neural network called “AlexNet” won the ImageNet image-recognition challenge by a wide margin. Amazon’s Alexa followed in 2014, and from 2015 to 2017 DeepMind’s AlphaGo shocked the world by utterly dominating the best human Go players.

All of this set the stage for the rise of LLMs just a few short years later.

Where are we Now in the Evolution of Large Language Models?

Now that we’ve discussed this history, we’re well-placed to understand why LLMs and generative AI have ignited so much controversy. People have been mulling over the promise (and peril) of thinking machines for literally thousands of years, and it looks like they might finally be here.

But what, exactly, has people so excited? What is it that advanced AI tools are doing that has captured the popular imagination? In the following sections, we’ll talk about the astonishing (and astonishingly rapid) improvements seen in language models in recent memory.

Getting To Human-Level

One of the more surprising things about LLMs such as ChatGPT is just how good they are at so many different things. LLMs are trained by taking samples of text data and having the model try to predict which words come next given the words that came before.

Modern LLMs can do this incredibly well, but what is remarkable is just how far this gets you. People are using generative AI to help them write poems, business plans, and code, create recipes based on the ingredients in their fridges, and answer customer questions.

What is Emergence in Language Models?

Perhaps even more interesting, however, is the phenomenon of emergence in language models. When researchers tested LLMs on a wide variety of tasks meant to be especially challenging to these models – things like identifying a movie given a string of emojis or finding legal chess moves – they found that in about 5% of tasks, there is a sudden, sharp increase in ability on a given task once a model reaches a certain size.

At present, it’s not really clear how we should think about emergence. One hypothesis for emergence is that a big enough model is able to learn some general piece of knowledge not attainable by a smaller sibling, while another, more prosaic one is that it’s a relatively straightforward consequence of the model’s internal statistical machinery.

What’s more, it’s difficult to pin down the conditions required for emergence in language models. Though it generally appears to be a function of model size, there are cases in which the same abilities can be achieved with smaller models, or with models trained on very high-quality data, and emergence shows up at different scales for different models and tasks.

Whatever ends up being the case, it’s clear that this is a promising direction for future research. Much more work needs to be done to understand how precisely LLMs accomplish what they accomplish. This will not only bear on the question of emergence, it will also inform the ongoing efforts to make language models safer and less biased.

LLM Agents

One of the bigger frontiers in LLM research is the creation of agents. ChatGPT and similar platforms can generate API calls and functioning code, but humans still need to copy and paste the code to actually do anything with it.

Agents are meant to get around this limitation. Auto-GPT, for example, pairs an underlying LLM with a “bot” that takes high-level tasks, breaks them down into tasks an LLM can solve, and stitches together those solutions.
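
The plan, solve, and stitch pattern described above can be sketched as follows. To be clear, this is not Auto-GPT’s actual implementation: `run_agent`, `fake_plan`, and `fake_solve` are hypothetical names, and the two “fake” functions are stand-ins for real LLM calls so that the example runs without contacting any model.

```python
from typing import Callable


def run_agent(
    goal: str,
    plan: Callable[[str], list[str]],
    solve: Callable[[str], str],
) -> str:
    """Break a goal into subtasks, solve each one, and stitch the results."""
    subtasks = plan(goal)
    results = [solve(task) for task in subtasks]
    return "\n".join(
        f"{task}: {result}" for task, result in zip(subtasks, results)
    )


# Stand-ins for LLM calls; a real agent would prompt a model here.
def fake_plan(goal: str) -> list[str]:
    return [f"research {goal}", f"draft {goal}", f"review {goal}"]


def fake_solve(task: str) -> str:
    return f"result for '{task}'"


if __name__ == "__main__":
    print(run_agent("refund policy summary", fake_plan, fake_solve))
```

Passing the planner and solver in as functions keeps the control loop separate from whichever model ends up doing the work.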

This work is still in its infancy, but it continues to be very promising.

Multimodal Models

Another development worth mentioning is the rise of multi-modality. A model is “multi-modal” when it can process more than one kind of information, like images and text.

LLMs are staggeringly good at producing coherent language, and image models can do the same thing with images, but now a lot of time and effort is being spent on combining these two kinds of functionality.

The result has been models able to find specific sections of lengthy videos, generate images to accompany textual explanations, and create their own incredible videos from short, simple prompts.

It’s too early to tell what this will mean, but it’s already impacting branding, marketing, and related domains.

What’s Next For Large Language Models?

As with so many things, the meteoric rise of LLMs was presaged by decades of technical work and thousands of years of thought and speculation. In just a few short years, LLMs have become a strategic centerpiece for contact centers the world over.

If you want to get in on the action, you could start by learning more about how Quiq builds customer-facing AI assistants using LLMs. This will provide the context you need to make the wisest decision about deploying this remarkable technology.