Leveraging Agent Insights to Boost Efficiency and Performance

In the ever-evolving customer service landscape, the role of contact center agents cannot be overstated. As the frontline representatives of a company, their performance directly impacts the quality of customer experience, influencing customer loyalty and brand reputation.

However, the traditional approach to managing agent performance – relying on periodic reviews and supervisor observations – has given way to a more sophisticated, data-driven strategy. For this reason, managing agent performance with a method that leverages the rich data generated by agent interactions to enhance service delivery, agent satisfaction, and operational efficiency is becoming more important all the time.

This article delves into this approach. We’ll begin by examining its benefits from three critical perspectives – the customer, the agent, and the contact center manager – before turning to a more granular breakdown of how you can leverage it in your contact center.

Why is it Important to Manage Agent Performance with Insights?

First, let’s start by justifying this project. While it’s true that very few people today would doubt the need to track some data related to what agents are doing all day, it’s still worth saying a few words about why it really is a crucial part of running a contact center.

To do this, we’ll focus on how three groups are impacted when agent performance is managed through insights: customers, the agents themselves, and contact center managers.

It’s Good for the Customers

The primary beneficiary of improved agent performance is the customer. Contact centers can tailor their service strategies by analyzing agent metrics to better meet customer needs and preferences. This data-driven approach allows for identifying common issues, customer pain points, and trends in customer behavior, enabling more personalized and effective interactions.

As agents become more adept at addressing customer needs swiftly and accurately, customer satisfaction levels rise. This enhances the individual customer’s experience and boosts the overall perception of the brand, fostering loyalty and encouraging positive word-of-mouth.

It’s Good for the Agents

Agents stand to gain immensely from a management strategy focused on data-driven insights. Firstly, performance feedback based on concrete metrics rather than subjective assessments leads to a fairer, more transparent work environment.

Agents receive specific, actionable feedback that helps them understand their strengths and which areas need improvement. This can be incredibly motivating and can drive them to begin proactively bolstering their skills.

Furthermore, insights from performance data can inform targeted training and development programs. For instance, if data indicates that an agent excels in handling certain inquiries but struggles with others, their manager can provide personalized training to bridge this gap. This helps agents grow professionally and increases their job satisfaction as they become more competent and confident in their roles.

It’s Good for Contact Center Managers

For those in charge of overseeing contact centers, managing agents through insights into their performance offers a powerful tool for cultivating operational excellence. It enables a more strategic approach to workforce management, where decisions are informed by data rather than gut feeling.

Managers can identify high performers and understand the behaviors that lead to success, allowing them to replicate these practices across the team. Intriguingly, this same mechanism is also at play in the efficiency boost seen by contact centers that adopt generative AI. When such centers train a model on the interactions of their best agents, the knowledge in those agents’ heads can be incorporated into the algorithm and utilized by much more junior agents.

The insights-driven approach also aids in resource allocation. By understanding the strengths and weaknesses of their team, managers can assign agents to the tasks they are most suited for, optimizing the center’s overall performance.

Additionally, insights into agent performance can highlight systemic issues or training gaps, providing managers with the opportunity to make structural changes that enhance efficiency and effectiveness.

Moreover, using agent insights for performance management supports a culture of continuous improvement. It encourages a feedback loop where agents are continually assessed, supported, and developed, driving the entire team towards higher performance standards. This improves the customer experience and contributes to a positive working environment where agents feel valued and empowered.

In summary, managing performance by tracking agent metrics is a holistic strategy that enhances the customer experience, supports agent development, and empowers managers to make informed decisions.

It fosters a culture of transparency, accountability, and continuous improvement, leading to operational excellence and elevated service standards in the contact center.

How to Use Agent Insights to Manage Performance

Now that we know what all the fuss is about, let’s turn to addressing our main question: how to use agent insights to correct, fine-tune, and optimize agent performance. This discussion will center specifically around Quiq’s Agent Insights tool, which is a best-in-class analytics offering that makes it easy to figure out what your agents are doing, where they could improve, and how that ultimately impacts the customers you serve.

Managing Agent Availability

To begin with, you need a way of understanding when your agents are free to handle an issue and when they’re preoccupied with other work. The three basic statuses an agent can have are “available,” “current conversations” (i.e. only working on the current batch of conversations), and “unavailable.” All three of these can be seen through Agent Insights, which allows you to select from over 50 different metrics, customizing and saving different views as you see fit.

The underlying metrics that go into understanding this dimension of agent performance are, of course, time-based. In essence, you want to evaluate the ratios between four quantities: the time the agent is available, the time the agent is online, the time the agent spends in a conversation, and the time an agent is unavailable.

As you’re no doubt aware, you don’t necessarily want to maximize the amount of time an agent spends in conversations, as this can quickly lead to burnout. Rather, you want to use these insights into agent performance to strike the best, most productive balance possible.

Managing Agent Workload

A related phenomenon you want to understand is the kind of workload your agents are operating under. The five metrics that underpin this are:

  1. Availability
  2. Number of completions per hour your agents are managing
  3. Overall utilization (i.e. the percentage of an agent’s available conversation limit they have filled in a given period).
  4. Average workload
  5. The amount of time agents spend waiting for a customer response.

All of this can be seen in Agent Insights. This view allows you to do many things to hone in on specific parts of your operation. You can sort by the amount of time your agents spend waiting for a reply from a customer, for example, or segment agents by e.g. role. If you’re seeing high waiting and low utilization, that means you are overstaffed and should probably have fewer agents.

If you’re seeing high waiting and high utilization, by contrast, you should make sure your agents know inactive conversations should be marked appropriately.

As with the previous section, you’re not necessarily looking to minimize availability or maximize completions per hour. You want to make sure that agents are working at a comfortable pace, and that they have time between issues to reflect on how they’re doing and think about whether they want to change anything in their approach.

But with proper data-driven insights, you can do much more to ensure your agents have the space they need to function optimally.

Managing Agent Efficiency

Speaking of functioning optimally, the last thing we want to examine is agent efficiency. By using Agent Insights, we can answer questions such as how well new agents are adjusting to their roles, how well your teams are working together, and how you can boost each agent’s output (without working them too hard).

The field of contact center analytics is large, but in the context of agent efficiency, you’ll want to examine metrics like completion rate, completions per hour, reopen rate, missed response rate, missed invitation rate, and any feedback customers have left after interacting with your agents.

This will give you an unprecedented peek into the moment-by-moment actions agents are taking, and furnish you with the hard facts you need to help them streamline their procedures. Imagine, for example, you’re seeing a lot of keyboard usage. This means the agent is probably not operating as efficiently as they could be, and you might be able to boost their numbers by training them to utilize Quiq’s Snippets tool.

Or, perhaps you’re seeing a remarkably high rate of clipboard usage. In that case, you’d want to look over the clipboard messages your agents are using and consider turning them into snippets, where they’d be available to everyone.

The Modern Approach to Managing Agents

Embracing agent insights for performance management marks a transformative step towards achieving operational excellence in contact centers. This data-driven approach not only elevates the customer service experience but also fosters a culture of continuous improvement and empowerment among agents.

By leveraging tools like Quiq’s Agent Insights, managers can unlock a comprehensive understanding of agent availability, workload, and efficiency, enabling informed decisions that benefit both the customer and the service team.

If you’re intrigued by the possibilities, contact us to schedule a demo today!

6 Questions to Ask Generative AI Vendors You’re Evaluating

With all the power exhibited by today’s large language models, many businesses are scrambling to leverage them in their offerings. Enterprises in a wide variety of domains – from contact centers to teams focused on writing custom software – are adding AI-backed functionality to make their users more productive and the customer experience better.

But, in the rush to avoid being the only organization not using the hot new technology, it’s easy to overlook certain basic sanity checks you must perform when choosing a vendor. Today, we’re going to fix that. This piece will focus on several of the broad categories of questions you should be asking potential generative AI providers as you evaluate all your options.

This knowledge will give you the best chance of finding a vendor that meets your requirements, will help you with integration, and will ultimately allow you to better serve your customers.

These are the Questions you Should ask Your Generative AI Vendor

Training large language models is difficult. Besides the fact that it requires an incredible amount of computing power, there are also hundreds of tiny little engineering optimizations that need to be made along the way. This is part of the reason why all the different language model vendors are different from one another.

Some have a longer context window, others write better code but struggle with subtle language-based tasks, etc. All of this needs to be factored into your final decision because it will impact how well your vendor performs for your particular use case.

In the sections that follow, we’ll walk you through some of the questions you should raise with each vendor. Most of these questions are designed to help you get a handle on how easy a given offering will be to use in your situation, and what integrating it will look like.

1. What Sort of Customer Service Do You Offer?

We’re contact center and customer support people, so we understand better than anyone how important it is to make sure users know what our product is, what it can do, and how we can help them if they run into issues.

As you speak with different generative AI vendors, you’ll want to probe them about their own customer support, and what steps they’ll take to help you utilize their platform effectively.

For this, just start with the basics by figuring out the availability of their support teams – what hours they operate in, whether they can accommodate teams working in multiple time zones, and whether there is an option for 24/7 support if a critical problem arises.

Then, you can begin drilling into specifics. One thing you’ll want to know about is the channels their support team operates through. They might set up a private Slack channel with you so you can access their engineers directly, for example, or they might prefer to work through email, a ticketing system, or a chat interface. When you’re discussing this topic, try to find out whether you’ll have a dedicated account manager to work with.

You’ll also want some context on the issue resolution process. If you have a lingering problem that’s not being resolved, how do you go about escalating it, and what’s the team’s response time for issues in general?

Finally, it’s important that the vendors have some kind of feedback mechanism. Just as you no doubt have a way for clients to let you know if they’re dissatisfied with an agent or a process, the vendor you choose should offer a way for you to let them know how they’re doing so they can improve. This not only tells you they care about getting better, it also indicates that they have a way of figuring out how to do so.

2. Does Your Team Offer Help with Setting up the Platform?

A related subject is the extent to which a given generative AI vendor will help you set up their platform in your environment. A good way to begin is by asking what kinds of training materials and resources they offer.

Many vendors are promoting their platforms by putting out a ton of educational content, all of which your internal engineers can use to get up to speed on what those platforms can do and how they function.

This is the kind of thing that is easy to overlook, but you should pay careful attention to it. Choosing a generative AI vendor that has excellent documentation, plenty of worked-out examples, etc. could end up saving you a tremendous amount of time, energy, and money down the line.

Then, you can get clarity on whether the vendor has a dedicated team devoted to helping customers like you get set up. These roles are usually found under titles like “solutions architect”, so be sure to ask whether you’ll be assigned such a person and the extent to which you can expect their help. Some platforms will go to the moon and back to make sure you have everything you need, while others will simply advise you if you get stuck somewhere.

Which path makes the most sense depends on your circumstances. If you have a lot of engineers you may not need more than a little advice here and there, but if you don’t, you’ll likely need more handholding (but will probably also have to pay extra for that). Keep all this in mind as you’re deciding.

3. What Kinds of Integrations Do You Support?

Now, it’s time to get into more technical details about the integrations they support. When you buy a subscription to a generative AI vendor, you are effectively buying a set of capabilities. But those capabilities are much more valuable if you know they’ll plug in seamlessly with your existing software, and they’re even more valuable if you know they’ll plug into software you plan on building later on. You’ve probably been working on a roadmap, and now’s the time to get it out.

It’s worth checking to see whether the vendor can support many different kinds of language models. This involves a nuance in what the word “vendor” means, so let’s unpack it a little bit. Some generative AI vendors are offering you a model, so they’re probably not going to support another company’s model.

OpenAI and Anthropic are examples of model vendors, so if you work with them you’re buying their model and will not be able to easily incorporate someone else’s model.

Other vendors, by contrast, are offering you a service, and in many cases that service could theoretically by powered by many different models.

Quiq’s Conversational CX Platform, for example, supports OpenAI’s GPT models, and we have plans to expand the scope of our integrations to encompass even more models in the future.

Another thing you’re going to want to check on is whether the vendor makes it easy to integrate vector databases into your workflow. Vectors are data structures that are remarkably good at capturing subtle relationships in large datasets; they’re becoming an ever-more-important part of machine learning, as evinced by the fact that there are now a multitude of different vector databases on offer.

The chances are pretty good that you’ll eventually want to leverage a vector database to store or search over customer interactions, and you’ll want a vendor that makes this easy.

Finally, see if the vendor has any case studies you can look at. Quiq has published a case study on how our language services were utilized by LOOP, a car insurance company, to make a far superior customer-service chatbot. The result was that customers were able to get much more personalization in their answers and were able to resolve their problems fully half of the time, without help. This led to a corresponding 55% reduction in tickets, and a customer satisfaction rating of 75% (!) when interacting with the Quiq-powered AI assistant.

See if the vendors you’re looking at have anything similar you can examine. This is especially helpful if the case studies are focused on companies that are similar to yours.

4. How Does Prompt Engineering and Fine-Tuning Work for Your Model?

For many tasks, large language models work perfectly fine on their own, without much special effort. But there are two methods you should know about to really get the most out of them: prompt engineering and fine-tuning.

As you know, prompts are the basic method for interacting with language models. You’ll give a model a prompt like “What is generative AI?”, and it’ll generate a response. Well, it turns out that models are really sensitive to the wording and structure of prompts, and prompt engineers are those who explore the best way to craft prompts to get useful output from a model.

It’s worth asking potential vendors about this because they handle prompts differently. Quiq’s AI Studio encourages atomic prompting, where a single prompt has a clear purpose and intended completion, and we support running prompts in parallel and sequentially. You can’t assume everyone will do this, however, so be sure to check.

Then, there’s fine-tuning, which refers to training a model on a bespoke dataset such that its output is heavily geared towards the patterns found in that dataset. It’s becoming more common to fine-tune a foundational model for specific use cases, especially when those use cases involve a lot of specialized vocabulary such as is found in medicine or law.

Setting up a fine-tuning pipeline can be cumbersome or relatively straightforward depending on the vendor, so see what each vendor offers in this regard. It’s also worth asking whether they offer technical support for this aspect of working with the models.

5. Can Your Models Support Reasoning and Acting?

One of the current frontiers in generative AI is building more robust, “agentic” models that can execute strings of tasks on their way to completing a broader goal. This goes by a few different names, but one that has been popping up in the research literature is “ReAct”, which stands for “reasoning and acting”.

You can get ReAct functionality out of existing language models through chain-of-thought prompting, or by using systems like AutoGPT; to help you concretize this a bit, let’s walk through how ReAct works in Quiq.

With Quiq’s AI Studio, a conversational data model is used to classify and store both custom and standard data elements, and these data elements can be set within and across “user turns”. A single user turn is the time between when a user offers an input to the time at which the AI responds and waits for the next user input.

Our AI can set and reason about the state of the data model, applying rules to take the next best action. Common actions include things like fetching data, running another prompt, delivering a message, or offering to escalate to a human.

Though these efforts are still early, this is absolutely the direction the field is taking. If you want to be prepared for what’s coming without the need to overhaul your generative AI stack later on, ask about how different vendors support ReAct.

6. What’s your Pricing Structure Like?

Finally, you’ll need to talk to vendors about how their prices work, including any available details on licensing types, subscriptions, and costs associated with the integration, use, and maintenance of their solution.

To take one example, Quiq’s licensing is based on usage. We establish a usage pool wherein our customers pre-pay Quiq for a 12-month contract; then, as the customer uses our software money is deducted from that pool. We also have an annual AI Assistant Maintenance fee along with a one-time implementation fee.

Vendors can vary considerably in how their prices work, so if you don’t want to overpay then make sure you have a clear understanding of their approach.

Picking the Right Generative AI Vendor

Language models and related technologies are taking the world by storm, transforming many industries, including customer service and contact center management.

Making use of these systems means choosing a good vendor, and that requires you to understand each vendor’s model, how those models integrate with other tools, and what you’re ultimately going to end up paying.

If you want to see how Quiq stacks up and what we can do for you, schedule a demo with us today!

Building Better Customer Relationships with Text Messaging

Customer engagement is constantly evolving and the trend towards more customer-centric experiences hasn’t slowed. Businesses are increasingly having to provide faster, easier, and more friendly ways of initiating and responding to customer’s inquiries.

Businesses that adapt to this continually changing environment will ensure they deliver superior service along with desirable products, thus boosting engagement rates.

This is where customer engagement strategies based on text messaging enter the picture. This mode of communication has overtaken traditional methods, like phone and email, as consumers prefer the ease, convenience, and hassle-free nature of text messaging.

Texting isn’t just for friends and family anymore and consumers are choosing this channel more often as it fits their on-the-go lifestyle.

The move to text messaging is a part of this new era of building customer relationships, and both businesses and consumers can benefit.

The old customer engagement marketing strategies are fading

As recently as two decades ago, the world of business and customer service was a completely different place. Company agents and representatives used forms of customer engagement like trade shows, promotional emails, letters, and phone calls to promote their products and services.

While these methods are still used in a wide range of industries, many companies today are turning to new ways of maintaining customer loyalty.

According to the Pew Research Center, about 96% of Americans own a cell phone of some kind. Text messaging is a highly popular form of communication in people’s everyday lives. As such, it only seems natural that companies would use texting as a service, sales, and marketing tool. Their results have been astounding, and that’s what we’ll explore in the next section.

The advantages of digital customer engagement strategies

While sending text messages to customers may be a new frontier for many companies, businesses are finding the personal, casual nature of this medium is part of what makes it so effective.

Some of the benefits that come with text-based customer service include:

Hassle-free customer service access

Consumers love instant messaging because it’s easy and allows them to engage, ask questions, and get information without having to make a phone call or meet face-to-face.

One of the hallmarks of our increasingly digital world is how hard businesses work to make things easy – think of 1-click shopping on Amazon (you don’t have to click two buttons), how smartphones enable contactless payment (you don’t have to pull your card out), the way Alexa responds to voice commands (you don’t have to click anything), and the way Netflix automatically plays the next episode of a show you’re binging (you don’t even have to move).

These expectations are becoming more ingrained in the minds of consumers, especially young ones, and they are unlikely to be enthusiastic about needing to call an agent or go into the store to resolve any problems they have.

Timely responses and service

Few things turn a customer off faster than sending an email or making a phone call, then having to wait days for a response. With text message customer service, you can stay connected 24/7 and provide timely responses and solutions. Artificial intelligence is one customer engagement technology that will make this even easier in the years ahead (more on this below).

The personal touch

Customers are more likely to stick around if they believe you care about their personal needs. Texting will allow you to take a more individualized approach, communicating with customers in the same way they might communicate with friends. This stands in contrast to the stiffer, more formal sorts of interactions that tend to happen over the phone or in person.

A dynamic variety of solutions

Text messaging provides unique opportunities for marketing, sales, and customer support. For example, you might use texting to help troubleshoot a product, promote new sales, send coupons, and more.

None of these things are impossible to do with older approaches to customer service but think of how pain-free it would be for a busy single mom to ask a question, check the reply when she stops to pick up her daughter from school, ask another question, check the new reply when she gets home, etc. This is vastly easier than finding a way to carve three hours out of the day to go into the store to speak to an agent directly.

To make these ideas easier to digest, here is a table summarizing the ground we’ve just covered:

The Old Way The New Way
Method of Delivery Phone calls, pamphlets, trade shows, face-to-face conversations Text messaging
Difficulty Requires spending time on the phone, driving to a physical location, or making an appointment. Only requires a phone and the ability to text on it.
Timeliness Can take hours or days to get a reply. Replies should be almost instantaneous.
Personalization Good agents might be able to personalize the interaction, but it’s more difficult.  Personalizing messages and meeting a customer on their own terms because natural and easy.
Variety Does offer ways of solving problems or upselling customers, but only at the cost of more effort from the agent.  Sales and customer support can be embedded seamlessly in existing conversations, and those conversations fit better into a busy modern lifestyle.

​​Why this all matters

These benefits matter because 64% of Americans would rather receive a text than a phone call. It’s clear what the consumers want, and it’s the business’s job to deliver.

Because text messaging can help you engage with customers on a more personal level, it can increase customer loyalty, lead to more conversions, and in general boost engagement rates.

What’s more, text-based customer relationships will likely be transformed by the advent of generative artificial intelligence, especially large language models (LLMs). This technology will make it so easy to offer 24/7 availability that everyone will take it for granted, to say nothing of how it can personalize replies based on customer-specific data, translate between languages, answer questions in different levels of detail, etc.

Texting already provides agents with the ability to manage multiple customers at a time, but they’ll be able to accommodate far higher volumes when they’re working alongside machines, boosting efficiency and saving huge amounts of time.

Some day soon, businesses will look back on the days when human beings had to do all of this with a sense of gratitude for how technology has streamlined the process of delivering a top-shelf customer experience.

And it is exactly this customer satisfaction that’ll allow those businesses to increase profits and make room for business growth over time.

Request a demo from Quiq today

In the future, as in the past, customer service will change with the rise of new technologies and strategies. If you don’t want to be left behind, contact Quiq today for a demo.

We not only make it easy to integrate text messaging into your broader approach to building customer relationships, we also have bleeding-edge language models that will allow you to automate substantial parts of your workflow.

Request A Demo

What is Automated Customer Service? – Ultimate Guide

From graph databases to automated machine learning pipelines and beyond, a lot of attention gets paid to new technologies. But the truth is, none of it matters if users aren’t able to handle the more mundane tasks of managing permissions, resolving mysterious errors, and getting the tools installed and working on their native systems.

This is where customer service comes in. Though they don’t often get the credit they deserve, customer service agents are the ones who are responsible for showing up every day to help countless others actually use the latest and greatest technology.

Like every job since the beginning of jobs, there are large components of customer service that have been automated, are currently being automated, or will be automated at some point soon.

That’s our focus for today. We want to explore customer service as a discipline, and then talk about some of how generative AI can automate substantial parts of the standard workflow.

What is Customer Service?

To begin with, we’ll try to clarify what customer service is and why it matters. This will inform our later discussion of automated customer service, and help us think through the value that can be added through automation.

Customer service is more or less what it sounds like: serving your customers – your users, or clients – as they go about the process of utilizing your product. A software company might employ customer service agents to help onboard new users and troubleshoot failures in their product, while a services’ company might use them for canceling appointments and rescheduling.

Over the prior few decades, customer service has evolved alongside many other industries. As mobile phones have become firmly ensconced in everyone’s life, for example, it has become more common for businesses to supplement the traditional avenues of phone calls and emails by adding text messaging and chatbot customer support to their customer service toolkit. This is part of what is known as an omni-channel strategy, in which more effort is made to meet customers where they’re at rather than expecting them to conform to the communication pathways a business already has in place.

Naturally, many of these kinds of interactions can be automated, especially with the rise of tools like large language models. We’ll have more to say about that shortly.

Why is Customer Service Important?

It may be tempting for those writing the code to think that customer service is a “nice to have”, but that’s not the case at all. However good a product’s documentation is, there will simply always be weird behaviors and edge cases in which a skilled customer service agent (perhaps helped along with AI) needs to step in and aid a user in getting everything running properly.

But there are other advantages as well. Besides simply getting a product to function, customer service agents contribute to a company’s overall brand, and the general emotional response users have to the company and its offerings.

High-quality customer service agents can do a lot to contribute to the impression that a company is considerate, and genuinely cares about its users.

What Are Examples of Good Customer Service?

There are many ways in which customer service agents can do this. For example, it helps a lot when customer service agents try to transmit a kind of warmth over the line.

Because so many people spend their days interacting with others through screens, it can be easy to forget what that’s like, as tone of voice and facial expression are hard to digitally convey. But when customer service agents greet a person enthusiastically and go beyond “How may I help you” by exchanging some opening pleasantries, they feel more valued and more at ease. This matters a lot when they’ve been banging their head against a software problem for half a day.

Customer service agents have also adapted to the digital age by utilizing emojis, exclamation points, and various other kinds of internet-speak. We live in a more casual age, and under most circumstances, it’s appropriate to drop the stiffness and formalities when helping someone with a product issue.

That said, you should also remember that you’re talking to customers, and you should be polite. Use words like “please” when asking for something, and don’t forget to add a “thank you.” It can be difficult to remember this when you’re dealing with a customer who is simply being rude, especially when you’ve had several such customers in a row. Nevertheless, it’s part of the job.

Finally, always remember that a customer gets in touch with you when they’re having a problem, and above all else, your job is to get them what they need. From the perspective of contact center managers, this means you need periodic testing or retraining to make sure your agents know the product thoroughly.

It’s reasonable to expect that agents will sometimes need to look up the answer to a question, but if they’re doing that constantly it will not only increase the time it takes to resolve an issue, it will also contribute to customer frustration and a general sense that you don’t have things well in hand.

Automation in Customer Service

Now that we’ve covered what customer service is, why it matters, and how to do it well, we have the context we need to turn to the topic of automated customer service.

For all intents and purposes, “automation” simply refers to outsourcing all or some of a task to a machine. In industries like manufacturing and agriculture, automation has been steadily increasing for hundreds of years.

Until fairly recently, however, the technology didn’t yet exist to automate substantial portions of customer service worth. With the rise of machine learning, and especially large language models like ChatGPT, that’s begun to change dramatically.

Let’s dive into this in more detail.

Examples of Automated Customer Service

There are many ways in which customer service is being automated. Here are a few examples:

  • Automated questions answering – Many questions are fairly prosaic (“How do I reset my password”), and can effectively be outsourced to a properly finetuned large language model. When such a model is trained on a company’s documentation, it’s often powerful enough to handle these kinds of low-level requests.
  • Summarization – There have long been models that could do an adequate job of summarization, but large language models have kicked this functionality into high gear. With an endless stream of new emails, Slack messages, etc. constantly being generated, having an agent that can summarize their contents and keep agents in the loop will do a lot to boost their productivity.
  • Classifying incoming messages – Classification is another thing that models have been able to do for a while, and it’s also something that helps a lot. Having an agent manually sort through different messages to figure out how to prioritize them and where they should go is no longer a good use of time, as algorithms are now good enough to do a major chunk of this kind of work.
  • Translation – One of the first useful things anyone attempted to do with machine learning was translating between different natural languages (i.e. from Russian into English). Once squarely in the purview of human beings, this is now a task that machines can do almost as well, at least for customer service work.

Should We Automate Customer Service?

All this having been said, you may still have questions about the wisdom of automating customer service work. Sure, no one wants to spend hours every day looking up words in Mandarin to answer a question or prioritizing tickets by hand, but aren’t we in danger of losing something important as customer service agents? Might we not automate ourselves out of a job?

No one can predict the future, of course, but the early evidence is quite to the contrary. Economists have conducted studies of how contact centers have changed with the introduction of generative AI, and their findings are very encouraging.

Because these models are (usually) finetuned on conversations from more experienced agents, they’re able to capture a lot of how those agents handle issues. Typical response patterns, politeness, etc. become “baked into” the models. Junior agents using these models are able to climb the learning curve more quickly and, feeling less strained in their new roles, are less likely to quit. This, in turn, puts less of a burden on managers and makes the organization overall more stable. Everyone ends up happier and more productive.

So far, it’s looking like AI-based automation in contact centers will be like automation almost everywhere else: machines will gradually remove the need for human attention in tedious or otherwise low-value tasks, freeing them up to focus on places where they have more of an advantage.

If agents don’t have to sort tickets anymore or resolve routine issues, they can spend more time working on the really thorny problems, and do so with more care.

Moving Quiq-ly into the Future!

Where the rubber of technology meets the road of real-world use cases, customer service agents are extremely important. They not only make sure customers can use a company’s tools, but they also contribute to the company brand in their tone, mannerisms, and helpfulness.

Like most other professions, customer service agents are being impacted by automation. So far, this impact has been overwhelmingly positive and is likely to prove a competitive advantage in the decades ahead.

If you’re intrigued by this possibility, Quiq has created a suite of industry-leading conversational AI tools, both for customer-facing applications and agent-facing applications. Check them out or schedule a demo with us to see what all the fuss is about.

Request A Demo

Top 5 Benefits of AI for Hospitality

As an industry, hospitality is aimed squarely at meeting customer needs. Whether it’s a businesswoman staying in 5-star resorts or a mother of three getting a quiet weekend to herself, the job of the hospitality professionals they interact with is to anticipate what they want and make sure they get it.

As technologies like artificial intelligence become more powerful and pervasive, customer expectations will change. When that businesswoman books a hotel room, she’ll expect there to be a capable virtual assistant talking to her about a vacation spot; when that mother navigates the process of buying a ticket, she’ll expect to be interacting with a very high-quality chatbot, perhaps one that’s indistinguishable from an actual human being.

All of this means that the hospitality industry needs to be thinking about how it will be impacted by AI. It needs to consider what the benefits of AI for hospitality are, what limitations are faced by AI, and how it can be utilized effectively. That’s what we’re here to do today, so let’s get started.

Why is AI Important for Hospitality?

AI is important in hospitality for the same reason it’s important everywhere else: it’s poised to become a transformative technology, and just about every industry – especially those that involve a lot of time interacting through text – could be up-ended by it.

The businesses that emerge the strongest from this ongoing revolution will be those that successfully anticipate how large language models and similar tools change workflows, company setups, cost and pricing structures, etc.

With that in mind, let’s work through some of the ways in which AI is (or will) be used in hospitality.

How is AI Used in Hospitality?

There are many ways in which AI is used in hospitality, and in the sections that follow we’ll walk through a number of the most important ones.

Chatbots and Customer Service

Perhaps the most obvious place to begin is with chatbots and customer service more broadly. Customer-facing chatbots were an early application of natural language processing, and have gotten much better in the decades since. With ChatGPT and similar LLMs, they’re currently in the process of taking another major leap forward.

Now that we have models that can be fine-tuned to answer questions, summarize texts, and carry out open-ended interactions with human users, we expect to see them becoming more and more common in hospitality. Someday soon, it may be the case that most of the steps involved in booking a room or changing a flight happens entirely without human assistance of any kind.

This is especially compelling because we’ve gotten so good at making chatbots that are very deferential and polite (though as we make clear in the final section on “limitations”, this is not always the case.)

Virtual Assistants

AI virtual assistants are a generalization of the idea behind chatbots. Whereas chatbots can be trained to offload many parts of hospitality work, powerful virtual assistants will take this dynamic to the next level. Once we have better agents – systems able to take strings of actions in pursuit of a goal – many more parts of hospitality work will be outsourced to the machines.

What might this look like?

Well, we’ve already seen some tools that can do relatively simple tasks like “book a flight to Indonesia”, but they’re still not all that flexible. Imagine an AI virtual assistant able to handle all the subtleties and details involved in a task like “book a flight for ten executives to Indonesia, and book lodging near the conference center and near the water, too, then make reservations for a meal each night of the week, taking into account the following dietary restrictions.”

Work into building generative agents like this is still in its infancy, but it is nevertheless an active area of research. It’s hard to predict when we’ll have agents who can be trusted to do advanced work with minimal oversight, but once we do, it’ll really begin to change how the hospitality industry runs.

Sentiment Analysis

Sentiment analysis refers to an automated, algorithmic approach to classifying the overall vibe of a piece of text. “The food was great” is obviously positive sentiment, “the food was awful” is obviously negative sentiment, and then there are many subtler cases involving e.g. sarcasm.

The hospitality industry desperately needs tools able to perform sentiment analysis at scale. It helps them understand what clients like and dislike about particular services or locations, and can even help in predicting future demand. If, for example, there’s a bunch of positive sentiment around a concert being given in Indonesia, that indicates that there will probably be a spike in bookings there.

Boosting Revenues for Hospitality

People have long been interested in using AI to make money, whether that be from trading strategies generated by ChatGPT or from using AI to create ultra-targeted marketing campaigns.

All of this presents an enormous opportunity for the hospitality industry. Through a combination of predictive modeling, customer segmentation, sentiment analysis, and related techniques, it’ll become easier to forecast changes in demand, create much more responsive pricing models, and intelligently track inventory.

What this will ultimately mean is better revenues for hotels, event centers, and similar venues. You’ll be able to cross-sell or upsell based on a given client’s unique purchase history and interests, you’ll have fewer rooms go unoccupied, and you’ll be less likely to have clients who are dissatisfied by the fact tha you ran out of something.

Sustainability and Waste Management

An underappreciated way in which AI will benefit hospitality is by making sustainability easier. There are a few ways this could manifest.

One is by increasing energy efficiency. Most of you will already be familiar with currently-existing smart room technology, like thermostats that learn when you’re leaving and turn themselves up, thus lowering your power bill.

But there’s room for this to become much more far-ranging and powerful. If AI is put in charge of managing the HVAC system for an entire building, for example, it could lead to savings on the order of millions of dollars, while simultaneously making customers more comfortable during their stay.

And the same holds true for waste management. AI systems smart enough to discover when a trash can is full means that your cleaning staff won’t have to spend nearly as much time patrolling. They’ll be able to wait until they get a notification to handle the problem, gaining back many hours in their day that can be put towards higher-value work.

What are the Limitations of AI in Hospitality?

None of this is to suggest that there won’t also be drawbacks to using AI in hospitality. To prepare you for these challenges, we’ll spend the next few sections discussing how AI can fail, allowing you to be proactive in mitigating these downsides.

Impersonality in Customer Service

By properly fine-tuning a large language model, it’s possible to get text output that is remarkably polite and conversational. Still, throughout repeated or sustained interactions, the model can come to feel vaguely sterile.

Though it might in principle be hard to tell when you’re interacting with AI v.s. a human, the fact remains that models don’t actually have any empathy. They may say “I’m sorry that you had to deal with that…”, but they won’t truly know what frustration is like, and over time, a human is likely to begin picking up on that.

We can’t say for certain when models will be capable of expressing sympathy in a fully convincing way, but for the time being, you should probably incorporate systems that can flag conversations that are going off the rails so that a human customer service professional can intervene.

Toxic Output, Bias, and Abuse

As in the previous section, a lot of work has gone into finetuning models so that they don’t produce toxic, biased, or abusive language. Still, not all the kinks have been ironed out, and if a question is phrased in just the right way, it’s often possible to get past these safeguards. That means your models might unpredictably become insulting or snarky, which is a problem for a hospitality company.

As we’ve argued elsewhere, careful monitoring is one of the prices that have to be paid when managing an AI assistant. Since this technology is so new, we have at best a very vague idea of what kinds of prompts lead to what kinds of responses. So, you’ll simply have to diligently keep your eyes peeled for examples of model responses that are inappropriate, having a human take over if and when things are going poorly.

(Or, you can work with Quiq – our guardrails ensure none of this is a problem for enterprise hospitality businesses).

AI in Hospitality

New technologies have always changed the way industries operate, and that’s true for hospitality as well. From virtual assistants to chatbots to ultra-efficient waste management, AI offers many benefits (and many challenges) for hospitality.

If you want to explore using these tools in your hospitality enterprise but don’t know the first thing about hiring AI engineers, check out the Quiq conversational CX platform. We’ve built a proprietary large language model offering that makes it easy to incorporate chatbots and other technologies, without having to worry about what’s going on under the hood.

Schedule a demo with us today to find out how you can catch the AI wave!

Request A Demo

4 Benefits of Using AI Assistants in the Retail Industry

Artificial intelligence (AI) has been making remarkable strides in recent months. Owing to the release of ChatGPT in November of 2022, a huge amount of attention has been on large language models, but the truth is, there have been similar breakthroughs in computer vision, reinforcement learning, robotics, and many other fields.

In this piece, we’re going to focus on how these advances might contribute specifically to the retail sector.

We’ll start with a broader overview of AI, then turn to how AI-based tools are making it easier to make targeted advertisements, personalized offers, hiring decisions, and other parts of retail substantially easier.

What are AI assistants in Retail?

Artificial intelligence is famously difficult to define precisely, but for our purposes, you can think of it as any attempt to get intelligent behavior from a machine. This could involve something relatively straightforward, like building a linear regression model to predict future demand for a product line, or something far more complex, like creating neural networks able to quickly spit out multiple ideas for a logo design based on a verbal description.

AI assistants are a little different and specifically require building agents capable of carrying out sequences of actions in the service of a goal. The field of AI is some 70 years old now and has been making remarkable strides over the past decade, but building robust agents remains a major challenge.

It’s anyone’s guess as to when we’ll have the kinds of agents that could successfully execute an order like “run this e-commerce store for me”, but there’s nevertheless been enough work for us to make a few comments about the state of the art.

What are the Ways of Building AI Assistants?

On its own, a model like ChatGPT can (sometimes) generate working code and (often) generate correct API calls. But as things stand, a human being still needs to utilize this code for it to do anything useful.

Efforts are underway to remedy this situation by making models able to use external tools. Auto-GPT, for example, combines an LLM and a separate bot that repeatedly queries it. Together, they can take high-level tasks and break them down into smaller, achievable steps, checking off each as it works toward achieving the overall objective.

AssistGPT and SuperAGI are similar endeavors, but they’re better able to handle “multimodal” tasks, i.e those that also involve manipulating images or sounds rather than just text.

The above is a fairly cursory examination of building AI agents, but it’s not difficult to see how the retail establishments of the future might use agents. You can imagine agents that track inventory and re-order crucial items when they get low, or that keep an eye on sales figures and create reports based on their findings (perhaps even using voice synthesis to actually deliver those reports), or creating customized marketing campaigns, generating their own text, images, and A/B tests to find the highest-performing strategies.

What are the Advantages of Using AI in Retail Business?

Now that we’ve talked a little bit about how AI and AI assistants can be used in retail, let’s spend some time talking about why you might want to do this in the first place. What, in other words, are the big advantages of using AI in retail?

1. Personalized Marketing with AI

People can’t buy your products if they don’t know what you’re selling, which is why marketing is such a big part of retail. For its part, marketing has long been a future-oriented business, interested in leveraging the latest research from psychology or economics on how people make buying decisions.

A kind of holy grail for marketing is making ultra-precise, bespoke marketing efforts that target specific individuals. The kind of messaging that would speak to a childless lawyer in a big city won’t resonate the same way with a suburban mother of five, and vice versa.

The problem, of course, is that there’s just no good way at present to do this at scale. Even if you had everything you needed to craft the ideal copy for both the lawyer and the mother, it’s exceedingly difficult to have human beings do this work and make sure it ends up in front of the appropriate audience.

AI could, in theory, remedy this situation. With the rise of social media, it has become possible to gather stupendous amounts of information about people, grouping them into precise and fine-grained market segments–and, with platforms like Facebook Ads, you can make really target advertisements for each of these segments.

AI can help with the initial analysis of this data, i.e. looking at how people in different occupations or parts of the country differ in their buying patterns. But with advanced prompt engineering and better LLMs, it could also help in actually writing the copy that induces people to buy your products or services.

And it doesn’t require much imagination to see how AI assistants could take over quite a lot of this process. Much of the required information is already available, meaning that an agent would “just” need to be able to build simple models of different customer segments, and then put together a prompt that generates text that speaks to each segment.

2. Personalized Offerings with AI

A related but distinct possibility is using AI assistants to create bespoke offerings. As with messaging, people will respond to different package deals; if you know how to put one together for each potential customer, there could be billions in profits waiting for you. Companies like Starbucks have been moving towards personalized offerings for a while, but AI will make it much easier for other retailers to jump on this trend.

We’ll illustrate how this might work with a fictional example. Let’s say you’re running a toy company, and you’re looking at data for Angela and Bob. Angela is an occasional customer, mostly making purchases around the holidays. When she created her account she indicated that she doesn’t have children, so you figure she’s probably buying toys for a niece or nephew. She’s not a great target for a personalized offer, unless perhaps it’s a generic 35% discount around Christmas time.

Bob, on the other hand, buys fresh trainsets from you on an almost weekly basis. He more than likely has a son or daughter who’s fascinated by toy machines, and you have customer-recommendation algorithms trained on many purchases indicating that parents who buy the trains also tend to buy certain Lego sets. So, next time Bob visits your site, your AI assistant can offer him a personalized discount on Lego sets.

Maybe he bites this time, maybe he doesn’t, but you can see how being able to dynamically create offerings like this would help you move inventory and boost individual customer satisfaction a great deal. AI can’t yet totally replace humans in this kind of process, but it can go a long way toward reducing the friction involved.

3. Smarter Pricing

The scenario we just walked through is part of a broader phenomenon of smart pricing. In economics, there’s a concept known as “price discrimination”, which involves charging a person roughly what they’re willing to pay for an item. There may be people who are interested in buying your book for $20, for example, but others who are only willing to pay $15 for it. If you had a way of changing the price to match what a potential buyer was willing to pay for it, you could make a lot more money (assuming that you’re always charging a price that at least covers printing and shipping costs).

The issue, of course, is that it’s very difficult to know what people will pay for something–but with more data and smarter AI tools, we can get closer. This will have the effect of simultaneously increasing your market (by bringing in people who weren’t quite willing to make a purchase at a higher price) and increasing your earnings (by facilitating many sales that otherwise wouldn’t have taken place).

More or less the same abilities will also help with inventory more generally. If you sell clothing you probably have a clearance rack for items that are out of season, but how much should you discount these items? Some people might be fine paying almost full price, while others might need to see a “60%” off sticker before moving forward. With AI, it’ll soon be possible to adjust such discounts in real-time to make sure you’re always doing brisk business.

4. AI and Smart Hiring

One place where AI has been making real inroads is in hiring. It seems like we can’t listen to any major podcast today without hearing about some hiring company that makes extensive use of natural language processing and similar tools to find the best employees for a given position.

Our prediction is that this trend will only continue. As AI becomes increasingly capable, eventually it will be better than any but the best hiring managers at picking out talent; retail establishments, therefore, will rely on it more and more to put together their sales force, design and engineering teams, etc.

Is it Worth Using AI in Retail?

Throughout this piece, we’ve sung the praises of AI in retail. But the truth is, there are still questions about how much sense it makes to leverage retail at the moment, given its expense and risks.

In this section, we’ll briefly go over some of the challenges of using AI in retail so you can have a fuller picture of how its advantages compare to its disadvantages, and thereby make a better decision for your situation.

The one that’s on everyone’s minds these days is the tendency of even powerful systems like ChatGPT to hallucinate incorrect information or to generate output that is biased or harmful. Finetuning and techniques like retrieval augmented generation can mitigate this somewhat, but you’ll still have to spend a lot of time monitoring and tinkering with the models to make sure that you don’t end up with a PR disaster on your hands.

Another major factor is the expense involved. Training a model on your own can cost millions of dollars, but even just hiring a team to manage an open-source model will likely set you back a fair bit (engineers aren’t cheap).

By far the safest and easiest way of testing out AI for retail is by using a white glove solution like the Quiq conversational CX platform. You can test out our customer-facing and agent-facing AI tools while leaving the technical details to us, and at far less expense than would be involved in hiring engineering talent.

Set up a demo with us to see what we can do for you.

AI is Changing Retail

From computer-generated imagery to futuristic AI-based marketing plans, retail won’t be the same with the advent of AI. This will be especially true once we have robust AI assistants able to answer customer questions, help them find clothes that fit, and offer precision discounts and offerings tailored to each individual shopper.

If you don’t want to get left behind, you’ll need to begin exploring AI as soon as possible, and we can help you do that. Check out our product or find a time to talk with us, today!

AI in Retail: 5 Ways Retailers Are Using AI Assistants

Businesses have always turned to the latest and greatest technology to better serve their customers, and retail is no different. From early credit card payment systems to the latest in online advertising, retailers know that they need to take advantage of new tools to boost their profits and keep shoppers happy.

These days, the thing that’s on everyone’s mind is artificial intelligence (AI). AI has had many, many definitions over the years, but in this article, we’ll mainly focus on the machine-learning and deep-learning systems that have captured the popular imagination. These include large language models, recommendation engines, basic AI assistants, etc.

In the world of AI in retail, you can broadly think of these systems as falling into one of two categories: “the ones that customers see”, and “the ones that customers don’t see.” In the former category, you’ll find innovations such as customer-facing chatbots and algorithms that offer hyper-personalized options based on shopping history. In the latter, you’ll find precision fraud detection systems and finely-tuned inventory management platforms, among other things.

We’ll cover each of these categories, in order. By the end of this piece, you’ll have a much better understanding of the ways retailers are using AI assistants and will be better able to think about how you want to use this technology in your retail establishment.

Let’s get going!

Using AI Assistants for Better Customer Experience

First, let’s start with AI that interacts directly with customers. The major ways in which AI is transforming the customer experience are through extreme levels of personalization, more “humanized” algorithms, and shopping assistants.

Personalization in Shopping and Recommendations

One of the most obvious ways of improving the customer experience is by tailoring that experience to each individual shopper. There’s just one problem: this is really difficult to do.

On the one hand, most of your customers will be new to you, people about whom you have very little information and whose preferences you have no good way of discovering. On the other, there are the basic limitations of your inventory. If you’re a brick-and-mortar establishment you have a set number of items you can display, and it’s going to be pretty difficult for you to choose them in a way that speaks to each new customer on a personal level.

For a number of reasons, AI has been changing this state of affairs for a while now, and holds the potential to change it much more in the years ahead.

A key part of this trend is recommendation engines, which have gotten very good over the past decade or so. If you’ve ever been surprised by YouTube’s ability to auto-generate a playlist that you really enjoyed, you’ve seen this in action.

Recommendation engines can only work well when there is a great deal of customer data for them to draw on. As more and more of our interactions, shopping, and general existence have begun to take place online, there has arisen a vast treasure trove of data to be analyzed. In some situations, recommendation engines can utilize decades of shopping experience, public comments, reviews, etc. in making their recommendations, which means a far more personalized shopping experience and an overall better customer experience.

What’s more, advances in AR and VR are making it possible to personalize even more of these experiences. There are platforms now that allow you to upload images of your home to see how different pieces of furniture will look, or to see how clothes fit you without the need to try them on first.

We expect that this will continue, especially when combined with smarter printing technology. Imagine getting a 3D-printed sofa made especially to fit in that tricky corner of your living room, or flipping through a physical magazine with advertisements that are tailored to each individual reader.

Humanizing the Machines

Next, we’ll talk about various techniques for making the algorithms and AI assistants we interact with more convincingly human. Admittedly, this isn’t terribly important at the present moment. But as more of our shopping and online activity comes to be mediated by AI, it’ll be important for them to sound empathic, supportive, and attuned to our emotions.

The two big ways this is being pursued at the moment are chatbots and voice AI.

Chatbots, of course, will probably be familiar to you already. ChatGPT is inarguably the most famous example, but you’ve no doubt interacted with many (much simpler) chatbots via online retailers or contact centers.

In the ancient past, chatbots were largely “rule-based”, meaning they were far less flexible and far less capable of passing as human. With the ascendancy of the deep learning paradigm, however, we now have chatbots that are able to tutor you in chemistry, translate between dozens of languages, help you write code, answer questions about company policies, and even file simple tickets for contact center agents.

Naturally, this same flexibility also means that retail managers must tread lightly. Chatbots are known to confidently hallucinate incorrect information, to become abusive, or to “help” people with malicious projects, like building weapons or computer viruses.

Even leaving aside the technical challenges of implementing a chatbot, you have to carefully monitor your chatbots to make sure they’re performing as expected.

Then, there’s voice-based AI. Computers have been synthesizing speech for many years, but it hasn’t been until recently that they’ve become really good at it. Though you can usually tell that a computer is speaking if you listen very carefully, it’s getting harder and harder all the time. We predict that, in the not-too-distant future, you’ll simply have no idea whether it’s a human or a machine on the other end of the line when you call to return an item or get store hours.

But computers have also gotten much better at the other side of voice-based AI, speech recognition. Software like otter.ai, for example, is astonishingly accurate when generating transcriptions of podcast episodes or conversations, even when unusual words are used.

Taken together, advances in both speech synthesis and speech recognition paint a very compelling picture of how the future of retail might unfold. You can imagine walking into a Barnes & Noble in the year 2035 and having a direct conversation with a smart speaker or AI assistant. You’ll tell it what books you’ve enjoyed in the past, it’ll query its recommendation system to find other books you might like, and it’ll speak to you in a voice that sounds just like a human’s.

You’ll be able to ask detailed questions about the different books’ content, and it’ll be able to provide summaries, discuss details with you, and engage in an unscripted, open-ended conversation. It’ll also learn more about you over time, so that eventually it’ll be as though you have a friend that you go shopping with whenever you buy new books, clothing, etc.

Shopping Assistants and AI Agents

So far, we’ve confined our conversation specifically to technologies like large language models and conversational AI. But one thing we haven’t spent much time on yet is the possibility of creating agents in the future.

An agent is a goal-directed entity, one able to take an instruction like “Make me a reservation at an Italian restaurant” and decompose the goal into discrete steps, performing each one until the task is completed.

With clever enough prompt engineering, you can sort of get agent-y behavior out of ChatGPT, but the truth is, the work of building advanced AI agents has only just begun. Tools like AutoGPT and LangChain have made a lot of progress, but we’re still a ways away from having agents able to reliably do complex tasks.

It’s not hard to see how different retail will be when that day arrives, however. Eventually, you may be outsourcing a lot of your shopping to AI assistants, who will make sure the office has all the pens it needs, you’ve got new science fiction to read, and you’re wearing the latest fashion. Your assistant might generate new patterns for t-shirts and have them custom-printed; if LLMs get good enough, they’ll be able to generate whole books and movies tuned to your specific tastes.

Using AI Assistants to Run A Safer, Leaner Operation

Now that we’ve covered the ways AI assistants will impact the things customers can see, let’s talk about how they’ll change the things customers don’t see.

There are lots of moving parts in running a retail establishment. If you’ve got ~1,000 items on display in the front, there are probably several thousand more items in a warehouse somewhere, and all of that has to be tracked. What’s more, there’s a constant process of replenishing your supply, staying on top of new trends, etc.

All of this will also be transformed by AI, and in the following sections, we’ll talk about a few ways in which this could happen.

Fraud Detection and Prevention

Fraud, unfortunately, is a huge part of modern life. There’s an entire industry of people buying and selling personal information for nefarious purposes, and it’s the responsibility of anyone trafficking in that information to put safeguards in place.

That includes a large number of retail establishments, which might keep data related to a customer’s purchases, their preferences, and (of course) their actual account and credit card numbers.

This isn’t the place to get into a protracted discussion of cybersecurity, but much of fraud detection relies on AI, so it’s fair game. Fraud detection techniques range from the fairly basic (flagging transactions that are much larger than usual or happen in an unusual geographic area) to the incredibly complex (training powerful reinforcement learning agents that constantly monitor network traffic).

As AI becomes more advanced, so will fraud detection. It’ll become progressively more difficult for criminals to steal data, and the world will be safer as a result. Of course, some of these techniques are also ones that can be used by the bad guys to defraud people, but that’s why so much effort is going into putting guardrails on new AI models.

Streamlining Inventory

Inventory management is an obvious place for optimization. Correctly forecasting what you’ll need and thereby reducing waste can have a huge impact on your bottom line, which is why there are complex branches of mathematics aimed at modeling these domains.

And – as you may have guessed – AI can help. With machine learning, extremely accurate forecasts can be made of future inventory requirements, and once better AI agents have been built, they may even be able to automate the process of ordering replacement materials.

Forward-looking retail managers will need to keep an eye on this space to fully utilize its potential.

AI Assistants and the Future of Retail

AI is changing a great many things. It’s already making contact center agents more effective and is being utilized by a wide variety of professionals, ranging from copywriters to computer programmers.

But the space is daunting, and there’s so much to learn about implementing, monitoring, and finetuning AI assistants that it’s hard to know where to start. One way to easily dip your toe in these deep waters is with the Quiq Conversational CX platform.

Our technology makes it easy to create customer-facing AI bots and similar tooling, which will allow you to see how AI can figure into your retail enterprise without hiring engineers and worrying about the technical details.

Schedule a demo with us today to get started!

Request A Demo

How Scoped AI Ensures Safety in Customer Service

AI chat applications powered by Large Language Models (LLMs) have helped us reimagine what is possible in a new generation of AI computing.

Along with this excitement, there is also a fair share of concern and fear about the potential risks. Recent media coverage, such as this article from the New York Times, highlights how the safety measures of ChatGPT can be circumvented to produce harmful information.

To better understand the security risks of LLMs in customer service, it’s important we add some context and differentiate between “Broad AI” versus “Scoped AI”. In this article, we’ll discuss some of the tactics used to safely deploy scoped AI assistants in a customer service context.

Broad AI vs. Scoped AI: Understanding the Distinction

Scoped AI is designed to excel in a well-defined domain, guided and limited by a software layer that maintains its behavior within pre-set boundaries. This is in contrast to broad AI, which is designed to perform a wide range of tasks across virtually all domains.

Scoped AI and Broad AI answer questions fundamentally differently. With Scoped AI the LLM is not used to determine the answer, it is used to compose a response from the resources given to it. Conversely, answers to questions in Broad AI are determined by the LLM and cannot be verified.

Broad AI simply takes a user message and generates a response from the LLM; there is no control layer outside of the LLM itself. Scoped AI is a software layer that applies many steps to control the interaction and enforce safety measures applicable to your company.

In the following sections, we’ll dig into a more detailed explanation of the steps.

Ensuring the Safety of Scoped AI in Customer Service

1. Inbound Message Filtering

Your AI should perform a semantic similarity search to recognize in-scope vs out-of-scope messages from a customer. Malicious characters and known prompt injections should be identified and rejected with a static response. Inbound message filtering is an important step in limiting the surface area to the messages expected from your customers.

2. Classifying Scope

LLMs possess strong Natural Language Understanding and Reasoning skills (NLU & NLR). An AI assistant should perform a number of classifications. Common classifications include the topic, user type, sentiment, and sensitivity of the message. These classifications should be specific to your company and the jobs of your AI assistant. A data model and rules engine should be used to apply your safety controls.

3. Resource Integration

Once an inbound message is determined to be in-scope, company-approved resources should be retrieved for the LLM to consult. Common resources include knowledge articles, brand facts, product catalogs, buying guides, user-specific data, or defined conversational flows and steps.

Your AI assistant should support non-LLM-based interactions to securely authenticate the end user or access sensitive resources. Authenticating users and validating data are important safety measures in many conversational flows.

4. Verifying Responses

With a response in hand, the AI should verify the answer is in scope and on brand. Fact-checking and corroboration techniques should be used to ensure the information is derived from the resource material. An outbound message should never be delivered to a customer if it cannot be verified by the context your AI has on hand.

5. Outbound Message Filtering

Outbound message filtering tactics include: conducting prompt leakage analysis, semantic similarity checks, consulting keyword blacklists, and ensuring all links and contact information are in-scope of your company.

6. Safety Monitoring and Analysis

Deploying AI safely also requires that you have mechanisms to capture and retrospect on the completed conversations. Collecting user feedback, tracking resource usage, reviewing state changes, and clustering conversations should be available to help you identify and reinforce the safety measures of your AI.

In addition, performing full conversation classifications will also allow you to identify emerging topics, confirm resolution rates, produce safety reports, and understand the knowledge gaps of your AI.

Other Resources

At Quiq, we actively monitor and endorse the OWASP Top 10 for Large Language Model Applications. This guide is provided to help promote secure and reliable AI practices when working with LLMs. We recommend companies exploring LLMs and evaluating AI safety consult this list to help navigate their projects.

Final Thoughts

By safely leveraging LLM technology through a Scoped AI software layer, CX leaders can:

1. Elevate Customer Experience
2. Boost Operational Efficiency
3. Enhance Decision Making
4. Ensure Consistency and Compliance

Reach out to sales@quiq.com to learn how Quiq is helping companies improve customer satisfaction and drive efficiency at the same.

The Pros and Cons of Using ChatGPT: Agents vs. Customers

If you’re a contact center manager who has been impressed with ChatGPT and everything it makes possible, a natural follow-up question is where you should deploy it.

On the one hand, you could use it internally to make your contact center agents more efficient. They’d be able to ask questions of your company documentation, summarize important emails, outsource the more trivial parts of their workload, and plenty besides.

On the other hand, you could use it externally as a customer-facing application. If you had clients that were confused about a feature or needed help figuring something out, ChatGPT could go a long way towards resolving their issues with minimal attention from your contact center agents.

Of course, there is major overlap in both these options, but there are crucial differences as well. In this article, we’ll discuss the pros and cons of using ChatGPT or a similar large language model (LLM) for contact center agents v.s. using it for customers.

How is ChatGPT Making Contact Center Agents More Efficient?

To a first approximation, a contact center is a place where questions are answered. No matter how clear your instructions or comprehensive your documentation, there will inevitably be users who simply can’t get an issue resolved, and that’s when they’ll reach out to customer support.

This means that much of a contact center agent’s day-to-day revolves around interacting with clients via text, either over a chat interface or possibly through text messaging.

What’s more, much of this interaction will be relatively formulaic. Customers will be repeatedly asking about similar sorts of issues, or there’ll be asking questions that are covered somewhere in your product’s documentation.

If you’ve spent even five minutes with ChatGPT, it’s probably occurred to you that it’s a powerful tool for handling exactly these kinds of tasks. Let’s spend a few minutes digging into this idea.

Outsourcing Routine Tasks

The most obvious way that ChatGPT is making contact center agents more efficient is by allowing them to outsource some of this more routine work.

There are a few ways this can happen. First, ChatGPT can help with answering basic questions. Today, large language models are not particularly good at generating highly original and inventive text, but when it comes to churning out helpful, simple boilerplate, they’re without peer.

This means that, with a little training or fine-tuning, your contact center agents can use ChatGPT to answer the sorts of questions they see multiple times a day, such as where a given feature is located or how to handle a common error. This will free them up to focus on the more involved queries, for which they have a comparative advantage.

In this same vein, tools like ChatGPT can also help contact center agents adopt the appropriate, polite tone in their correspondences. Customer experience and customer service are major parts of being a contact center agent, which means replies must be crafted so as to put the customer (who may be frustrated, angry, and belligerent) at ease.

This is something ChatGPT excels at, and according to the paper “Generative AI at work”, this exact dynamic was responsible for a lot of the gain in productivity seen in a contact center that began using an LLM. The model was trained on the interactions of more seasoned agents who know how to deal with tricky customers, and a good portion of this ability was transferred to more junior agents via the model’s output.

Another place where ChatGPT can help is in writing documentation. This may fall to a technical writer rather than an actual agent, but in either case, ChatGPT’s remarkable ability to provide outlines and quickly generate expository text can speed up the process of documenting your product’s core features.

And finally, ChatGPT is quite good at writing and explaining simple code. As with documentation, it’s doubtful that a contact center agent is going to be spending much time writing code. Nevertheless, your agents might find themselves hit with questions from savvier users about e.g. API integrations, so they should know that they can query ChatGPT about what a code snippet is doing, and they can have it generate a basic code example if they need to.

Learning and Brainstorming

This is a bit more abstract, but ChatGPT has proven remarkably useful in brainstorming study plans, solutions to problems, etc. Though the algorithm itself isn’t particularly creative, when it generates ideas that a human being can riff off of the combination of algorithm + human can be much more creative than a human working by herself.

While there will be many situations in which a contact center agent has a script to work off of, when they don’t, turning to ChatGPT can be the spark that moves them forward.

ChatGPT Plugins for Contact Center Agents

One of the more exciting developments for ChatGPT was the release of its plugin library in March of 2023. There are now plugins from Instacart (for food delivery), Expedia (for trip planning), Klarna Shopping (for online retail), and many others.

Truthfully, most of this won’t (yet) be of much use for contact center agents, but it’s worth mentioning given how quickly people are developing new plugins. If you’re a contact center agent or manager wanting to extend the functionality of powerful LLM technologies, plugins are something you’ll want to be aware of.

Getting the Most out of ChatGPT for Customer Service

ChatGPT is remarkably good for a wide range of tasks, but to really leverage its full capacities you’ll need to be aware of a few common terms.

Large language models are known to be really sensitive to small changes in word choice and structure, which means there’s an art to phrasing your requests just so. This is known as “prompt engineering” a language model, and it’s a new discipline that can be enormously valuable if done correctly.

You can also get better results if you show ChatGPT an example or two of what you’re looking for. This is known as “one-shot” learning (if you show it one example), and “few-shot” learning (if you show it five or six).

Of course, if that doesn’t work you can instead try to fine-tune a large language model. This involves gathering hundreds of examples of the conversations, text, or output you want to see and feeding them all to the model (probably over its API) so that the model’s internal structure actually changes. Though it’s obviously a more significant engineering challenge, it will probably give you the best results of all.

ChatGPT v.s. Chatbots

We in the customer experience field have quite a lot of experience with chatbots, so it’s natural to wonder how ChatGPT is different.

Chatbots are just algorithms that are capable of carrying on a dialogue with customers, and this can be accomplished in many different ways. Some chatbots are extremely simple and follow a rules-based approach to formatting their responses, while those based on neural networks or some other advanced machine-learning technology are much more flexible.

Chatbots can be built with ChatGPT, but most aren’t.

How is ChatGPT Changing Customer Experience?

Now that we’ve covered some of the ways in which ChatGPT is helping customer service agents, let’s discuss some of the ways it’s actually helping customers.

Personalized Responses

One property of ChatGPT that makes it extremely effective is that it’s able to remember the context. When you chat with ChatGPT, it’s not generating each new response in a vacuum, it’s producing them either on the basis of what has already been said or based on information that it’s been given.

This means that if you have a customer interacting with a chat interface powered by a LLM (and are being smart by guardrailing it with a conversational CX platform like Quiq), they’ll be able to have more open-ended and personalized interactions with the tool than would be possible with simpler chatbots.

This will go a long way toward making them feel like they’re being taken care of, thus boosting your company’s overall customer satisfaction.

Automatically Resolving Customer Issues

Earlier, we talked about how contact center agents would be able to leverage ChatGPT in order to outsource their more routine tasks.

Well, one of those routine tasks is resolving a steady stream of quotidian issues. How many times a day do you think a contact center agent has to help a person log in to their client’s software or reset a password? It’s probably not “hundreds”, but we’d bet that it’s a lot.

ChatGPT is a long way away from being able to patiently guide a user through any arbitrary problem they might have, but it’s already more than capable of handling the kinds simple of repetitive, basic queries that sap an agent’s energy.

Automatic Natural Language Translation

One of the surprising places where ChatGPT excels is in fast, accurate translation between multiple languages. Given the fact that English is so commonly used in the technical community, it can be easy to lose sight of the fact that billions of people have either no knowledge of it or, at best, a very rudimentary grasp.

But not many can afford to have all their documentation translated into dozens of different tongues or to keep a team of translators on staff. ChatGPT is almost certainly not going to capture every little nuance in a translation, but it should be sufficient to help a person resolve their issue on their own or to ask more pointed, technical questions.

Dangers in Using ChatGPT

Whether you end up letting your agents or your customers get ahold of ChatGPT first, you should know that it’s not a panacea, nor is it perfect. It can and will fail, and some of those failures are reasonably predictable ones you should be prepared for.

The most obvious and well-known failure is referred to as a “hallucination”, and it results from the way that LLMs like ChatGPT are trained. An LLM learns how to output sequences of tokens, it’s not doing any fact-checking on its own. That means it will cheerfully and confidently make up names, book titles, and URLs.

It’s also possible for ChatGPT to become obnoxious and insulting. The team at OpenAI has done a good job of tuning this behavior out, but recall that these systems are very sensitive to the way prompts are structured, and it can reemerge.

There’s no general solution to these issues as far as we know. You can assiduously construct a fine-tuning pipeline for LLMs that does even more to get rid of toxicity, but ultimately you’re going to have to monitor ChatGPT’s output to see if it’s straying or otherwise being unhelpful.

Quiq specializes in defining guardrails for enterprise businesses who want to harness ChatGPT’s benefits, but are brand protective.

Figuring Out Where to Deploy ChatGPT

Whether it makes more sense to use ChatGPT internally or externally will depend a lot on your circumstances. There’s a lot ChatGPT can do to make your contact center agents more efficient, but if you’re just wanting to offload basic customer queries they can certainly be useful for that purpose.

In our considered opinion, the ROI is ultimately higher for using ChatGPT in a customer-facing way. This will allow your clients to help themselves, ultimately boosting their satisfaction and their estimation of your product.

But whichever way you choose to go, you can substantially reduce the headache associated with managing the infrastructure for this complex technology by making use of the Quiq conversational CX platform. With us, you can get world-leading results, satisfy your customer, lighten the load on your agents, and never have to worry about a rogue answer,  compute cluster, or GPU.

Current Large Language Models and How They Compare

From ChatGPT and Bard to BLOOM and Claude, there is now a veritable ocean of different large language models (LLMs) for you to choose from. Some of them are specialized for specific use cases, some are open-source, and there’s a huge variance in the number of parameters they contain.

If you find yourself fascinated by this technology and interested in using it in your contact center, it can be hard to know how to choose the right tool for the job.

Today, we’re going to tackle this issue head-on. After a brief discussion of the history of LLMs, we’ll talk about specific criteria you can use to evaluate LLMs, sources of additional information, and some of the better-known options.

Let’s get going!

A Brief History of Generative AI

Though it may feel like LLMs and generative AI exploded onto the scene all of half a year ago, in fact, the basic research powering these advances goes back much further.

Way back in the 1940s, Walter Pitts and Warren McCulloch drew upon early research on the brain to design artificial neurons. Though these worked, they couldn’t be deployed for anything particularly useful until the invention of the backpropagation algorithm in 1985. This allowed larger neural networks to be trained effectively, and in 1989 Yann LeCun built a convolutional system able to identify handwritten numbers.

Around this same time, there were architectural discoveries like long short-term memory networks that made it possible for machine learning algorithms to learn far more complex relationships within data, laying the foundations for them to eventually be able to revolutionize work in places like contact centers.

What’s more, the opening decade of the 2000s marked the beginning of the big data era. For all their power, generative pre-trained models like ChatGPT are not terribly efficient learners. To be able to output language or images, they must be shown many, many examples from which to derive the statistical function that allows them to create surprising new output later.

Once researchers began the practice of publishing enormous datasets a key obstacle to building large, useful systems was removed. When combined with the preceding six decades of foundational conceptual work, this was enough to allow us here in 2023 to witness the birth of generative AI and large language models.

How to Compare Large Language Models?

If you’re shopping around for a large language model for a particular application, it makes sense to first get clear on the evaluation criteria you should be using. That’s what we’ll cover in the sections below.

Evaluating LLMs Based on Industry

One of the more remarkable aspects of ChatGPT is that it’s so good at so many things. Out of the box (or sometimes with a little fine-tuning) it can perform very well at answering questions, summarizing text, translating between natural languages, and much more.

However, there may well be situations in which you’d want to use a domain-specific LLM, one that has been trained on medical or legal text, financial data, etc. The basic process of training a generative model is now being used to build neural networks for material design, protein synthesis, and music, among other things.

So, if you’re considering using a generative pre-trained model in your business, one thing you might want to think about early on is whether you want to try to find a domain-specific model, or a general model that you train on your own data.

If you do look for a domain-specific model, be aware that the space is very new and there might not be one available yet (though given how much attention is going into generative AI right now, there’s also a decent chance that one will be released in relatively short order).

Alternatively, you could try to fine-tune a pre-trained model. Getting into the nuances of fine-tuning, zero-shot learning, few-shot learning, and prompt engineering is beyond the scope of this article, but suffice it to say that there are many ways for you to get a generic LLM to be better at a smaller range of specific tasks.

If you’re an engineer designing circuits for quantum computers this might not be sufficient, but for those of us working in customer experience and contact centers, a well-honed prompt or a half-dozen examples might be more than enough for substantial performance boosts.

Evaluating LLMs By Language

Given that English is a sort of lingua franca (should it be lingua anglica?) for the tech community and makes up nearly 60% of the websites on the internet, it’s no surprise that it also comprises the bulk of the training data going into modern LLMs.

ChatGPT and other systems are often pretty good at multi-lingual tasks by default, but they don’t perform equally well in all languages. As you can probably guess, they’re best at “high-resource” languages (English, Spanish, Chinese), somewhat worse at “medium-resource” languages (Portuguese, Hindi), and much worse at “low-resource” languages (Haitian and Swahili).

If you’re serving customers with a medium- or low-resource language and need really high levels of accuracy, you’ll probably have to stick with human beings for a while. Otherwise, test ChatGPT or whatever system you end up going with for how well it can handle multi-lingual problems like question answering and translation.

Whether They’re Open-Source or Closed-Source

No doubt you’ve heard of “open-source” software, a term which refers to the practice of releasing source code to the public where it can be forked, modified, and scrutinized.

The open-source approach to software development has become incredibly popular, and this enthusiasm has partially bled over into artificial intelligence and machine learning. It’s is now fairly common to open source datasets, models, and even training frameworks like TensorFlow.

How does this translate to the realm of large language models? In truth, it’s a bit of a mixture. Some models are proudly open-sourced, while others jealously guard their model’s weights, training data, and source code.

This is one thing you might want to consider as you carry out your search for a large-language model. Some of the very best models, like ChatGPT, are closed-source. You won’t be able to fork the ChatGPT code base and modify it, you’ll be relegated to feeding queries into it via an API.

The advantage to going with a closed-source model, of course, is that you needn’t lay awake at night worrying about managing a codebase thousands of lines long, nor will you need to concern yourself with hiring the expensive engineers who know how to read and use it.

The downside, naturally, is that you’re entirely beholden to the team who builds and offers the LLM over their API. If they make updates or go bankrupt, you could be left scrambling last-minute to find an alternative solution.

There’s no one-size-fits-all approach here; if you have the in-house technical expertise to fork an open-source LLM and you want to modify it, open-source is probably the way to go. But be aware that this is a substantial commitment, and as things stand today, the very best generative pre-trained language models are closed-source, so there’s a performance penalty that you’ll have to account for.

Contact Us
 

Leaderboards and Comparison Websites for Large Language Models

Another route you can go in comparing current LLMs is to avail yourself of a service build for this purpose.

Whatever rumors you may have heard, programmers are human beings, and human beings have a fondness for ranking and categorizing pretty much everything – sports teams, guitar solos, classic video games, you name it.

Naturally, as LLMs have become better-known, leaderboards and websites have popped up comparing them along all sorts of different dimensions. Here are a few you can use as you search around for the best tooling.

Leaderboards

In the past couple of months, leaderboards have emerged which directly compare various LLMs.

One is AlpacaEval, which uses a custom dataset to compare ChatGPT, Claude, Cohere, and other LLMs on how well they’re able to follow instructions. AlpacaEval boasts high agreement with human evaluators, so in our estimation it’s probably a suitable way of initially screening for LLM tools, though more extensive checks might be required to settle on a final list.

Another good choice is Chatbot Arena, which pits two anonymous models side-by-side, has you rank which one is better, than aggregates all the scores into a leaderboard.

Finally, there is Hugging Face’s Open LLM Leaderboard, which is a similar endeavor. Anyone can submit a new model for evaluation, all of which are then assessed based on a small set of key benchmarks from the Eleuther AI Language Model Evaluation Harness. These capture how well the models do in answering simple science questions, common-sense queries, and more.

When combined with the criteria we discussed earlier, these leaderboards and comparison websites ought to give you everything you need to find a powerful generative pre-trained language model for your application.

What are the Currently-Available Large Language Models?

Okay! Now that we’ve worked through all this background material, let’s turn to discussing some of the major LLMs that are available today. We make no promises about these entries being comprehensive (and even if they were, there’d be new models out next week), but it should be sufficient to give you an idea as to the range of options you have.

ChatGPT and GPT

Obviously, the titan in the field is OpenAI’s ChatGPT, which is really just a version of GPT that has been fine-tuned through reinforcement learning from human feedback to be especially good at sustained dialogue.

ChatGPT and GPT have been used in many domains, including customer service, question answering, and many others.

LLaMA

In February of 2023, Facebook’s AI team released its Large Language Model Meta AI, or LLaMA. At 65 billion parameters it is not quite as big as GPT, and this is intentional, as it’s purpose is to aid researchers who may not have the budget or expertise required to provision a behemoth LLM.

LaMDA

Like GPT-4, Google’s LaMDA is based on the transformer architecture and is aimed squarely at dialogue. It is able to converse on a nearly infinite number of subjects, and from the beginning, the Google team has focused on having LaMDA produce interesting responses that are nevertheless absent of abuse and harmful language.

MT-NLG

The Megatron-Turing Natural Language Generation (MT-NLG) model from Nvidia sports a staggering half-trillion (530 billion) parameters, and excels at “…Completion prediction, Reading comprehension, Commonsense reasoning, Natural language inferences, Word sense disambiguation,” and more.

StableLM

StableLM is a lightweight, open-source language model built by Stability AI. It’s trained on a new dataset called “The Pile”, which is itself made up of over 20 smaller, high-quality datasets which together amount to over 825 GB of natural language.

GPT4All

What would you get if you trained an LLM on “…on a massive curated corpus of assistant interactions, which included word problems, multi-turn dialogue, code, poems, songs, and stories,” then released it to on an Apache 2.0 license? The answer is GPT4All, an open-source model whose purpose is to encourage research into what these technologies can accomplish.

Alpaca

The Alpaca LLM project developed by Stanford is designed around following instructions. As things stand, Alpaca isn’t considered safe yet, so it is intended to be used by research teams exploring the frontiers of LLMs.

BLOOM

The BigScience Large Open-Science Open-Access Multilingual Language Model (BLOOM) was released in late 2022. The team that put it together consisted of more than a thousand researchers from all over the worlds, and unlike the other models on this list, it’s specifically meant to be interpretable.

GATO

DeepMind is one of the leading players advancing the frontiers of AI, and their GATO LLM is correspondingly remarkable. Like GPT-4, GATO is multimodal, meaning it can work with text, images, games, and can even control a robot.

Pathways Language Model (PaLM)

Like LaMDA, PaLM is from Google, and is also enormous (540 billion parameters). It excels in many language-related tasks, and became famous when it produced really high-level explanations of tricky jokes.

Claude

Anthropic’s Claude is billed as a “next-generation AI assistant.” It’s not known how big the model is, but it does come in two modes: the full Claude, and Claude Instant, which is faster but produces lower-quality responses.

FAQs

Now, let’s turn to some common sources of confusion where comparing current LLMs are concerned.

Overcoming the Limitations of Large Language Models

Large language models are remarkable tools, but they nevertheless suffer from some well-known limitations. They tend to hallucinate facts, for example, sometimes fail at basic arithmetic, and can get lost in the course of lengthy conversations.

Overcoming the limitations of large language models is mostly a matter of fine-tuning and monitoring them. The fine-tuning data you use must be carefully curated in order to cover basic failure modes, and you must have a robust means of checking on their output in case they go off the rails somewhere along the line.

What are the Best Large Language Models?

Having read all of the foregoing content, it’s natural to wonder if there’s a single model that best suits your enterprise. The answer is probably “yes”, but which model is ultimately the best fit for you depends a lot on the specifics. You’ll have to think about whether you want an open-source model or your content with hitting an API, whether your use case is outside the scope of ChatGPT and better handled with a bespoke model, etc.

Choosing Among the Current Large Language Models

With all the different LLMs on offer, it’s hard to narrow the search down to the one that’s best for you. By carefully weighing the different metrics we’ve discussed in this article, you can choose an LLM that meets your needs with as little hassle as possible.

Another way to minimize your headaches is to use an industry-leading solution that works out of the box to deliver world-class functionality. That’s exactly what we’re achieving here at Quiq. Schedule a demo to see how our conversational AI platform can help you build a forward-facing contact center.

Contact Center Managers: What Do LLMs Mean For You?

Whether it’s quantum computing, the blockchain, or generative AI, whenever a promising new technology emerges, forward-thinking people begin looking for a way to use it.

And this is a completely healthy response. It’s through innovation that the world moves forward, but great ideas don’t mean much if there aren’t people like contact center managers who use them to take their operations to the next level.

Today, we’re going to talk about what large language models (LLMs) like ChatGPT mean for contact centers. After briefly reviewing how LLMs work we’ll discuss the way they’re being used in contact centers, how those centers are changing as a result, and some things that contact center managers need to look out for when utilizing generative AI.

What are Large Language Models?

As their name suggests, LLMs are large, they’re focused on language, and they’re machine-learning models.

It’s our view that the best way to tackle these three properties is in reverse order, so we’ll start with the fact that LLMs are enormous neural networks trained via self-supervised learning. These neural networks effectively learn a staggeringly complex function that captures the statistical properties of human language well enough for them to generate their own.

Speaking of human language, LLMs like ChatGPT are pre-trained generative models focused on learning from and creating text. This distinguishes them from other kinds of generative AI, which might be focused on images, videos, speech, music, and proteins (yes, really.)

Finally, LLMs are really big. As with other terms like “big data” no one has a hard-and-fast rule for figuring out when you’ve gone from “language model” to “large language model” – but with billions of internal parameters, it’s safe to say that an LLM is a few orders of magnitude bigger than anything you’re likely to build outside of a world-class engineering team.

How can Large Language Models be Used in Contact Centers?

Since they’re so good at parsing and creating natural language, LLMs are an obvious choice for enterprises where there’s a lot of back-and-forth text exchanged, perhaps while, say, resolving issues or answering questions.

And for this reason, LLMs are already being used by contact center managers to make their agents more productive (more on this shortly).

To be more concrete, we turned up a few specific places where LLMs can be leveraged by contact center managers most effectively.

Answering questions: Even with world-class documentation, there will inevitably be customers who are having an issue they want help with. Though ChatGPT won’t be able to answer every such question, it can handle a lot of them, especially if you’ve fine-tuned it on your documentation.

Streamlining onboarding: For more or less the same reason, ChatGPT can help you onboard new hires. Employees learning the ropes will also be confused about parts of your technology and your process, and ChatGPT can help them find what they need more quickly.

Summarizing emails and articles: It might be possible for a team of five to be intimately familiar with what everyone else is doing, but any more than this and there will inevitably be things happening that are beyond their purview. By summarizing articles, tickets, email or Slack threads, etc., ChatGPT can help everyone stay reasonably up-to-date without having to devote hours every day to reading.

Issue prioritization: Not every customer question or complaint is equally important, and issues have to be prioritized before being handed off to contact center agents. ChatGPT can aid in this process, especially if it’s part of a broader machine-learning pipeline built for this kind of classification.

Translation: If you’re lucky enough to have a global audience, there will almost certainly be users who don’t have a firm grasp of English. Though there are tools like Google Translate that do a great job of handling translation tasks, ChatGPT often does an even better job.

What are Large Language Models for Customer Service?

Large language models are ideally suited for tasks that involve a great deal of working with text. Because contact center agents spend so much time answering questions and resolving customer issues, LLMs are a technology that can make them far more productive. ChatGPT excels at tasks like question answering, summarization, and language translation, which is why they’re already changing the way contact centers function.

How is Generative AI Changing Contact Centers?

The fear that advances in AI will lead to a decrease in employment among inferior human workers has a long and storied pedigree. Still, thus far the march of technological progress has tended to increase the number (and remuneration) of available jobs on the market.

Far from rendering human analysts obsolete, personal computers are now a major and growing source of new work (though, we confess, much less of it is happening on typewriters than before.)

Nevertheless, once people got a look at what ChatGPT can do there arose a fresh surge of worry over whether, this time, the robots were finally going to take all of our jobs.

Wanting to know how generative pre-trained language models have actually impacted the functioning of contact centers, Erik Brynjolfsson, Danielle Li, and Lindsey R. Raymond looked at data from some 5,000 customer support agents using it in their day-to-day work.

Their paper, “Generative AI at Work”, found that generative AI had led to a marked increase in productivity, especially among the newest, least-knowledgable, and lowest-performing workers.

The authors advanced the remarkable hypothesis that this might stem from the fact that LLMs are good at internalizing and disseminating the hard-won tacit knowledge of the best workers. They didn’t get much out of generative AI, in other words, precisely because they already had what they needed to perform well; but some fraction of their skill – such as how to phrase responses delicately to avoid offending irate customers – was incorporated into the LLM, where it was more accessible by less-skilled workers than it was when it was locked away in the brains of high-skilled workers.

What’s more, the organizations studied also changed as a result. Employees (especially lower-skilled ones) were generally more satisfied, less prone to burnout, and less likely to leave. Turnover was reduced, and customers escalated calls to supervisors less frequently.

Now, we hasten to add that of course this is just one study, and we’re in the early days of the generative AI revolution. No one can say with certainty what the long-term impact will be. Still, these are extremely promising early results, and lend credence to the view that generative AI will do a lot to improve the way contact centers onboard new hires, resolve customer issues, and function overall.

What are the Dangers of Using ChatGPT for Customer Service?

We’ve been singing the praises of ChatGPT and talking about all the ways in which it’s helping contact center managers run a tighter ship.

But, as with every technological advance stretching clear back to the discovery of fire, there are downsides. To help you better use generative AI, we’ll spend the next few sections talking about some characteristic failure modes you should be looking out for.

Hallucinations

By now, it’s fairly common knowledge that ChatGPT will just make things up. This is a consequence of the way LLMs like ChatGPT are trained. Remember, the model doesn’t contain a little person inside of it that’s checking statements for accuracy; it’s just taking the tokens it has seen so far and predicting the tokens that will come next.

That means if you ask it for a list of book recommendations to study lepidoptery or the naval battles of the Civil War (we don’t know what you’re into), there’s a pretty good chance that the list it provides will contain a mix of real and fake books.

ChatGPT has been known to invent facts, people, papers (complete with citations), URLs, and plenty else.

If you’re going to have customers interacting with it, or you’re going to have your contact center agents relying on it in a substantial way, this is something you’ll need to be aware of.

Degraded Performance

ChatGPT is remarkably performant, but it’s still just a machine learning model and machine learning models are known to suffer from model degradation.

This term refers to gradual or precipitous declines in model performance over time. There are technical reasons why this occurs, but from your perspective, you need to understand that the work has only begun once a model has been trained and put into production.

But you’re also not out of the woods if you’re accessing ChatGPT via an API, because you have just as little visibility into what’s happening on OpenAI’s engineering teams as the rest of us do.

If OpenAI releases an update you might suddenly find that ChatGPT fails in usual ways or trips over tasks it was handling very well last week. You’ll need to have robust monitoring in place so that you catch these issues if they arise, as well as an engineering team able to address the root cause.

Model degradation often stems from issues with the underlying data. This means that if you’ve e.g. trained ChatGPT to answer questions you might have to assemble new data for it to train on, a process that takes time and money and should be budgeted for.

Harassment and Bias

You could argue that harassment, bias, and harmful language are a kind of degraded performance, but they’re distinct and damaging enough to warrant their own section.

When Microsoft first released Sydney it was cartoonishly unhinged. It would lie, threaten, and manipulate users; in one case, it confessed both its love for a New York Times reporter along with its desire to engineer dangerous viruses and ignite internecine arguments between people.

All this has gotten much better, of course, but the same behavior can manifest in subtler ways, especially if someone is deliberately trying to jailbreak a large language model.

Thanks to extensive public testing and iteration, the current versions of the technology are very good at remaining polite, avoiding stereotyping, etc. Nevertheless, we’re not aware of any way to positively assure that no bias, deceit, or nastiness will emerge from ChatGPT.

This is another place where you’ll have to carefully monitor your model’s output and make corrections as necessary.

Using LLMs in your Contact Center

If you’re running a contact center, you owe it to yourself to at least check out ChatGPT. Whether it makes sense for you will depend on your unique circumstances, but it’s a remarkable new technology that could help you make your agents more effective while reducing turnover.

Quiq offers a white-glove platform that makes it easy to leverage conversational AI. Schedule a demo with us to see how we can help you incorporate generative AI into your contact center today!

Ways to Use ChatGPT for Customer Service

Now that we’ve all seen what ChatGPT can do, it’s natural to begin casting about for ways to put it to work. An obvious place where a generative AI language model can be used is in contact centers, which involve a great deal of text-based tasks like answering customer questions and resolving their issues.

But is ChatGPT ready for the on-the-ground realities of contact centers? What if it responds inappropriately, abuses a customer, or provides inaccurate information?

We at Quiq pride ourselves on being experts in the domain of customer experience and customer service, and we’ve been watching the recent developments in the realm of generative AI for some time. This piece presents our conclusions about what ChatGPT is, the ways in which ChatGPT can be used for customer service, and the techniques that exist to optimize it for this domain.

What is ChatGPT?

ChatGPT is an application built on top of GPT-4, a large language model. Large language models like GTP-4 are trained on huge amounts of textual data, and they gradually learn the statistical patterns present in that data well enough to output their own, new text.

How does this training work? Well, when you hear a sentence like “I’m going to the store to pick up some _____”, you know that the final word is something like “milk”, “bread”, or “groceries”, and probably not “sawdust” or “livestock”. This is because you’ve been using English for a long time, you’re familiar with what happens at a grocery store, and you have a sense of how a person is likely to describe their exciting adventures there (nothing gets our motor running like picking out the really good avocados).

GPT-4, of course, has none of this context, but if you show it enough examples it can learn to imitate natural language quite well. It will see the first few sentences of a paragraph and try to predict what the final sentence is. At first, its answers are terrible, but with each training run its internal parameters are updated, and it gradually gets better. If you do this for long enough you get something that can write its own emails, blog posts, research reports, book summaries, poems, and codebases.

Is ChatGPT the Same Thing as GPT-4?

So then, how is ChatGPT different from GPT-4? GPT-4 is the large language model trained in the manner just described, and ChatGPT is a version fine-tuned using reinforcement learning with human feedback to be good at conversations.

Fine-tuning refers to a process of taking a pre-trained language model and doing a little extra work to narrow its focus to doing a particular task. A generic LLM can do many things, including write limericks; but if you want it to consistently write high-quality limericks, you’ll need to fine-tune it by showing it a few dozen or a few hundred examples of them.

From that point on it will be specialized for limerick production, and might consequently be less useful for other tasks.

This is how ChatGPT was created. After GPT-3.5 or GPT-4 was finished training, engineers did additional fine-tuning work that led to a model that was especially good at having open-ended interactions with users.

What does ChatGPT mean for Customer Service?

Given that ChatGPT is useful for customer interactions, how might it be deployed in customer service? We believe that a good list of initial use cases includes question answering, personalizing responses to different customers, summarizing important information, translating between languages, and performing sentiment analysis.

This is certainly not everything current and future versions of ChatGPT will be able to do for customer service, but we think it’s a good place to start.

Question Answering

Question answering has long been of such interest to machine learning engineers that there’s a whole bespoke dataset specifically for it (the Stanford Question Answering Dataset, or SQuAD).

It’s not hard to see why. Humans can obviously answer questions, but there are so many possible questions that there’s just no way to get to it all. What if you’d like high-level summaries of all the major research papers published about an obscure scientific sub-discipline? What if you’d like to see how the tone of Victorian-era English novels changed over time? There are only so many person-hours that can go toward digging into queries like this.

Customers, too, have many questions, and answering them takes a lot of time. You could collect all the frequently asked questions and put them into a single document for easy reference, but there are still going to be areas of confusion and requests for clarification (and that’s not even considering the fraction of users who never make it to your FAQ page in the first place).

Automating the process of asking questions is an obvious place to utilize technology like ChatGPT. It’ll never get frustrated answering the same thing thousands of times, it’ll never lose its patience, it’ll never sleep, and it’ll never take a bathroom break.

Vanilla ChatGPT is pretty good at doing this already, and there are already many projects focused on getting it to answer questions about a particular company’s documentation.

This functionality will enable you to field an effectively unlimited number of customer questions while freeing up your contact center agents to tackle more important issues.

Onboarding New Hires

Customers are not the only people who might have questions about your product – new hires unfamiliar with your process for doing things might also have their fair share of confusion.

Even in companies that are very conscious about documentation, there can often be so much to get through that new employees – who already have a lot going on – can feel overwhelmed.

A large language model trained to answer questions about your documentation will be a godsend to the fresh troops you’ve brought in.

Summarization

A related task is summarizing email threads, important technical documents, or even videos.

Just as you can’t realistically expect every customer to assiduously look through all your company’s documents, it’s usually not realistic to expect that all of your own employees will do so either.

Here, too, is a place where ChatGPT can be useful. It’s quite good at taking a lengthy bit of text and summarizing it, so there’s no reason it can’t be used to keep your teams up to speed on what’s happening in parts of the organization that they don’t interact with all that often.

If your engineers don’t want to go over an exchange between product designers, or your marketing team doesn’t want all the details of a conversation between the data scientists, ChatGPT can be used to create summaries of these interactions for easier reading.

This way everyone knows what’s going on throughout the company without needing to spend hours every day staying abreast of evolving issues.

At Quiq, we’ve developed proprietary ways to harness ChaGPT’s generative abilities to summarize conversations for your contact center agents.

Sentiment Analysis

Finally, another way in which ChatGPT will power the contact center of the future is with sentiment analysis. Sentiment analysis refers to a branch of machine learning aimed at parsing the overall tone of a piece of text. This can be more subtle than you might think.

“I hate this restaurant” is pretty unambiguous, but what about a review like “Yeah, we loved this restaurant, we had plenty of time to chat because the food took an hour to come out, and since my enchilada was frozen it counteracted my usual inability to eat spicy food”? You and I can hear the implied eye-rolling in this text, but a machine won’t necessarily be able to unless it’s very powerful.

This matters for contact centers because you need to understand how people are talking about your product, whether that’s in online reviews, internal tickets, or during conversations with your agents.

And ChatGPT can help. It’s not only quite good at sentiment analysis, but it’s also better than quite a lot of alternative machine-learning approaches to sentiment analysis, even without fine-tuning.

(Note, however, that these tests compare it to relatively simple machine learning models, not to the very best deep-learning sentiment analyzers.)

Prioritizing Incoming Issues

One way that ChatGPT can add tremendous value to your contact center is in helping to prioritize issues as they come in. There are always lots of problems to solve, but they’re not all equally important. Finding the most pressing issues and marking them for resolution is a huge part of keeping your center running smoothly.

This is something that humans can do, but there’s only so much energy they can devote to this task. A properly trained generative language model, however, can handle a huge chunk of it, especially when it forms part of a broader suite of AI tools.

One way this could work is using ChatGPT for plucking out essential keywords from a customer service ticket. This by itself might be enough to help your contact center agents figure out what they should focus on, but it can be made even better if these words are then fed to a classification algorithm trained to identify urgent problems.

Real-time Language Translation

Language translation, too, is a clear use case for LLMs, and the deep learning upon which they are based has seen much success in translating from one language to another.

This is especially useful if your product or service enjoys a global audience. Many people have a passing familiarity with English but will not necessarily be able to follow a detailed procedure involving technical vocabulary, and that will be a source of frustration for them.

By substantially or totally automating real-time language translation, ChatGPT can help customers who lack English fluency to better interact with your company’s offerings, answering their questions, resolving their issues, and in general moving them along.

And in case you’re wondering, ChatGPT is currently even better than Google Translate or DeepL at most translation tasks, including tricky ones involving jokes and humor.

Fine-Tuning ChatGPT for Customer Service

So far we’ve mostly talked about ChatGPT out of the box, but we’ve also made some references to “fine-tuning” it.

In this section, we’ll flesh out our earlier comments about fine-tuning ChatGPT, and distinguish fine-tuning from related techniques, like prompt engineering.

What is Fine-Tuning ChatGPT?

Once upon a time, it was anyone’s guess as to whether you’d be able to pre-train a single large model on a dataset and then tweak it for particular applications, or whether you’d need to train a special model for every individual task.

Beginning around 2011, it became increasingly clear that for many applications, pre-training was the way to go, and since then, many techniques have been developed for doing the subsequent fine-tuning.

When you fine-tune a pre-trained generative AI model, you are effectively altering its internal structure so that it does better on the task you’re interested in. Sometimes this involves changing the whole model, other times you’re altering the last few output layers and leaving the rest of the model intact.

But what it ultimately boils down to is creating a fine-tuning pipeline through which your model sees a lot of examples of the behavior you’re trying to elicit. If you were fine-tuning it to be more polite in its follow-up questions, for example, you’d need to collect a bunch of examples of this politeness and have your model learn on them.

How many examples you end up needing will depend on your specific use case, but it’s usually a few dozen and could be as many as a few hundred.

How is Fine-Tuning Different From Prompt Engineering?

Prompt engineering refers to the practice of carefully sculpting the prompt you feed your model to do a better job of producing the output you want to see.

The reason this works is that GPT-4 and other LLMs are extremely sensitive to slight changes in the wording of their prompts. It takes a while to develop the feel required to reliably produce good results with an LLM, and all of this falls under the label of “prompt engineering”.

It’s possible to inject some light fine-tuning into prompt engineering, through one-shot and few-shot learning. One-shot learning means including one example of the behavior you want to see in your prompt, and few-shot learning is the same idea, but you’re including 2-5 examples for the LLM to learn from.

FAQs About ChatGPT for Customer Service

Now that we’ve finished our discussion of the basics of ChatGPT for customer service, we’ll spend some time addressing common questions about this subject.

Can I Use ChatGPT for Customer Service?

Yes! ChatGPT is ideal for customer service applications, but you need to fine-tune ChatGPT on your own company’s documentation or to get it to strike the right tone. With the right guardrails, it’s a powerful tool for those looking to build a forward-looking contact center.

What are the Examples of ChatGPT in Customer Service?

ChatGPT can be used for customer service tasks like question answering, sentiment analysis, translating between natural languages, and summarizing documents. These are all time-intensive tasks, the automation of which will free up your contact center agents to focus on higher-priority work.

Can you Automate Customer Service?

Tools like AutoGPT and SuperAGI are making it easier than ever to create and manage sophisticated agents capable of handling open-ended tasks. Still, artificial intelligence is not yet flexible enough to entirely automate customer service at present.

It can be used to automate substantial parts of customer service, like answering user questions, but for the moment the lion’s share of the work must still be done by flesh-and-blood human beings.

If you’re interested in developments in this space, be sure to follow the Quiq blog for updates.

ChatGPT and the Contact Center of the Future

ChatGPT and related technologies are already changing the way contact centers function. From automated translation to helping field dramatically more questions per hour, they are helping contact center agents be more productive and reducing organizational turnover.

The Quiq platform is an excellent tool for incorporating conversational AI into your offering, without having to hire a team or manage your own infrastructure. Quiq can help you automate text messaging, handle real-time translation, and track the performance of your AI Assistants to see where improvements need to be made.

Exploring Cutting-Edge Research in Large Language Models and Generative AI

By the calendar, ChatGPT was released just a few months ago. But subjectively, it feels as though 600 years have passed since we all read “as a large language model…” for the first time.

The pace of new innovations is staggering, but we at Quiq like to help our audience in the customer experience and contact center industries stay ahead of the curve (even when that requires faster-than-light travel).

Today, we will look at what’s new in generative AI, and what will be coming down the line in the months ahead.

Where will Generative AI be applied?

First, let’s start with industries that will be strongly impacted by generative AI. As we noted in an earlier article, training a large language model (LLM) like ChatGPT mostly boils down to showing it tons of examples of text until it learns a statistical representation of human language well enough to generate sonnets, email copy, and many other linguistic artifacts.

There’s no reason the same basic process (have it learn it from many examples and then create its own) couldn’t be used elsewhere, and in the next few sections, we’re going to look at how generative AI is being used in a variety of different industries to brainstorm structures, new materials, and a billion other things.

Generative AI in Building and Product Design

If you’ve had a chance to play around with DALL-E, Midjourney, or Stable Diffusion, you know that the results can be simply remarkable.

It’s not a far leap to imagine that it might be useful for quickly generating ideas for buildings and products.

The emerging field of AI-generated product design is doing exactly this. With generative image models, designers can use text prompts to rough out ideas and see them brought to life. This allows for faster iteration and quicker turnaround, especially given that creating a proof of concept is one of the slower, more tedious parts of product design.

Image source: Board of Innovation

 

For the same reason, these tools are finding use among architects who are able to quickly transpose between different periods and styles, see how better lighting impacts a room’s aesthetic, and plan around themes like building with eco-friendly materials.

There are two things worth pointing out about this process. First, there’s often a learning curve because it can take a while to figure out prompt engineering well enough to get a compelling image. Second, there’s a hearty dose of serendipity. Often the resulting image will not be quite what the designer had in mind, but it’ll be different in new and productive ways, pushing the artist along fresh trajectories that might never have occurred to them otherwise.

Generative AI in Discovering New Materials

To quote one of America’s most renowned philosophers (Madonna), we’re living in a material world. Humans have been augmenting their surroundings since we first started chipping flint axes back in the Stone Age; today, the field of materials science continues the long tradition of finding new stuff that expands our capabilities and makes our lives better.

This can take the form of something (relatively) simple like researching a better steel alloy, or something incredibly novel like designing a programmable nanomaterial.

There’s just one issue: it’s really, really difficult to do this. It takes a great deal of time, energy, and effort to even identify plausible new materials, to say nothing of the extensive testing and experimenting that must then follow.

Materials scientists have been using machine learning (ML) in their process for some time, but the recent boom in generative AI is driving renewed interest. There are now a number of projects aimed at e.g. using variational autoencoders, recurrent neural networks, and generative adversarial networks to learn a mapping between information about a material’s underlying structure and its final properties, then using this information to create plausible new materials.

It would be hard to overstate how important the use of generative AI in materials science could be. If you imagine the space of possible molecules as being like its own universe, we’ve explored basically none of it. What new fabrics, medicines, fuels, fertilizers, conductors, insulators, and chemicals are waiting out there? With generative AI, we’ve got a better chance than ever of finding out.

Generative AI in Gaming

Gaming is often an obvious place to use new technology, and that’s true for generative AI as well. The principles of generative design we discussed two sections ago could be used in this context to flesh out worlds, costumes, weapons, and more, but it can also be used to make character interactions more dynamic.

From Navi trying to get our attention in Ocarina of Time to GlaDOS’s continual reminders that “the cake is a lie” in Portal, non-playable characters (NPCs) have always added texture and context to our favorite games.

Powered by LLMs, these characters may soon be able to have open-ended conversations with players, adding more immersive realism to the gameplay. Rather than pulling from a limited set of responses, they’d be able to query LLMs to provide advice, answer questions, and shoot the breeze.

What’s Next in Generative AI?

As impressive as technologies like ChatGPT are, people are already looking for ways to extend their capabilities. Now that we’ve covered some of the major applications of generative AI, let’s look at some of the exciting applications people are building on top of it.

What is AutoGPT and how Does it Work?

ChatGPT can already do things like generate API calls and build simple apps, but as long as a human has to actually copy and paste the code somewhere useful, its capacities are limited.

But what if that weren’t an issue? What if it were possible to spin ChatGPT up into something more like an agent, capable of semi-autonomously interacting with software or online services to complete strings of tasks?

This is exactly what Auto-GPT is intended to accomplish. Auto-GPT is an application built by developer Toran Bruce Richards, and it is comprised of two parts: an LLM (either GPT-3.5 or GPT-4), and a separate “bot” that works with the LLM.

By repeatedly querying the LLM, the bot is able to take a relatively high-level task like “help me set up an online business with a blog and a website” or “find me all the latest research on quantum computing”, decompose it into discrete, achievable steps, then iteratively execute them until the overall objective is achieved.

At present, Auto-GPT remains fairly primitive. Just as ChatGPT can get stuck in repetitive and unhelpful loops, so too can Auto-GPT. Still, it’s a remarkable advance, and it’s spawned a series of other projects attempting to do the same thing in a more consistent way.

The creators of AssistGPT bill it as a “General Multi-modal Assistant that can Plan, Execute, Inspect, and Learn”. It handles multi-modal tasks (i.e. tasks that rely on vision or sound and not just text) better than Auto-GPT, and by integrating with a suite of tools it is able to achieve objectives that involve many intermediate steps and sub-tasks.

SuperAGI, in turn, is just as ambitious. It’s a platform that offers a way to quickly create, deploy, manage, and update autonomous agents. You can integrate them into applications like Slack or vector databases, and it’ll even ping you if an agent gets stuck somewhere and starts looping unproductively.

Finally, there’s LangChain, which is a similar idea. LangChain is a framework that is geared towards making it easier to build on top of LLMs. It features a set of primitives that can be stitched into more robust functionality (not unlike “for” and “while” loops in programming languages), and it’s even possible to build your own version of AutoGPT using LangChain.

What is Chain-of-Thought Prompting and How Does it Work?

In the misty, forgotten past (i.e. 5 months ago), LLMs were famously bad at simple arithmetic. They might be able to construct elegant mathematical proofs, but if you asked them what 7 + 4 is, there was a decent chance they’d get it wrong.

Chain-of-thought (COT) prompting refers to a few-shot learning method of eliciting output from an LLM that compels it to reason in a step-by-step way, and it was developed in part to help with this issue. This image from the original Wei et al. (2022) paper illustrates how:

Input and output examples for Standard and Chain-of-thought Prompting.
Source: ARXIV.org

As you can see, the model’s performance is improved because it’s being shown a chain of different thoughts, hence chain-of-thought.

This technique isn’t just useful for arithmetic, it can be utilized to get better output from a model in a variety of different tasks, including commonsense and symbolic reasoning.

In a way, humans can be prompt engineered in the same fashion. You can often get better answers out of yourself or others through a deliberate attempt to reason slowly, step-by-step, so it’s not a terrible shock that a large model trained on human text would benefit from the same procedure.

The Ecosystem Around Generative AI

Though cutting-edge models are usually the stars of the show, the truth is advanced technologies aren’t worth much if you have to be deeply into the weeds to use them. Machine learning, for example, would surely be much less prevalent if tools like sklearn, Tensorflow, and Keras didn’t exist.

Though we’re still in the early days of LLMs, AutoGPT, and everything else we’ve discussed, we suspect the same basic dynamic will play out. Since it’s now clear that these models aren’t toys, people will begin building infrastructure around them that streamlines the process of training them for specific use cases, integrating them into existing applications, etc.

Let’s discuss a few efforts in this direction that are already underway.

Training and Education

Among the simplest parts of the emerging generative AI value chain is exactly what we’re doing now: talking about it in an informed way. Non-specialists will often lack the time, context, and patience required to sort the real breakthroughs from the hype, so putting together blog posts, tutorials, and reports that make this easier is a real service.

Making Foundation Models Available

“Foundation models” is a new term that refers to the actual algorithms that underlie LLMs. ChatGPT, for example, is not a foundation model. GPT-4 is the foundation model, and ChatGPT is a specialized application of it (more on this shortly).

Companies like Anthropic, Google, and OpenAI can train these gargantuan models and then make them available through an API. From there, developers are able to access their preferred foundation model over an API.

This means that we can move quickly to utilize their remarkable functionality, which wouldn’t be the case if every company had to train their own from scratch.

Building Applications Around Specific Use Cases

One of the most striking properties of ChatGPT is how amazingly general they are. They are capable of “…generating functioning web apps with just a few prompts, writing Spanish-language children’s stories about the blockchain in the style of Dr. Suess, [and] opining on the virtues and vices of major political figures”, to name but a few examples.

General-purpose models often have to be fine-tuned to perform better on a specific task, especially if they’re doing something tricky like summarizing medical documents with lots of obscure vocabulary. Alas, there is a tradeoff here, because in most cases these fine-tuned models will afterward not be as useful for generic tasks.

The issue, however, is that you need a fair bit of technical skill to set up a fine-tuning pipeline, and you need a fair bit of elbow grease to assemble the few hundred examples a model needs in order to be fine-tuned. Though this is much simpler than training a model in the first place it is still far from trivial, and we expect that there will soon be services aimed at making it much more straightforward.

LLMOps and Model Hubs

We’d venture to guess you’ve heard of machine learning, but you might not be familiar with the term “MLOps”. “Ops” means “operations”, and it refers to all the things you have to do to use a machine learning model besides just training it. Once a model has been trained it has to be monitored, for example, because sometimes its performance will begin to inexplicably degrade.

The same will be true of LLMs. You’ll need to make sure that the chatbot you’ve deployed hasn’t begun abusing customers and damaging your brand, or that the deep learning tool you’re using to explore new materials hasn’t begun to spit out gibberish.

Another phenomenon from machine learning we think will be echoed in LLMs is the existence of “model hubs”, which are places where you can find pre-trained or fine-tuned models to use. There certainly are carefully guarded secrets among technologists, but on the whole, we’re a community that believes in sharing. The same ethos that powers the open-source movement will be found among the teams building LLMs, and indeed there are already open-sourced alternatives to ChatGPT that are highly performant.

Looking Ahead

As they’re so fond of saying on Twitter, “ChatGPT is just the tip of the iceberg.” It’s already begun transforming contact centers, boosting productivity among lower-skilled workers while reducing employee turnover, but research into even better tools is screaming ahead.

Frankly, it can be enough to make your head spin. If LLMs and generative AI are things you want to incorporate into your own product offering, you can skip the heady technical stuff and skip straight to letting Quiq do it for you. The Quiq conversational AI platform is a best-in-class product suite that makes it much easier to utilize these technologies. Schedule a demo to see how we can help you get in on the AI revolution.

How to Evaluate Generated Text and Model Performance

Machine learning is an incredibly powerful technology. That’s why it’s being used in everything from autonomous vehicles to medical diagnoses to the sophisticated, dynamic AI Assistants that are handling customer interactions in modern contact centers.

But for all this, it isn’t magic. The engineers who build these systems must know a great deal about how to evaluate them. How do you know when a model is performing as expected, or when it has begun to overfit the data? How can you tell when one model is better than another?

This subject will be our focus today. We’ll cover the basics of evaluating a machine learning model with metrics like mean squared error and accuracy, then turn our attention to the more specialized task of evaluating the generated text of a large language model like ChatGPT.

How to Measure the Performance of a Machine Learning Model?

A machine learning model is always aimed at some task. It might be trying to fit a regression line that helps predict the future price of Bitcoin, it might be clustering documents according to their topics, or it might be trying to generate text so good it rivals that produced by humans.

How does the model know when it’s gotten the optimal line or discovered the best way to cluster documents? (And more importantly, how do you know?)

In the next few sections, we’ll talk about a few common ways of evaluating the performance of a machine-learning model. If you’re an engineer this will help you create better models yourself, and if you’re a layperson, it’ll help you better understand how the machine-learning pipeline works.

Evaluation Metrics for Regression Models

Regression is one of the two big types of basic machine learning, with the other being classification.

In tech-speak, we say that the purpose of a regression model is to learn a function that maps a set of input features to a real value (where “real” just means “real numbers”). This is not as scary as it sounds; you might try to create a regression model that predicts the number of sales you can expect given that you’ve spent a certain amount on advertising, or you might try to predict how long a person will live on the basis of their daily exercise, water intake, and diet.

In each case, you’ve got a set of input features (advertising spend or daily habits), and you’re trying to predict a target variable (sales, life expectancy).

The relationship between the two is captured by a model, and a model’s quality is evaluated with a metric. Popular metrics for regression models include the mean squared error, the root mean squared error, and the mean absolute error (though there are plenty of others if you feel like going down a nerdy rabbit hole).

The mean squared error (MSE) quantifies how good a regression model is by calculating the difference between the line and each real data point, squaring them (so that positive and negative differences don’t cancel out), and then averaging them. This gives a single number that the training algorithm can use to adjust its model – if the MSE is going down, the model is getting better, if it’s going up, it’s getting worse.

The root mean squared error (RMSE) does the exact same thing, but the final step is that you take the square root of the MSE. The big advantage here is that it converts the units of your metric back into the units you’re using in your problem (i.e. the “squared dollars” of MSE become “dollars” again, which makes it easier to think about what’s going on).

The mean absolute error (MAE) is the same basic idea, but it uses absolute values instead of squares. This also has the advantage of not penalizing outliers as much as the RMSE does. If you’ve got some outlier data point that’s far away from your model, squaring the difference will result in a bigger error than simply taking the absolute value of that difference. For this reason, it’s less sensitive to outliers in the dataset.

Evaluation Metrics for Classification Models

People tend to struggle less with understanding classification models because it’s more intuitive: you’re building something that can take a data point (the price of an item) and sort it into one of a number of different categories (i.e. “cheap”, “somewhat expensive”, “expensive”, “very expensive”).

Of course, the categories you choose will depend on the problem you’re trying to solve and the domain you’re operating in – a $100 apple is certainly “very expensive”, but a $100 dollar wedding ring…will probably get you left at the altar.

Regardless, it’s just as essential to evaluate the performance of a classification model as it is to evaluate the performance of a regression model. Some common evaluation metrics for classification models are accuracy, precision, and recall.

Accuracy is simple, and it’s exactly what it sounds like. You find the accuracy of a classification model by dividing the number of correct predictions it made by the total number of predictions it made altogether. If your classification model made 1,000 predictions and got 941 of them right, that’s an accuracy rate of 94.1% (not bad!)

Both precision and recall are subtler variants of this same idea. The precision is the number of true positives (correct classifications) divided by the sum of true positives and false positives (incorrect positive classifications). It says, in effect, “When your model thought it had identified a needle in a haystack, this is how often it was correct.”

The recall is the number of true positives divided by the sum of true positives and false negatives (incorrect negative classifications). It says, in effect “There were 200 needles in this haystack, and your model found 72% of them.”

Accuracy tells you how well your model performed overall, precision tells you how confident you can be in its positive classifications, and recall tells you how often it found the positive classifications.

(You may be wondering if this isn’t overkill. Do we really need all these different ratios? Answering that question fully would take us too far from our purpose of measuring the quality of text from generative AI models, but suffice it to say that there are trade-offs involved. Sometimes it makes more sense to focus on boosting the precision, other times getting a higher recall is more important. These are all just different tools for figuring out how to spend your limited time and energy to get a model that best solves your problem.)

Contact Us

How Can I Assess the Performance of a Generative AI Model?

Now, we arrive at the center of this article. Everything up to now has been background context that hopefully has given you a feel for how models are evaluated, because from here on out it’s a bit more abstract.

Using Reference Text for Evaluating Generative Models

When we wanted to evaluate a regression model, we started by looking at how far its predictions were from actual data points.

Well, we do essentially the same thing with generative language models. To assess the quality of text generated by a model, we’ll compare it against high-quality text that’s been selected by domain experts.

The Bilingual Evaluation Understudy (BLEU) Score

The BLEU score can be used to actually quantify the distance between the generated and reference text. It does this by comparing the amount of overlap in the n-grams [1] between the two using a series of weighted precision scores.

The BLEU score varies from 0 to 1. A score of “0” indicates that there is no n-gram overlap between the generated and reference text, and the model’s output is considered to be of low quality. A score of “1”, conversely, indicates that there is total overlap between the generated and reference text, and the model’s output is considered to be of high quality.

Comparing BLEU scores across different sets of reference texts or different natural languages is so tricky that it’s considered best to avoid it altogether.

Also, be aware that the BLEU score contains a “brevity penalty” which discourages the model from being too concise. If the model’s output is too much shorter than the reference text, this counts as a strike against it.

The Recall-Oriented Understudy for Gisting Evaluation (ROGUE) Score

Like the BLEU score, the ROGUE score is examining the n-gram overlap between an output text and a reference text. Unlike the BLEU score, however, it uses recall instead of precision.

There are three types of ROGUE scores:

  1. rogue-n: Rogue-n is the most common type of ROGUE score, and it simply looks at n-gram overlap, as described above.
  2. rogue-l: Rogue-l looks at the “Longest Common Subsequence” (LCS), or the longest chain of tokens that the reference and output text share. The longer the LCS, of course, the more the two have in common.
  3. rogue-s: This is the least commonly-used variant of the ROGUE score, but it’s worth hearing about. Rogue-s concentrates on the “skip-grams” [2] that the two texts have in common. Rogue-s would count “He bought the house” and “He bought the blue house” as overlapping because they have the same words in the same order, despite the fact that the second sentence does have an additional adjective.

The Metric for Evaluation of Translation with Explicit Ordering (METEOR) Score

The METEOR Score takes the harmonic mean of the precision and recall scores for 1-gram overlap between the output and reference text. It puts more weight on recall than on precision, and it’s intended to address some of the deficiencies of the BLEU and ROGUE scores while maintaining a pretty close match to how expert humans assess the quality of model-generated output.

BERT Score

At this point, it may have occurred to you to wonder whether the BLEU and ROGUE scores are actually doing a good job of evaluating the performance of a generative language model. They look at exact n-gram overlaps, and most of the time, we don’t really care that the model’s output is exactly the same as the reference text – it needs to be at least as good, without having to be the same.

The BERT score is meant to address this concern through contextual embeddings. By looking at the embeddings behind the sentences and comparing those, the BERT score is able to see that “He quickly ate the treats” and “He rapidly consumed the goodies” are expressing basically the same idea, while both the BLEU and ROGUE scores would completely miss this.

Final thoughts.

We’ve all seen what generative AI can do, and it’s fair at this point to assume this technology is going to become more prevalent in fields like software engineering, customer service, customer experience, and marketing.

But, as magical as generative AI might seem to be, they’re just models. They have to be evaluated and monitored just like any other, or you risk having a bad one negatively impact your brand.

If you’re enchanted by the potential of using generative algorithms in your contact center but are daunted by the challenge of putting together an engineering team, reach out to us for a demo of the Quiq conversational CX platform. We can help you put this cutting-edge technology to work without having to worry about all the finer details and resourcing issues.

***

Footnotes

[1] An n-gram is just a sequence of characters, words, or entire sentences. A 1-gram is usually single words, a 2-gram is usually two words, etc.
[2] Skip-grams are a rather involved subdomain of natural language processing. You can read more about them in this article, but frankly, most of it is irrelevant to this article. All you need to know is that the rogue-s score is set up to be less concerned with exact n-gram overlaps than the alternatives.

How to Get the Most out of Your NLP Models with Preprocessing

Along with computer vision, natural language processing (NLP) is one of the great triumphs of modern machine learning. While ChatGPT is all the rage and large language models (LLMs) are drawing everyone’s attention, that doesn’t mean that the rest of the NLP field just goes away.

NLP endeavors to apply computation to human-generated language, whether that be the spoken word or text existing in places like Wikipedia. There are any number of ways in which this would be relevant to customer experience and service leaders, including:

Today, we’re going to briefly touch on what NLP is, but we’ll spend the bulk of our time discussing how textual training data can be preprocessed to get the most out of an NLP system. There are a few branches of NLP, like speech synthesis and text-to-speech, which we’ll be omitting.

Armed with this context, you’ll be better prepared to evaluate using NLP in your business (though if you’re building customer-facing chatbots, you can also let the Quiq platform do the heavy lifting for you).

What is Natural Language Processing?

In the past, we’ve jokingly referred to NLP as “doing computer stuff with words after you’ve tricked them into being math.” This is meant to be humorous, but it does capture the basic essence.

Remember, your computer doesn’t know what words are, all it does is move 1’s and 0’s around. A crucial step in most NLP applications, therefore, is creating a numerical representation out of the words in your training corpus.

There are many ways of doing this, but today a popular method is using word vector embeddings. Also known simply as “embeddings”, these are vectors of real numbers. They come from a neural network or a statistical algorithm like word2vec and stand in for particular words.

The technical details of this process don’t concern us in this post, what’s important is that you end up with vectors that capture a remarkable amount of semantic information. Words with similar meanings also have similar vectors, for example, so you can do things like find synonyms for a word by finding vectors that are mathematically close to it.

These embeddings are the basic data structures used across most of NLP. They power sentiment analysis, topic modeling, and many other applications.

For most projects it’s enough to use pre-existing word vector embeddings without going through the trouble of generating them yourself.

Are Large Language Models Natural Language Processing?

Large language models (LLMs) are a subset of natural language processing. Training an LLM draws on many of the same techniques and best practices as the rest of NLP, but NLP also addresses a wide variety of other language-based tasks.

Conversational AI is a great case in point. One way of building a conversational agent is by hooking your application up to an LLM like ChatGPT, but you can also do it with a rules-based approach, through grounded learning, or with an ensemble that weaves together several methods.

Getting the Most out of Your NLP Models with Preprocessing

Data Preprocessing for NLP

If you’ve ever sent a well-meaning text that was misinterpreted, you know that language is messy. For this reason, NLP places special demands on the data engineers and data scientists who must transform text in various ways before machine learning algorithms can be trained on it.

In the next few sections, we’ll offer a fairly comprehensive overview of data preprocessing for NLP. This will not cover everything you might encounter in the course of preparing data for your NLP application, but it should be more than enough to get started.

Why is Data Preprocessing Important?

They say that data is the new oil, and just as you can’t put oil directly in your gas tank and expect your car to run, you can’t plow a bunch of garbled, poorly-formatted language data into your algorithms and expect magic to come out the other side.

But what, precisely, counts as preprocessing will depend on your goals. You might choose to omit or include emojis, for example, depending on whether you’re training a model to summarize academic papers or write tweets for you.

That having been said, there are certain steps you can almost always expect to take, including standardizing the case of your language data, removing punctuation, white spaces and stop words, segmenting and tokenizing, etc.

We treat each of these common techniques below.

Segmentation and Tokenization

An NLP model is always trained on some consistent chunk of the full data. When ChatGPT was trained, for example, they didn’t put the entire internet in a big truck and back it up to a server farm, they used self-supervised learning.

Simplifying greatly, this means that the underlying algorithm would take, say, the first few three sentences of a paragraph and then try to predict the remaining sentence on the basis of the text that came before. Over time it sees enough language to guess that “to be or not to be, that is ___ ________” ends with “the question.”

But how was ChatGPT shown the first three sentences? How does that process even work?

A big part of the answer is segmentation and tokenization.

With segmentation, we’re breaking a full corpus of training text – which might contain hundreds of books and millions of words – down into units like words or sentences.

This is far from trivial. In English, sentences end with a period, but words like “Mr.” and “etc.” also contain them. It can be a real challenge to divide text into sentences without also breaking “Mr. Smith is cooking the steak.” into “Mr.” and “Smith is cooking the steak.”

Tokenization is a related process of breaking a corpus down into tokens. Tokens are sometimes described as words, but in truth they can be words, short clusters of a few words, sub-words, or even individual characters.

This matters a lot to the training of your NLP model. You could train a generative language model to predict the next sentence based on the preceding sentences, the next word based on the preceding words, or the next character based on the preceding characters.

Regardless, in both segmentation and tokenization, you’re decomposing a whole bunch of text down into individual units that your algorithm can work with.

Making the Case Consistent

It’s standard practice to make the case of your text consistent throughout, as this makes training simpler. This is usually done by lowercasing all the text, though we suppose if you’re feeling rebellious there’s no reason you couldn’t uppercase it (but the NLP engineers might not invite you to their fun Natural Language Parties if you do.)

Fixing Misspellings

NLP, like machine learning more generally, is only as good as its data. If you feed it text with a lot of errors in spelling, it will learn those errors and they’ll show up again later.

This probably isn’t something you’ll want to do manually, and if you’re using a popular language there’s likely a module you can use to do this for you. Python, for example, has TextBlob, Autocorrect, and Pyspellchecker libraries that can handle spelling errors.

Getting Rid of the Punctuation Marks

Natural language tends to have a lot of punctuation, with English utilizing dozens of marks such as ‘!’ and ‘;’ for emphasis and clarification. These are usually removed as part of preprocessing.

This task is something that can be handled with regular expressions (if you have the patience for it…), or you can do it with an NLP library like Natural Language Toolkit (NLTK).

Expanding the Contractions

Contractions are shortened versions of words, like turning “do not” into “don’t” or “would not” into “wouldn’t”. These, too, can be problematic for NLP algorithms and are usually removed during preprocessing.

Stemming

In linguistics, the stem of a word is its root. The words “runs”, “ran”, and “running” all have the word “run” as their base.

Stemming is one of two approaches for reducing the myriad tenses of a word down into a single basic representation. The other is lemmatization, which we’ll discuss in the next section.

Stemming is the cruder of the two, and is usually done with an algorithm known as Porter’s Stemmer. This stemmer doesn’t always produce the stem you’d expect. “Cats” becomes “cat” while “ponies” becomes “poni”, for example. Nevertheless, this is probably sufficient for basic NLP tasks.

Lemmatization

A more sophisticated version of stemming is lemmatization. A stemmer wouldn’t know the difference between the word “left” in “cookies are ahead and to the left” and “he left the book on the table”, whereas a lemmatizer would.

More generally, a lemmatizer uses language-specific context to handle very subtle distinctions between words, and this means it will usually take longer to run than a stemmer.

Whether it makes sense to use a stemmer or a lemmatizer will depend on the use case you’re interested in. Under most circumstances, lemmatizers are more accurate, and stemmers are faster.

Removing Extra White Spaces

It’ll often be the case that a corpus will have an inconsistent set of spacing conventions. This, too, is something algorithm will learn unless it’s remedied during preprocessing.
Removing Stopwords

This is a big one. “Stopwords” are words like “the” or “is” are all stopwords, and they’re almost always removed before training begins because they don’t add much in the way of useful information.

Because this is done so commonly, you can assume that the NLP library you’re using will have some easy way of doing it. NLTK, for example, has a native list of stopwords that can simply be imported:

from nltk.corpus import stopwords

With this, you can simply exclude the stopwords from the corpus.

Ditching the Digits

If you’re building an NLP application that processes data containing numbers, you’ll probably want to remove that as the training algorithm might end up inserting random digits here and there.

This, alas, is something that will probably need to be done with regular expressions.

Part of Speech Tagging

Part of speech tagging refers to the process of automatically tagging a word with extra grammatical information about whether it’s a noun, verb, etc.

This is certainly not something that you always have to do (we’ve completed a number of NLP projects where it never came up), but it’s still worth understanding what it is.

Supercharging Your NLP Applications

Natural language processing is an enormously powerful constellation of techniques that allow computers to do worthwhile work on text data. It can be used to build question-answering systems, tutors, chatbots, and much more.

But to get the most out of it, you’ll need to preprocess the data. No matter how much computing you have access to, machine learning isn’t of much use with bad data. Techniques like removing stopwords, expanding contractions, and lemmatization create corpora of text that can then be fed to NLP algorithms.

Of course, there’s always an easier way. If you’d rather skip straight to the part where cutting-edge conversational AI directly adds value to your business, you can also reach out to see what the Quiq platform can do.

What Is Transfer Learning? – The Role of Transfer Learning in Building Powerful Generative AI Models

Machine learning is hard work. Sure, it only takes a few minutes to knock out a simple tutorial where you’re training an image classifier on the famous iris dataset, but training a big model to do something truly valuable – like interacting with customers over a chat interface – is a much greater challenge.

Transfer learning offers one possible solution to this problem. By making it possible to train a model in one domain and reuse it in another, transfer learning can reduce demands on your engineering team by a substantial amount.

Today, we’re going to get into transfer learning, defining what it is, how it works, where it can be applied, and the advantages it offers.

Let’s get going!

What is Transfer Learning in AI?

In the abstract, transfer learning refers to any situation in which knowledge from one task, problem, or domain is transferred to another. If you learn how to play the guitar well and then successfully use those same skills to pick up a mandolin, that’s an example of transfer learning.

Speaking specifically about machine learning and artificial intelligence, the idea is very similar. Transfer learning is when you pre-train a model on one task or dataset and then figure out a way to reuse it for another (we’ll talk about methods later).

If you train an image model, for example, it will tend to learn certain low-level features (like curves, edges, and lines) that show up in pretty much all images. This means you could fine-tune the pre-trained model to do something more specialized, like recognizing faces.

Why Transfer Learning is Important in Deep Learning Models

Building a deep neural network requires serious expertise, especially if you’re doing something truly novel or untried.

Transfer learning, while far from trivial, is simply not as taxing. GPT-4 is the kind of project that could only have been tackled by some of Earth’s best engineers, but setting up a fine-tuning pipeline to get it to do good sentiment analysis is a much simpler job.

By lowering the barrier to entry, transfer learning brings advanced AI into reach for a much broader swath of people. For this reason alone, it’s an important development.

Transfer Learning vs. Fine-Tuning

And speaking of fine-tuning, it’s natural to wonder how it’s different from transfer learning.

The simple answer is that fine-tuning is a kind of transfer learning. Transfer learning is a broader concept, and there are other ways to approach it besides fine-tuning.

What are the 5 Types of Transfer Learning?

Broadly speaking, there are five major types of transfer learning, which we’ll discuss in the following sections.

Domain Adaptation

Under the hood, most modern machine learning is really just an application of statistics to particular datasets.

The distribution of the data a particular model sees, therefore, matters a lot. Domain adaptation refers to a family of transfer learning techniques in which a model is (hopefully) trained such that it’s able to handle a shift in distributions from one domain to another (see section 5 of this paper for more technical details).

Domain Confusion

Earlier, we referenced the fact that the layers of a neural network can learn representations of particular features – one layer might be good at detecting curves in images, for example.

It’s possible to structure our training such that a model learns more domain invariant features, i.e. features that are likely to show up across multiple domains of interest. This is known as domain confusion because, in effect, we’re making the domains as similar as possible.

Multitask Learning

Multitask learning is arguably not even a type of transfer learning, but it came up repeatedly in our research, so we’re adding a section about it here.

Multitask learning is what it sounds like; rather than simply training a model on a single task (i.e. detecting humans in images), you attempt to train it to do several things at once.

The debate about whether multitask learning is really transfer learning stems from the fact that transfer learning generally revolves around adapting a pre-trained model to a new task, rather than having it learn to do more than one thing at a time.

One-Shot Learning

One thing that distinguishes machine learning from human learning is that the former requires much more data. A human child will probably only need to see two or three apples before they learn to tell apples from oranges, but an ML model might need to see thousands of examples of each.

But what if that weren’t necessary? The field of one-shot learning addresses itself to the task of learning e.g. object categories from either one example or a small number of them. This idea was pioneered in “One-Shot Learning of Object Categories”, a watershed paper co-authored by Fei-Fei Li and her collaborators. Their Bayesian one-shot learner was able to “…to incorporate prior knowledge of the object world into the learning scheme”, and it outperformed a variety of other models in object recognition tasks.

Zero-Shot Learning

Of course, there might be other tasks (like translating a rare or endangered language), for which it is effectively impossible to have any labeled data for a model to train on. In such a case, you’d want to use zero-shot learning, which is a type of transfer learning.

With zero-shot learning, the basic idea is to learn features in one data set (like images of cats) that allow successful performance on a different data set (like images of dogs). Humans have little problem with this, because we’re able to rapidly learn similarities between types of entities. We can see that dogs and cats both have tails, both have fur, etc. Machines can perform the same feat if the data is structured correctly.

How Does Transfer Learning Work?

There are a few different ways you can go about utilizing transfer learning processes in your own projects.

Perhaps the most basic is to use a good pre-trained model off the shelf as a feature extractor. This would mean keeping the pre-trained model in place, but then replacing its final layer with a layer custom-built for your purposes. You could take the famous AlexNet image classifier, remove its last classification layer, and replace it with your own, for example.

Or, you could fine-tune the pre-trained model instead. This is a more involved engineering task and requires that the pre-trained model be modified internally to be better suited to a narrower application. This will often mean that you have to freeze certain layers in your model so that the weights don’t change, while simultaneously allowing the weights in other layers to change.

What are the Applications of Transfer Learning?

As machine learning and deep learning have grown in importance, so too has transfer learning become more crucial. It now shows up in a variety of different industries. The following are some high-level indications of where you might see transfer learning being applied.

Speech recognition across languages: Teaching machines to recognize and process spoken language is an important area of AI research and will be of special interest to those who operate contact centers. Transfer learning can be used to take a model trained in a language like French and repurpose it for Spanish.

Training general-purpose game engines: If you’ve spent any time playing games like chess or go, you know that they’re fairly different. But, at a high enough level of abstraction, they still share many features in common. That’s why transfer learning can be used to train up a model on one game and, under certain conditions, use it in another.

Object recognition and segmentation: Our Jetsons-like future will take a lot longer to get here if our robots can’t learn to distinguish between basic objects. This is why object recognition and object segmentation are both such important areas of research. Transfer learning is one way of speeding up this process. If models can learn to recognize dogs and then quickly be re-purposed for recognizing muffins, then we’ll soon be able to outsource both pet care and cooking breakfast.

transfer_learning_chihuahua
In fairness to the AI, it’s not like we can really tell them apart!

Applying Natural Language Processing: For a long time, computer vision was the major use case of high-end, high-performance AI. But with the release of ChatGPT and other large language models, NLP has taken center stage. Because much of the modern NLP pipeline involves word vector embeddings, it’s often possible to use a baseline, pre-trained NLP model in applications like topic modeling, document classification, or spicing up your chatbot so it doesn’t sound so much like a machine.

What are the Benefits of Transfer Learning?

Transfer learning has become so popular precisely because it offers so many advantages.

For one thing, it can dramatically reduce the amount of time it takes to train a new model. Because you’re using a pre-trained model as the foundation for a new, task-specific model, far fewer engineering hours have to be spent to get good results.

There are also a variety of situations in which transfer learning can actually improve performance. If you’re using a good pre-trained model that was trained on a general enough dataset, many of the features it learned will carry over to the new task.

This is especially true if you’re working in a domain where there is relatively little data to work with. It might simply not be possible to train a big, cutting-edge model on a limited dataset, but it will often be possible to use a pre-trained model that is fine-tuned on that limited dataset.

What’s more, transfer learning can work to prevent the ever-present problem of overfitting. Overfitting has several definitions depending on what resource you consult, but a common way of thinking about it is when the model is complex enough relative to the data that it begins learning noise instead of just signal.

That means that it may do spectacularly well in training only to generalize poorly when it’s shown fresh data. Transfer learning doesn’t completely rule out this possibility, but it makes it less likely to happen.

Transfer learning also has the advantage of being quite flexible. You can use transfer learning for everything from computer vision to natural language processing, and many domains besides.

Relatedly, transfer learning makes it possible for your model to expand into new frontiers. When done correctly, a pre-trained model can be deployed to solve an entirely new problem, even when the underlying data is very different from what it was shown before.

When To Use Transfer Learning

The list of benefits we just enumerated also offers a clue as to when it makes sense to use transfer learning.

Basically, you should consider using transfer learning whenever you have limited data, limited computing resources, or limited engineering brain cycles you can throw at a problem. This will often wind up being the case, so whenever you’re setting your sights on a new goal, it can make sense to spend some time seeing if you can’t get there more quickly by simply using transfer learning instead of training a bespoke model from scratch.

Check out the second video in Quiq’s LLM Intuitions series—created by our Head of AI, Kyle McIntyre—to learn about one of the oldest forms of transfer learning: Word embeddings.

Transfer Learning and You

In the contact center space, we understand how difficult it can be to effectively apply new technologies to solve our problems. It’s one thing to put together a model for a school project, and quite another to have it tactfully respond to customers who might be frustrated or confused.

Transfer learning is one way that you can get more bang for your engineering buck. By training a model on one task or dataset and using it on another, you can reduce your technical budget while still getting great results.

You could also just rely on us to transfer our decades of learning on your behalf (see what we did there). We’ve built an industry-leading conversational AI chat platform that is changing the game in contact centers. Reach out today to see how Quiq can help you leverage the latest advances in AI, without the hassle.

How Generative AI is Supercharging Contact Center Agents

If you’re reading this, you’ve probably had a chance to play around with ChatGPT or one of the other large language models (LLMs) that have been making waves and headlines in recent months.

Concerns around automation go back a long way, but there’s long been extra worry about the possibility that machines will make human labor redundant. If you’ve used generative AI to draft blog posts or answer technical questions, it’s natural to wonder if perhaps algorithms will soon be poised to replace humans in places like contact centers.

Given how new these LLMs are there has been little scholarship on how they’ve changed the way contact centers function. But “Generative AI at Work” by Erik Brynjolfsson, Danielle Li, and Lindsey R. Raymond took aim at exactly this question.

The results are remarkable. They found that access to tools like ChatGPT not only led to a marked increase in productivity among the lowest-skilled workers, it also had positive impacts on other organizational metrics, like reducing turnover.

Today, we’re going to break this economic study down, examining its methods, its conclusions, and what they mean for the contact centers of the future.

Let’s dig in!

A Look At “Generative AI At Work”

The paper studies data from the use of a conversational AI assistant by a little over 5,000 agents working in customer support.

It contains several major sections, beginning with a technical primer on what generative AI is and how it works before moving on to a discussion of the study’s methods and results.

What is Generative AI?

Covering the technical fundamentals of generative AI will inform our efforts to understand the ways in which this AI technology affected work in the study, as well as how it is likely to do so in future deployments.

A good way to do this is to first grasp how traditional, rules-based programming works, then contrast this with generative AI.

When you write a computer program, you’re essentially creating a logical structure that furnishes instructions the computer can execute.

To take a simple case, you might try to reverse a string such as “Hello world”. One way to do this explicitly is to write code in a language like Python which essentially says:

“Create a new, empty list, then start at the end of the string we gave you and work forward, successively adding each character you encounter to that list before joining all the characters into a reversed string”:

Python code demonstrating a reverse string.

Despite the fact that these are fairly basic instructions, it’s possible to weave them into software that can steer satellites and run banking infrastructure.

But this approach is not suitable for every kind of problem. If you’re trying to programmatically identify pictures of roses, for example, it’s effectively impossible to do this with rules like the ones we used to reverse the string.

Machine learning, however, doesn’t even try to explicitly define any such rules. It works instead by feeding a model many pictures of roses, and “training” it to learn a function that lets it identify new pictures of roses it has never seen before.

Generative AI is a kind of machine learning in which gargantuan models are trained on mind-boggling amounts of text data until they’re able to produce their own, new text. Generative AI is a distinct sub-branch of ML because its purpose is generation, while other kinds of models might be aimed at tasks like classification and prediction.

Is Generative AI The Same Thing As Large Language Models?

At this point, you might be wondering how whether generative AI is the same thing as LLMs. With all the hype and movement in the space, it’s easy to lose track of the terminology.

LLMs are a subset of the broader category of generative AI. All LLMs are generative AI, but there are generative algorithms that work with images, music, chess moves, and other things besides natural language.

How Did The Researchers Study the Effects of Generative AI on Work?

Now we understand that ML learns to recognize patterns, how this is different from classical computer programming, and how generative AI fits into the whole picture.

We can now get to the meat of the study, beginning with how Brynjolfsson, Li, and Raymond actually studied the use of generative AI by workers at a contact center.

The firm from which they drew their data is a Fortune 500 company that creates enterprise software. Its support agents are located mainly in the Phillippines (with a smaller number in the U.S.) to resolve customer issues via a chat interface.

Most of the agent’s job boils down to answering questions from the owners of small businesses that use the firm’s software. Their productivity is assessed via how long it takes them to resolve a given issue (“average handle time”), the fraction of total issues a given agent is able to resolve to the customer’s satisfaction (“resolution rate”), and the net number of customers who would recommend the agent (“net promoter score.”)

Line graphs showing handle time, resolution rate and customer satisfaction using AI.

The AI used by the firm is a version of GPT which has received additional training on conversations between customers and agents. It is mostly used for two things: generating appropriate responses to customers in real-time and surfacing links to the firm’s technical documentation to help answer specific questions about the software.

Bear in mind that this generative AI system is meant to help the agents in performing their jobs. It is not intended to – and is not being trained to – completely replace them. They maintain autonomy in deciding whether and how much of the AI’s suggestions to take.

How Did Generative AI Change Work?

Next, we’ll look at what the study actually uncovered.

There were four main findings, touching on how total worker productivity was impacted, whether productivity gains accrued mainly to low-skill or high-skill workers, how access to an AI tool changed learning on the job, and how the organization changed as a result.

1. Access to Generative AI Boosted Worker Productivity

First, being able to use the firm’s AI tool increased worker productivity by almost 14%. This came from three sources: a reduction in how long it took any given agent to resolve a particular issue, an expansion in the total number of resolutions an agent was able to work on in an hour, and a small jump in the fraction of chats that were completed successfully.

The firm's AI tool increased worker productivity by almost 14%

This boost happened very quickly, showing up in the first month after deployment, growing a little in the second month, and then remaining at roughly that level for the duration of the study.

2. Access to Generative AI Was Most Helpful for Lower-Skilled Agents

Intriguingly, the greatest productivity gains were seen among agents that were relatively low-skill, such as those that were new to the job, with longer-serving, higher-skilled agents seeing virtually none.

The agents in the very bottom quintile for skill level, in fact, were able to resolve 35% more calls per hour—a substantial jump.

The agents in the very bottom quintile for skill level were able to resolve more calls per hour 35%.

With the benefit of hindsight it’s tempting to see these results as obvious, but they’re not. Earlier studies have usually found that the benefits of new computing technologies accrued to the ablest workers, or led firms to raise the bar on skill requirements for different positions.

If it’s true that generative AI is primarily going to benefit less able employees, this fact alone will distinguish it from prior waves of innovation. [1]

3. Access To Generative AI Helps New Workers “Move Down the Learning Curve”

Perhaps the most philosophically interesting conclusion drawn by the study’s authors relates to how generative AI is able to partially learn the tacit knowledge of more skilled workers.

The term “tacit knowledge” refers to the hard-to-articulate behaviors you pick up as you get good at something.

Imagine trying to teach a person how to ride a bike. It’s easy enough to give broad instructions (“check your shoelaces”, “don’t brake too hard”), but there ends up being a billion little subtleties related to foot placement, posture, etc. that are difficult to get into words.

This is true for everything, and it’s part of what distinguishes masters from novices. It’s also a major reason for the fact that many professions have been resistant to full automation.

Remember our discussion of how rule-based programming is poorly suited to tasks where the rules are hard to state? Well, that applies to tasks involving a lot of tacit knowledge. If no one, not even an expert, can tell you precisely what steps to take to replicate their results, then no one is going to be able to program a computer to do it either.

But ML and generative AI don’t face this restriction. With data sets that are big enough and rich enough, the algorithms might be able to capture some of the tacit knowledge expert contact center agents have, e.g. how they phrase replies to customers.

This is suggested by the study’s results. By analyzing the text of customer-agent interactions, the authors found that novice agents using generative AI were able to sound more like experienced agents, which contributed to their success.

4. Access to Generative AI Changed the Way the Organization Functioned

Organizations are profoundly shaped by their workers, and we should expect to see organization-level changes when a new technology dramatically changes how employees operate.

Two major findings from the study were that employee turnover was markedly reduced and there were far fewer customers “escalating” an issue by asking to speak to a supervisor. This could be because agents using generative AI were overall treated much better by customers (who have been known to become frustrated and irate), leading to less stress.

The Contact Center of the Future

Generative AI has already impacted many domains, and this trend will likely only continue going forward. “Generative AI At Work” provides a fascinating glimpse into the way that this technology changed a large contact center by boosting productivity among the least-skilled agents, helping disseminate the hard-won experience of the most-skilled agents, and overall reducing turnover and dissatisfaction.

If this piece has piqued your curiosity about how you can use advanced AI tools for customer-facing applications, schedule a demo of the Quiq conversational CX platform today.

From resolving customer complaints with chatbots to automated text-message follow-ups, we’ve worked hard to build a best-in-class solution for businesses that want to scale with AI.

Let’s see what we can do for you!

[1] See e.g. this quote: “Our paper is related to a large literature on the impact of various forms of technological adoption on worker productivity and the organization of work (e.g. Rosen, 1981; Autor et al., 1998; Athey and Stern, 2002; Bresnahan et al., 2002; Bartel et al., 2007; Acemoglu et al., 2007; Hoffman et al., 2017; Bloom et al., 2014; Michaels et al., 2014; Garicano and Rossi-Hansberg, 2015; Acemoglu and Restrepo, 2020). Many of these studies, particularly those focused on information technologies, find evidence that IT complements higher-skill workers (Akerman et al., 2015; Taniguchi and Yamada, 2022). Bartel et al. (2007) shows that firms that adopt IT tend to use more skilled labor and increase skill requirements for their workers. Acemoglu and Restrepo (2020) study the diffusion of robots and find that the negative employment effects of robots are most pronounced for workers in blue-collar occupations and those with less than a college education. In contrast, we study a different type of technology—generative AI—and find evidence that it most effectively augments lower-skill workers.”

A Guide to Fine-Tuning Pretrained Language Models for Specific Use Cases

Over the past half-year, large language models (LLMs) like ChatGPT have proven remarkably useful for a wide range of tasks, including machine translation, code analysis, and customer interactions in places like contact centers.

For all this power and flexibility, however, it is often still necessary to use fine-tuning to get an LLM to generate high-quality output for specific use cases.

Today, we’re going to do a deep dive into this process, understanding how these models work, what fine-tuning is, and how you can leverage it for your business.

What is a Pretrained Language Model?

First, let’s establish some background context by tackling the question of what pretrained models are and how they work.

The “GPT” in ChatGPT stands for “generative pretrained transformer”, and this gives us a clue as to what’s going on under the hood. ChatGPT is a generative model, meaning its purpose is to create new output; it’s pretrained, meaning that it has already seen a vast amount of text data by the time end users like us get our hands on it; and it’s a transformer, which refers to the fact that it’s built out of billions of transformer modules stacked into layers.

If you’re not conversant in the history of machine learning it can be difficult to see what the big deal is, but pretrained models are a relatively new development. Once upon a time in the ancient past (i.e. 15 or 20 years ago), it was an open question as to whether engineers would be able to pretrain a single model on a dataset and then fine-tune its performance, or whether they would need to approach each new problem by training a model from scratch.

This question was largely resolved around 2013, when image models trained on the ImageNet dataset began sweeping competitions left and right. Since then it has become more common to use pretrained models as a starting point, but we want to emphasize that this approach does not always work. There remain a vast number of important projects for which building a bespoke model is the only way to go.

What is Transfer Learning?

Transfer learning refers to when an agent or system figures out how to solve one kind of problem and then uses this knowledge to solve a different kind of problem. It’s a term that shows up all over artificial intelligence, cognitive psychology, and education theory.

Author, chess master, and martial artist Josh Waitzkin captures the idea nicely in the following passage from his blockbuster book, The Art of Learning:

“Since childhood I had treasured the sublime study of chess, the swim through ever-deepening layers of complexity. I could spend hours at a chessboard and stand up from the experience on fire with insight about chess, basketball, the ocean, psychology, love, art.”

Transfer learning is a broader concept than pretraining, but the two ideas are closely related. In machine learning, competence can be transferred from one domain (generating text) to another (translating between natural languages or creating Python code) by pretraining a sufficiently large model.

What is Fine-Tuning A Pretrained Language Model?

Fine-tuning a pretrained language model occurs when the model is repurposed for a particular task by being shown illustrations of the correct behavior.

If you’re in a whimsical mood, for example, you might give ChatGPT a few dozen limericks so that its future output always has that form.

It’s easy to confuse fine-tuning with a few other techniques for getting optimum performance out of LLMs, so it’s worth getting clear on terminology before we attempt to give a precise definition of fine-tuning.

Fine-Tuning a Language Model v.s. Zero-Shot Learning

Zero-shot learning is whatever you get out of a language model when you feed it a prompt without making any special effort to show it what you want. It’s not technically a form of fine-tuning at all, but it comes up in a lot of these conversations so it needs to be mentioned.

(NOTE: It is sometimes claimed that prompt engineering counts as zero-shot learning, and we’ll have more to say about that shortly.)

Fine-Tuning a Language Model v.s. One-Shot Learning

One-shot learning is showing a language model a single example of what you want it to do. Continuing our limerick example, one-shot learning would be giving the model one limerick and instructing it to format its replies with the same structure.

Fine-Tuning a Language Model v.s. Few-Shot Learning

Few-shot learning is more or less the same thing as one-shot learning, but you give the model several examples of how you want it to act.

How many counts as “several”? There’s no agreed-upon number that we know about, but probably 3 to 5, or perhaps as many as 10. More than this and you’re arguably not doing “few”-shot learning anymore.

Fine-Tuning a Language Model v.s. Prompt Engineering

Large language models like ChatGPT are stochastic and incredibly sensitive to the phrasing of the prompts they’re given. For this reason, it can take a while to develop a sense of how to feed the model instructions such that you get what you’re looking for.

The emerging discipline of prompt engineering is focused on cultivating this intuitive feel. Minor tweaks in word choice, sentence structure, etc. can have an enormous impact on the final output, and prompt engineers are those who have spent the time to learn how to make the most effective prompts (or are willing to just keep tinkering until the output is correct).

Does prompt engineering count as fine-tuning? We would argue that it doesn’t, primarily because we want to reserve the term “fine-tuning” for the more extensive process we describe in the next few sections.

Still, none of this is set in stone, and others might take the opposite view.

Distinguishing Fine-Tuning From Other Approaches

Having discussed prompt engineering and zero-, one-, and few-shot learning, we can give a fuller definition of fine-tuning.

Fine-tuning is taking a pretrained language model and optimizing it for a particular use case by giving it many examples to learn from. How many you ultimately need will depend a lot on your task – particularly how different the task is from the model’s training data and how strict your requirements for its output are – but you should expect it to take on the order of a few dozen or a few hundred examples.

Though it bears an obvious similarity to one-shot and few-shot learning, fine-tuning will generally require more work to come up with enough examples, and you might have to build a rudimentary pipeline that feeds the examples in through the API. It’s almost certainly not something you’ll be doing directly in the ChatGPT web interface.

Contact Us

How Can I Fine-Tune a Pretrained Language Model?

Having gotten this far, we can now turn our attention to what the fine-tuning procedure actually consists in. The basic steps are: deciding what you’re wanting to accomplish, gather the requisite data (and formatting it correctly), feeding it to your model, and evaluating the results.

Let’s discuss each, in turn.

Deciding on Your Use Case

The obvious place to begin is figuring out exactly what it is you want to fine-tune a pretrained model to do.

It may seem as though this is too obvious to be included as its own standalone step, but we’ve singled it out is because you need to think through the specifics of what you’re trying to accomplish. It’s not enough to say “We want to fine-tune this model to write tweets for us”, you have to consider questions like “Should the tone by formal or informal” and “Are we okay with it adding emojis”.

This matters because it informs the data that you gather for fine-tuning. If you feed the model irreverent tweets with lots of 🙂s and 🚀s, you can expect that to show up in the tweets it generates for you.

Picking a Base Model

Next, you have to figure out which model you want to fine-tune. This is not the place to get into a discussion on all the “foundation” models that are available today, but suffice it to say that HuggingFace, OpenAI, Anthropic, and the other companies in the pretrained model space offer a variety of different options. BigScience has BLOOM and T0pp models, OpenAI has Ada, Babbage, Da Vinci, and Curie, and so on.

Different models have different fine-tuning requirements, so this choice will impact how you structure your pipeline later.

Gathering and Formatting the Fine-Tuning Data

The heart of fine-tuning a pretrained model is gathering the samples from which it will learn. This is more or less exactly what it sounds like: you compile a list of prompts and desired responses. This is the data that will adjust the model’s internal parameters, optimizing it for your use case.

AI educator Dave Shapiro has a tutorial on fine-tuning GPT-3 to be a tutor for younger students, and here is one of the training samples he used (“TIM” is the name he’s using for the tutor):

User: Hey TIM, I’m interested in learning more about Charles Darwin. Can you tell me a bit about him?

TIM: Charles Darwin was an English naturalist and geologist who developed the theory of evolution by natural selection. He was born into a wealthy family and was able to attend the prestigious Shrewsbury School. After graduation, he enrolled at the University of Edinburgh to study medicine, but he soon dropped out and enrolled at Christ’s College, Cambridge.

User: Interesting! I didn’t know he was born into a wealthy family.

TIM: Yes, his family was very wealthy and his father was a doctor. Darwin’s grandfather, Erasmus Darwin, was a well-known scientist and wrote several books on botany and evolution.

User: Wow, I had no idea. What made Darwin decide to drop out of medical school and study at Cambridge?

TIM: Darwin was interested in natural history and had developed a keen interest in taxidermy. After dropping out of medical school, he traveled to Wales with some friends to study geology. It was during this time that he began to develop his theory of evolution.

One thing you’ll note is that this sample is pretty long. It could take a while to come up with hundreds of these to fine-tune your model, but that’s actually something a generative pretrained language model like ChatGPT can help with. Just be sure you have a human go over the samples and check them for accuracy, or you risk compromising the quality of your outputs.

Another thing to think about is how you’ll handle adversarial behavior and edge cases. If you’re training a conversational AI chatbot for a contact center, for example, you’ll want to include plenty of instances of the model calmly and politely responding to an irate customer. That way, your output will be similarly calm and polite.

Lastly, you’ll have to format the fine-tuning data according to whatever specifications are required by the base model you’re using. It’ll probably be something similar to JSON, but check the documentation to be sure.

Feeding it to Your Model

Now that you’ve got your samples ready, you’ll have to give them to the model for fine-tuning. This will involve you feeding the examples to the model via its API and waiting until the process has finished.

What is the Difference Between Fine-Tuning and a Pretrained Model?

A pretrained model is one that has been previously trained on a particular dataset or task, and fine-tuning is getting that model to do well on a new task by showing it examples of the output you want to see.

Pretrained models like ChatGPT are often pretty good out of the box, but if you’re wanting it to create legal contracts or work with highly-specialized scientific vocabulary, you’ll likely need to fine-tune it.

Should You Fine-Tune a Pretrained Model For Your Business?

Generative pretrained language models like ChatGPT and Bard have already begun to change the way businesses like contact centers function, and we think this is a trend that is likely to accelerate in the years ahead.

If you’ve been intrigued by the possibility of fine-tuning a pretrained model to supercharge your enterprise, then hopefully the information contained in this article gives you some ideas on how to begin.

Another option is to leverage the power of the Quiq platform. We’ve built a best-in-class conversational AI system that can automate substantial parts of your customer interactions (without you needing to run your own models or set up a fine-tuning pipeline.)

To see how we can help, schedule a demo with us today!

Request A Demo