
Why LLM Observability Matters (and Strategies for Getting it Right)

When integrating Large Language Models (LLMs) into applications, you can’t afford to treat them like “black boxes.” As your LLM application scales and becomes more complex, the need to monitor, troubleshoot, and understand how the LLM impacts your application becomes critical. In this article, we’ll explore the observability strategies we’ve found useful here at Quiq.

Key Elements of an Effective LLM Observability Strategy

  1. Provide Access: Encourage business users to engage actively in testing and optimization.
  2. Encourage Exploration: Make it easy to explore the application under different scenarios.
  3. Create Transparency: Clearly show how the model interacts within your application by revealing decision-making processes, system interactions, and how outputs are verified.
  4. Handle Errors Gracefully: Proactively identify and handle deviations or errors.
  5. Track System Performance: Expose metrics like response times, token usage, and errors.

LLMs add a layer of unpredictability and complexity to an application. Your observability tooling should allow you to actively explore both known and unknown issues while fostering an environment where engineers and business users can collaborate to create a new kind of application.

5 Strategies for LLM Observability

We will discuss strategies from the perspective of a real-world event. An “event” triggers an application to process input and provide output back to the world.

A few examples of events include:

  • Chat user message input > Chat response
  • An email arriving into a ticketing system > Suggested reply
  • A case being closed > Case updated for topic or other classifications

You may have heard of these events referred to as prompt chains, prompt pipelines, agentic workflows, or conversational turns. The key takeaway: an event will typically require more than a single call to an LLM. Your LLM application’s job is to orchestrate LLM prompts, data requests, decisions, and actions. The following strategies will help you understand what’s happening inside your LLM application.

1. Tracing Execution Paths

Any given event may follow different execution paths. Tracing the execution path should allow you to understand what state is set, which knowledge was retrieved, functions called, and generally how and why the LLM generated and verified the response. The ability to trace the execution path of an event will provide invaluable visibility into your application behavior.

For example, if your application delivers a message that offers a live agent, was it because the topic was sensitive, the user was frustrated, or there was a gap in the knowledge resources? Tracing the execution path will help you pinpoint the prompt, knowledge, or logic that drove the response. This is the first step in monitoring and optimizing an AI application. Your LLM observability should provide a full trace of the execution path that led to a response being delivered.
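As a minimal sketch of what such a trace could look like, the snippet below records each step of one event so that the reason for a response can be looked up afterwards. The `Trace` class, its method names, and the step labels are illustrative assumptions, not part of any particular product.

```python
import time
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Trace:
    """Ordered record of the steps taken while handling one event."""
    event_id: str
    steps: list[dict[str, Any]] = field(default_factory=list)

    def record(self, kind: str, name: str, **details: Any) -> None:
        # kind is a free-form label such as "prompt", "retrieval", or "decision".
        self.steps.append({"kind": kind, "name": name, "at": time.time(), **details})

    def why(self, kind: str) -> list[dict[str, Any]]:
        # Return every recorded step of one kind, in execution order.
        return [s for s in self.steps if s["kind"] == kind]


# Hypothetical event: the application ends up offering a live agent.
trace = Trace(event_id="evt-1")
trace.record("prompt", "classify_topic", output="billing dispute")
trace.record("retrieval", "search_knowledge", hits=0)
trace.record("decision", "offer_live_agent", reason="knowledge gap")
```

Here `trace.why("decision")` would surface that the live agent offer came from a knowledge gap rather than from user frustration.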

2. Replay Mechanisms for Faster Debugging

In real-world applications, being able to reproduce and fix errors quickly is critical. Implementing an event replay mechanism, where past events can be replayed against the current system configuration, will provide a fast feedback loop.

Replaying events also helps when modifying prompts, upgrading models, adding knowledge or editing business rules. Changing your LLM application should be done in a controlled environment where you can replay events and ensure the desired effect without introducing new issues.
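A replay mechanism can be sketched roughly as follows, assuming each recorded event stores its input and the output it originally produced. The `replay` function, the event fields, and the stand-in handler are hypothetical; a real handler would run the full prompt pipeline.

```python
from typing import Callable

# A handler takes the event input and returns the application's output.
Handler = Callable[[str], str]


def replay(recorded: list[dict], handler: Handler) -> list[dict]:
    """Re-run recorded events through `handler` and report what changed."""
    results = []
    for event in recorded:
        new_output = handler(event["input"])
        results.append({
            "id": event["id"],
            "old": event["output"],
            "new": new_output,
            "changed": new_output != event["output"],
        })
    return results


# Stand-in for the current configuration; a real handler would call an LLM.
def current_handler(text: str) -> str:
    return "escalate" if "refund" in text.lower() else "answer"


recorded_events = [
    {"id": "evt-1", "input": "Where is my order?", "output": "answer"},
    {"id": "evt-2", "input": "I want a refund", "output": "answer"},
]
report = replay(recorded_events, current_handler)
```

Events flagged as changed are the ones worth reviewing before a prompt, model, or rule change ships.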

3. State Management & Monitoring

Another key aspect of LLM observability is capturing how your application’s field values or state change during an event, as well as across related events such as a conversation. Understanding the state of different variables can help you better understand and recreate the results of your LLM application.

Many use cases will also make use of memory. You should strive to manage this memory consistently and use caching for order or product info to reduce unnecessary network calls. In addition to data caches, multi-turn conversations may react differently based on the memory state. Suppose a user types “I need help” and you have implemented a next-best-action classifier with the following options:

  • Clarify the inquiry
  • Find Information
  • Escalate to live agent

The action taken may depend on whether “I need help” is the 1st or 5th message of the conversation. The response could also depend on whether the inquiry type is something you want your live agents handling.
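To make the dependence on state concrete, here is a small rule-based sketch of the three options listed earlier. The inquiry type names, the `max_clarifications` threshold, and the fixed rules are assumptions for illustration; in practice the choice might come from an LLM classifier that is given this state.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ConversationState:
    message_count: int            # messages the user has sent so far, including this one
    inquiry_type: Optional[str]   # None until a classifier has identified it


# Hypothetical business rule: inquiry types that live agents should handle.
AGENT_INQUIRY_TYPES = {"billing_dispute", "cancellation"}


def next_best_action(state: ConversationState, max_clarifications: int = 3) -> str:
    if state.inquiry_type in AGENT_INQUIRY_TYPES:
        return "escalate_to_live_agent"
    if state.inquiry_type is None:
        # Early in the conversation, ask; after repeated vague messages, hand off.
        if state.message_count <= max_clarifications:
            return "clarify_the_inquiry"
        return "escalate_to_live_agent"
    return "find_information"
```

With these rules, “I need help” as a first message leads to a clarifying question, while the same text as a fifth message leads to an escalation.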

The key takeaway: LLMs introduce a new kind of intelligence, but you’ll still need to manage state and domain-specific logic to ensure your application is aware of its context. Clear visibility into the state of your application and your ability to reproduce it are vital parts of your observability strategy.

4. Claims Verification

A critical challenge with LLMs is ensuring the validity of the information they generate. Some refer to made-up answers as hallucinations. A hallucination is a statement fabricated by the LLM, usually because it sounds plausible in context rather than because it is supported by evidence.

A claims verification process provides confidence that a response is grounded, attributable and verified by approved evidence from known knowledge or API resources. A dedicated verification model should be used to provide a confidence score and handling should be put in place to align answers that fail verification. The verification process should use metrics such as the maximum, minimum, and average scores and attribute answers to one or many resources.

For example:

  • On Verified: Define actions to take when a claim is verified. This could involve attributing the answer to one or many articles or API responses and then delivering a response to the end user.
  • On Unverified: Set workflows for unverified claims, such as retrying a prompt pipeline, aligning a corrective response, or escalating the issue to a human agent.
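The flow above can be sketched as follows. This is an illustration under stated assumptions: `verify_claims`, the 0.8 threshold, and the rule that the weakest claim decides the outcome are all hypothetical, and `toy_scorer` is a word-overlap placeholder standing in for a dedicated verification model.

```python
from statistics import mean
from typing import Callable

# Assumed signature: (claim, evidence) -> confidence score in [0, 1].
Scorer = Callable[[str, str], float]


def verify_claims(claims: list[str], evidence: dict[str, str],
                  scorer: Scorer, threshold: float = 0.8) -> dict:
    """Score each claim against every source and attribute it to the best one."""
    if not claims or not evidence:
        raise ValueError("need at least one claim and one evidence source")
    results = []
    for claim in claims:
        scores = {src: scorer(claim, text) for src, text in evidence.items()}
        best = max(scores, key=scores.get)
        results.append({"claim": claim, "source": best, "score": scores[best]})
    best_scores = [r["score"] for r in results]
    return {
        "claims": results,
        "min": min(best_scores),
        "max": max(best_scores),
        "avg": mean(best_scores),
        # The response is verified only if its weakest claim clears the threshold.
        "verified": min(best_scores) >= threshold,
    }


def toy_scorer(claim: str, evidence: str) -> float:
    # Word-overlap placeholder; a real system would call a verification model.
    words = set(claim.lower().split())
    return len(words & set(evidence.lower().split())) / len(words)


evidence = {"returns-policy": "returns are accepted within 30 days"}
result = verify_claims(["returns are accepted within 30 days"], evidence, toy_scorer)
action = "deliver_response" if result["verified"] else "retry_or_escalate"
```

The `source` field on each claim is what makes attribution possible, and the `verified` flag selects between the On Verified and On Unverified branches.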

By integrating a claims verification model and process into your LLM application, you gain the ability to prevent hallucinations and attribute responses to known resources. This clear and traceable attribution will equip you with the information you need to field questions from stakeholders and provide insight into how you can improve your knowledge.

5. Regression Tests

After optimizing prompts, upgrading models, or introducing new knowledge, you’ll want to ensure that these changes don’t introduce new problems. Earlier, we talked about replaying events, and this replay capability should be the basis for creating your test cases. You should be able to save any event as a regression test. Your test sets should be runnable individually or in batch as part of a continuous integration pipeline.
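One way to picture saving an event as a regression test is shown below. The function names and event fields are hypothetical, and the exact-match comparison is a simplifying assumption: because LLM output varies between runs, a real test set would more likely compare classifications, actions, or verification results than raw text.

```python
import json
from typing import Callable


def save_as_test(event: dict) -> str:
    """Serialize the parts of an event needed to re-run it later."""
    return json.dumps({"id": event["id"], "input": event["input"],
                       "expected": event["output"]})


def run_test_set(cases: list[str], handler: Callable[[str], str]) -> list[str]:
    """Return the ids of failing cases; an empty list means the set passed."""
    failures = []
    for raw in cases:
        case = json.loads(raw)
        if handler(case["input"]) != case["expected"]:
            failures.append(case["id"])
    return failures


# Hypothetical event saved as a test, then run against a stand-in handler.
cases = [save_as_test({"id": "evt-7", "input": "hi", "output": "greeting"})]
failures = run_test_set(cases, lambda text: "greeting")
```

A continuous integration job could fail the build whenever `failures` is non-empty.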

The models are moving fast and your LLM application will be under constant pressure to get faster, smarter and cheaper. Test sets will give you the visibility and confidence you need to stay ahead of your competition.

Setting Performance Goals

While the above strategies are essential, it’s also important to evaluate how well your system is achieving its higher-level objectives. This is where performance goals come into play. Goals should be instrumented to track whether your application is successfully meeting the business objectives.

  • Goal Success: Measure how often your application achieves a defined objective, such as confirming an upcoming appointment, rendering an order status, or receiving positive user feedback.
  • Goal Failure: Track instances where the LLM fails to complete a task or requires human assistance.

Keep in mind that an event such as a live agent escalation could be considered success for one type of inquiry, and a failure in a different scenario. Goal instrumentation should provide a high degree of flexibility. By setting clear success and failure criteria for your application, you will be better positioned to evaluate its performance over time and identify areas for improvement.
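A sketch of flexible goal instrumentation, assuming finished events are available as plain dictionaries, might look like this. The goal names, the event fields, and the rule that escalation is a success only for cancellations are illustrative assumptions.

```python
from typing import Callable

# A goal is a named predicate over a finished event (a plain dict here).
Goal = Callable[[dict], bool]

goals: dict[str, Goal] = {
    # Escalation counts as success only for inquiry types meant for agents.
    "handled_correctly": lambda e: (
        e["escalated"] if e["inquiry_type"] == "cancellation" else not e["escalated"]
    ),
    "positive_feedback": lambda e: e.get("feedback") == "positive",
}


def goal_rates(events: list[dict]) -> dict[str, float]:
    """Fraction of events that satisfy each goal (assumes a non-empty list)."""
    return {name: sum(goal(e) for e in events) / len(events)
            for name, goal in goals.items()}


events = [
    {"inquiry_type": "cancellation", "escalated": True, "feedback": "positive"},
    {"inquiry_type": "order_status", "escalated": True},
]
rates = goal_rates(events)
```

Both events were escalated, but only the cancellation counts as a success for `handled_correctly`.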

Applying Segmentation to Hone In

Segmentation is a powerful tool for diving deeper into your LLM application’s performance. By grouping conversations or events based on specific criteria, such as inquiry type, user type, or product category, you can focus your analysis on areas that matter most to your application.

For instance, you may want to segment conversations to see if your application behaves differently on web versus mobile, or across sales versus service inquiries. You can also create more complex segments that filter interactions based on specific events, such as when an error occurred or when a specific topic category was in play. Segmentation allows you to tailor your observability efforts to the use cases and specific needs of your business.

Using Funnels for Conversion and Performance Insights

Funnels provide another layer of insight by showing how users progress through a series of steps within a customer journey or conversation. A funnel allows you to visualize drop-offs, identify where users disengage, and track how many complete the intended goal. For example, you can track the steps a customer takes when engaging with your LLM application, from initial inquiry to task completion, and analyze where drop-offs occur.

Funnels can be segmented just like other data, allowing you to drill down by platform, customer type, or interaction type. This helps you understand where improvements are needed and how adjustments to prompts or knowledge bases can enhance the overall experience.
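The combination of funnels and segments can be sketched in a few lines. The step names, the conversation fields, and the `funnel_counts` helper are assumptions made for illustration.

```python
from typing import Callable, Optional

FUNNEL_STEPS = ["inquiry", "order_located", "task_completed"]  # illustrative names


def funnel_counts(conversations: list[dict],
                  segment: Optional[Callable[[dict], bool]] = None) -> list[tuple[str, int]]:
    """Count conversations that reached each step, in order, within a segment."""
    selected = [c for c in conversations if segment is None or segment(c)]
    counts = []
    for i, step in enumerate(FUNNEL_STEPS):
        # A conversation reaches step i only if it also reached every earlier step.
        required = FUNNEL_STEPS[: i + 1]
        reached = sum(all(s in c["steps"] for s in required) for c in selected)
        counts.append((step, reached))
    return counts


conversations = [
    {"platform": "web", "steps": ["inquiry", "order_located", "task_completed"]},
    {"platform": "web", "steps": ["inquiry"]},
    {"platform": "mobile", "steps": ["inquiry", "order_located"]},
]
overall = funnel_counts(conversations)
web_only = funnel_counts(conversations, segment=lambda c: c["platform"] == "web")
```

Comparing `overall` with `web_only` shows where each segment drops off between steps.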

By combining segmentation with funnel analysis, you get a comprehensive view of your LLM’s effectiveness and can pinpoint specific areas for optimization.

A/B Testing for Continuous Improvement

A/B testing is a vital tool for systematically improving LLM application performance by comparing different versions of prompts, responses, or workflows. This method allows you to experiment with variations of the same interaction and measure which version produces better results. For instance, you can test two different prompts to see which one leads to more successful goal completions or fewer errors.

By running A/B tests, you can refine your prompt design, optimize the LLM’s decision-making logic, and improve overall user experience. The results of these tests give you data-backed insights, helping you implement changes with confidence that they’ll positively impact performance.
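As a rough sketch, an A/B test needs a stable way to assign conversations to variants and a way to compare outcomes. The hashing scheme and function names below are assumptions, and the comparison deliberately omits a statistical significance check, which a real experiment would need before drawing conclusions.

```python
import hashlib


def assign_variant(conversation_id: str, variants: tuple[str, ...] = ("A", "B")) -> str:
    """Deterministically map a conversation to a variant so it never switches mid-test."""
    digest = hashlib.sha256(conversation_id.encode("utf-8")).hexdigest()
    return variants[int(digest, 16) % len(variants)]


def success_rates(outcomes: list[tuple[str, bool]]) -> dict[str, float]:
    """outcomes holds (variant, goal_achieved) pairs; returns the success rate per variant."""
    totals: dict[str, list[int]] = {}
    for variant, succeeded in outcomes:
        wins_and_count = totals.setdefault(variant, [0, 0])
        wins_and_count[0] += int(succeeded)
        wins_and_count[1] += 1
    return {v: wins / count for v, (wins, count) in totals.items()}


rates = success_rates([("A", True), ("A", False), ("B", True)])
```

Hashing the conversation id, rather than choosing randomly per message, keeps every turn of a conversation on the same prompt version.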

Additionally, A/B testing can be combined with funnel analysis, allowing you to track how changes affect customer behavior at each step of the journey. This ensures that your optimizations not only improve specific interactions but also lead to better conversion rates and task completions overall.

Final Thoughts on LLM Observability

LLM observability is not just a technical necessity but a strategic advantage. Whether you’re dealing with prompt optimization, function call validation, or auditing sensitive interactions, observability helps you maintain control over the outputs of your LLM application. By leveraging tools such as event debug-replay, regression tests, segmentation, funnel analysis, A/B testing, and claims verification, you will build trust that you have a safe and effective LLM application.

Curious about how Quiq approaches LLM observability? Get in touch with us.

Everything You Need to Know About LLM Integration

It’s hard to imagine an application, website or workflow that wouldn’t benefit in some way from the new electricity that is generative AI. But what does it look like to integrate an LLM into an application? Is it just a matter of hitting a REST API with some basic auth credentials, or is there more to it than that?

In this article, we’ll enumerate the things you should consider when planning an LLM integration.

Why Integrate an LLM?

At first glance, it might not seem like LLMs make sense for your application—and maybe they don’t. After all, is the ability to write a compelling poem about a lost Highland Cow named Bo actually useful in your context? Or perhaps you’re not working on anything that remotely resembles a chatbot. Do LLMs still make sense?

The important thing to know about ‘Generative AI’ is that it’s not just about generating creative content like poems or chat responses. Generative AI (LLMs) can be used to solve a bevy of other problems that roughly fall into three categories:

  1. Making decisions (classification)
  2. Transforming data
  3. Extracting information

Let’s use the example of an inbound email from a customer to your business. How might we use LLMs to streamline that experience?

  • Making Decisions
    • Is this email relevant to the business?
    • Is this email low, medium or high priority?
    • Does this email contain inappropriate content?
    • What person or department should this email be routed to?
  • Transforming data
    • Summarize the email for human handoff or record keeping
    • Redact offensive language from the email subject and body
  • Extracting information
    • Extract information such as a phone number, business name, job title, etc. from the email body to be used by other systems
  • Generating Responses
    • Generate a personalized, contextually-aware auto-response informing the customer that help is on the way
    • Alternatively, deploy a more sophisticated LLM flow (likely involving RAG) to directly address the customer’s need
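To illustrate the decision-making category, the sketch below classifies email priority. The prompt wording and the `classify_priority` name are assumptions, and `llm` is any callable that sends a prompt to a model and returns its text, so no specific vendor API is implied.

```python
from typing import Callable

PRIORITIES = ("low", "medium", "high")


def classify_priority(email_body: str, llm: Callable[[str], str]) -> str:
    """Ask an LLM for a priority label and validate the answer."""
    prompt = (
        "Classify the priority of this customer email as low, medium, or high. "
        "Reply with one word.\n\n" + email_body
    )
    answer = llm(prompt).strip().lower()
    # Models can reply with unexpected text, so fall back to a safe default.
    return answer if answer in PRIORITIES else "medium"
```

Validating the answer against a fixed set of labels keeps downstream routing code from having to handle free-form text.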

It’s easy to see how solving these tasks would increase user satisfaction while also improving operational efficiency. All of these use cases are utilizing ‘Generative AI’, but some feel more generative than others.

When we consider decision making, data transformation and information extraction in addition to the more stereotypical generative AI use cases, it becomes harder to imagine a system that wouldn’t benefit from an LLM integration. Why? Because nearly all systems have some amount of human-generated ‘natural’ data (like text) that is no longer opaque in the age of LLMs.

Prior to LLMs, it was possible to solve most of the tasks listed above, but it was far harder. Let’s consider ‘is this email relevant to the business’. What would it have taken to solve this before LLMs?

  • A dataset of example emails labeled true if they’re relevant to the business and false if not (the bigger the better)
  • A training pipeline to produce a custom machine learning model for this task
  • Specialized hardware or cloud resources for training & inferencing
  • Data scientists, data curators, and Ops people to make it all happen

LLMs can solve many of these problems with radically lower effort and complexity, and they will often do a better job. With traditional machine learning models, your model is, at best, as good as the data you give it. With generative AI you can coach and refine the LLM’s behavior until it matches what you desire – regardless of historical data.

For these reasons LLMs are being deployed everywhere—and consumers’ expectations continue to rise.

How Do You Feel About LLM Vendor Lock-In?

Once you’ve decided to pursue an LLM integration, the first issue to consider is whether you’re comfortable with vendor lock-in. The LLM market is moving at lightspeed with the constant release of new models featuring new capabilities like function calls, multimodal prompting, and of course increased intelligence at higher speeds. Simultaneously, costs are plummeting. For this reason, it’s likely that your preferred LLM vendor today may not be your preferred vendor tomorrow.

Even at a fixed point in time, you may need more than a single LLM vendor.

In our recent experience, there are certain classification problems that Anthropic’s Claude does a better job of handling than comparable models from OpenAI. Similarly, we often prefer OpenAI models for truly generative tasks like generating responses. All of these LLM tasks might be in support of the same integration so you may want to look at the project not so much as integrating a single LLM or vendor, but rather a suite of tools.

If your use case is simple and low volume, a single vendor is probably fine. But if you plan to do anything moderately complex or high scale you should plan on integrating multiple LLM vendors to have access to the right models at the best price.

Resiliency & Scalability are Earned—Not Given

Making API calls to an LLM is trivial. Ensuring that your LLM integration is resilient and scalable requires more elbow grease. In fact, LLM API integrations pose unique challenges:

Challenges and solutions:

  • Challenge: They are pretty slow
    • Solution: If your application is high-scale and you’re doing synchronous (threaded) network calls, your application won’t scale very well since most threads will be blocked on LLM calls. Consider switching to async I/O.
    • Solution: You’ll also want to support running multiple prompts in parallel to reduce visible latency to the user.
  • Challenge: They are throttled by requests per minute and tokens per minute
    • Solution: Attempt to estimate your LLM usage in terms of requests and LLM tokens per minute and work with your provider(s) to ensure sufficient bandwidth for peak load.
  • Challenge: They are (still) kinda flakey (unpredictable response times, unresponsive connections)
    • Solution: Employ various retry schemes in response to timeouts, 500s, 429s (rate limit), etc.
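The async I/O and parallel prompt suggestions can be illustrated with Python’s standard `asyncio` module. In this sketch `fake_llm_call` is a placeholder that only sleeps; a real integration would await an async HTTP or SDK call instead.

```python
import asyncio


async def fake_llm_call(prompt: str, delay: float = 0.05) -> str:
    # Placeholder for a real async LLM client call.
    await asyncio.sleep(delay)
    return f"result for: {prompt}"


async def run_prompts(prompts: list[str]) -> list[str]:
    # gather() runs the calls concurrently and returns results in input order.
    return await asyncio.gather(*(fake_llm_call(p) for p in prompts))


results = asyncio.run(run_prompts(["sentiment", "profanity", "topic"]))
```

Because the three calls overlap, the total wait is close to the slowest single call rather than the sum of all three.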

The above remediations will help your application be scalable and resilient while your LLM service is up. But what if it’s down? If your LLM integration is on a critical execution path you’ll want to support automatic failover. Some LLMs are available from multiple providers:

  • OpenAI models are hosted by OpenAI itself as well as Azure
  • Anthropic models are hosted by Anthropic itself as well as AWS

Whether an LLM has one provider or several, you can also provision the same logical LLM in multiple cloud regions to serve as a failover resource. Typically you’ll want the provider failover to be built into your retry scheme. Our failover mechanisms get tripped regularly out in production at Quiq, no doubt partially because of how rapidly the AI world is moving.
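A retry scheme with built-in failover could be sketched like this. `RetryableError`, `call_with_failover`, and the backoff values are hypothetical; each provider is simply a callable, which might wrap a different vendor, host, or cloud region.

```python
import time
from typing import Callable


class RetryableError(Exception):
    """Stands in for timeouts, HTTP 429 (rate limit), and HTTP 5xx responses."""


def call_with_failover(prompt: str, providers: list[Callable[[str], str]],
                       attempts_per_provider: int = 2, base_delay: float = 0.5) -> str:
    """Retry each provider with exponential backoff, then fail over to the next."""
    last_error = None
    for provider in providers:
        for attempt in range(attempts_per_provider):
            try:
                return provider(prompt)
            except RetryableError as err:
                last_error = err
                # Wait longer after each failed attempt: base, 2x base, 4x base, ...
                time.sleep(base_delay * (2 ** attempt))
    raise RuntimeError("all providers failed") from last_error
```

Passing the primary provider first and a secondary second means the secondary is only used once the primary has exhausted its attempts.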

Are You Actually Building an Agentic Workflow?

Oftentimes you have a task that you know is well-suited for an LLM. For example, let’s say you’re planning to use an LLM to analyze the sentiment of product reviews. On the surface, this seems like a simple task that will require one LLM call that passes in the product review and asks the LLM to decide the sentiment. Will a single prompt suffice? What if we also want to determine if a given review contains profanity or personal information? What if we want to ask three LLMs and average their results?

Many tasks require multiple prompts, prompt chaining and possibly RAG (Retrieval Augmented Generation) to best solve a problem. Just like humans, AI produces better results when a problem is broken down into pieces. Such solutions are variously known as AI Agents, Agentic Workflows or Agent Networks and are why open source tools like LangChain were originally developed.

In our experience, pretty much every prompt eventually grows up to be an Agentic Workflow, which has interesting implications for how it’s configured & monitored.

Be Ready for the Snowball Effect

Introducing LLMs can result in a technological snowball effect, particularly if you need to use Retrieval Augmented Generation (RAG). LLMs are trained on mostly public data that was available at a fixed point in the past. If you want an LLM to behave in light of up-to-date and/or proprietary data sources (which most non-trivial applications do) you’ll need to do RAG.

RAG refers to retrieving the up-to-date and/or proprietary data you want the LLM to use in its decision making and passing it to the LLM as part of your prompt.

Assuming you need to search a reference dataset like a knowledge base, product catalog or product manual, the retrieval part of RAG typically entails adding the following entities to your system:

1. An embedding model

An embedding model is roughly half of an LLM – it does a great job of reading and understanding information you pass it but instead of generating a completion it produces a numeric vector that encodes its understanding of the source material.

You’ll typically run the embedding model on all of the business data you want to search and retrieve for the LLM. Most LLM providers also have embedding models, or you can hit one via any major cloud.

2. A vector database

Once you have embeddings for all of your business data, you need to store them somewhere that facilitates speedy search based on numeric vectors. Solutions like Pinecone and Milvus fill this need, but that means integrating a new vendor or hosting a new database internally.
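The retrieval step can be illustrated with a toy in-memory search. This is only a sketch: the two-dimensional vectors are hand-made stand-ins for real embeddings, the dictionary stands in for a vector database, and the function names and prompt wording are assumptions.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    # Cosine similarity: 1.0 for identical directions, 0.0 for orthogonal vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list[float], index: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the ids of the k stored vectors most similar to the query."""
    ranked = sorted(index, key=lambda doc_id: cosine(query_vec, index[doc_id]), reverse=True)
    return ranked[:k]


def build_prompt(question: str, passages: list[str]) -> str:
    # Pass the retrieved passages to the LLM as part of the prompt.
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


index = {"returns": [1.0, 0.0], "shipping": [0.0, 1.0], "warranty": [0.7, 0.7]}
matches = top_k([0.9, 0.1], index, k=2)
```

A dedicated vector database performs the same ranking, but with indexes that stay fast across millions of vectors.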

After implementing embeddings and a vector search solution, you can now retrieve information to include in the prompts you send to your LLM(s). But how can you trust that the LLM’s response is grounded in the information you provided and not something based on stale information or purely made up?

There are specialized deep learning models that exist solely for the purpose of ensuring that an LLM’s generative claims are grounded in facts you provide. This practice is variously referred to as hallucination detection, claim verification, natural language inference (NLI), etc. We believe NLI models are an essential part of a trustworthy RAG pipeline, but managed cloud solutions are scarce and you may need to host one yourself on GPU-enabled hardware.

Is a Black Box Sustainable?

If you bake your LLM integration directly into your app, you will effectively end up with a black box that can only be understood and improved by engineers. This could make sense if you have a decent size software shop and they’re the only folks likely to monitor or maintain the integration.

However, your best software engineers may not be your best (or most willing) prompt engineers, and you may wish to involve other personas like product and experience designers since an LLM’s output is often part of your application’s presentation layer & brand.

For these reasons, prompts will quickly need to move from code to configuration – no big deal. However, as an LLM integration matures it will likely become an Agentic Workflow involving:

  • More prompts, prompt parallelization & chaining
  • More prompt engineering
  • RAG and other orchestration

Moving these concerns into configuration is significantly more complex but necessary on larger projects. In addition, people will inevitably want to observe and understand the behavior of the integration to some degree.

For this reason it might make sense to embrace a visual framework for developing Agentic Workflows from the get-go. By doing so you open up the project to collaboration from non-engineers while promoting observability into the integration. If you don’t go this route be prepared to continually build out configurability and observability tools on the side.

Quiq’s AI Automations Take Care of LLM Integration Headaches For You

Hopefully we’ve given you a sense for what it takes to build an enterprise LLM integration. Now it’s time for the plug. The considerations outlined above are exactly why we built AI Studio and particularly our AI Automations product.

With AI Automations you can create a serverless API that handles all the complexities of a fully orchestrated AI flow, including support for multiple LLMs, chaining, RAG, resiliency, observability and more. With AI Automations your LLM integration can go back to being ‘just an API call with basic auth’.

Want to learn more? Dive into AI Studio or reach out to our team.


The Truth About APIs for AI: What You Need to Know

Large language models hold a lot of power to improve your customer experience and make your agents more effective, but they won’t do you much good if you don’t have a way to actually access them.

This is where application programming interfaces (APIs) come into play. If you want to leverage LLMs, you’ll either have to build one in-house, use an AI API deployment to interact with an external model, or go with a customer-centric AI for CX platform. The latter choice is most ideal because it offers a guided building environment that removes complexity while providing the tools you need for scalability, observability, hallucination prevention, and more.

From a cost and ease-of-use perspective this third option is almost always best, but there are many misconceptions that could potentially stand in the way of AI API adoption.

In fact, a stronger claim is warranted: to maximize AI API effectiveness, you need a platform to orchestrate between AI, your business logic, and the rest of your CX stack.

Otherwise, it’s useless.

This article aims to bridge the gap between what CX leaders might think is required to integrate a platform, and what’s actually involved. By the end, you’ll understand what APIs are, their role in personalization and scalability, and why they work best in the context of a customer-centric AI for CX platform.

How APIs Facilitate Access to AI Capabilities

Let’s start by defining an API. As the name suggests, APIs are essentially structured protocols that allow two systems (“applications”) to communicate with one another (“interface”). For instance, if you’re using a third-party CRM to track your contacts, you’ll probably update it through an API.

All the well-known foundation model providers (e.g., OpenAI and Anthropic) expose an API that allows you to use their service. For a practical example, let’s look at OpenAI’s documentation:

(Let’s take a second to understand what we’re looking at. Don’t worry – we’ll break it down for you. Understanding the basics will give you a sense for what your engineers will be doing.)

The top line points us to a URL where we can access OpenAI’s models, and the next three lines require us to pass in an API key (which is kind of like a password giving access to the platform), our organization ID (a unique designator for our particular company, not unlike a username), and a project ID (a way to refer to this specific project, useful if you’re working on a few different projects at once).
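The snippet being described is not reproduced in this text, so the following is only a rough Python sketch of the same pieces: a URL plus three headers carrying the API key, organization ID, and project ID. The header names and the models endpoint reflect our understanding of OpenAI’s API reference and should be checked against the current documentation; the helper name and all placeholder values are assumptions.

```python
import os


def openai_headers(api_key: str, organization_id: str, project_id: str) -> dict[str, str]:
    """Build the three headers described above."""
    return {
        "Authorization": f"Bearer {api_key}",     # the API key, sent as a bearer token
        "OpenAI-Organization": organization_id,   # identifies the company account
        "OpenAI-Project": project_id,             # identifies the specific project
    }


MODELS_URL = "https://api.openai.com/v1/models"  # assumed endpoint listing available models

# Placeholder values only; a real key should come from the environment, never from code.
headers = openai_headers(
    api_key=os.environ.get("OPENAI_API_KEY", "sk-placeholder"),
    organization_id="org-placeholder",
    project_id="proj-placeholder",
)
```

An HTTP client would then send a GET request to `MODELS_URL` with these headers attached.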

This is only one example, but you can reasonably assume that most protocols built according to AI API best practices will have a similar structure.

This alone isn’t enough to support most AI API use cases, but it illustrates the key takeaway of this section: APIs are attractive because they make it easy to access the capabilities of LLMs without needing to manage them on your own infrastructure, though they’re still best when used as part of a move to a customer-centric AI orchestration platform.

How Do APIs Facilitate Customer Support AI Assistants?

It’s good to understand what APIs are used for in AI assistants. It’s pretty straightforward—here’s the bulk of it:

  • Personalizing customer communications: One of the most exciting real-world benefits of AI is that it enables personalization at scale because you can integrate an LLM with trusted systems containing customer profiles, transaction data, etc., which can be incorporated into a model’s reply. So, for example, when a customer asks for shipping information, you’re not limited to generic responses like “your item will be shipped within 3 days of your order date.” Instead, you can take a more customer-centric approach and offer specific details, such as, “The order for your new couch was placed on Monday, and will be sent out on Wednesday. According to your location, we expect that it’ll arrive by Friday. Would you like to select a delivery window or upgrade to white glove service?”
  • Improving response quality: Generative AI is plagued by a tendency to fabricate information. With an AI API, work can be decomposed into smaller, concrete tasks before being passed to an LLM, which improves performance. You can also do other things to get better outputs, such as create bespoke modifications of the prompt that change the model’s tone, the length of its reply, etc.
  • Scalability and flexibility in deployment: A good customer-centric, AI-for-CX platform will offer volume-based pricing, meaning you can scale up or down as needed. If customer issues are coming in thick and fast (such as might occur during a new product release, or over a holiday), just keep passing them to the API while paying a bit more for the increased load; if things are quiet because it’s 2 a.m., the API just sits there, waiting to spring into action when required and costing you very little.
  • Analyzing customer feedback and sentiment: Incredible insights are waiting within your spreadsheets and databases, if you only know how to find them. This, too, is something APIs help with. If, for example, you need to unify measurements across your organization to send them to a VOC (voice of customer) platform, you can do that with an API.

Looking Beyond an API for AI Assistants

For all this, it’s worth pointing out that there are still many real-world AI API challenges. By far the quickest way to begin building an AI assistant for CX is to pair with a customer-centric AI platform that removes as much of the difficulty as possible.

The best such platforms not only allow you to utilize a bevy of underlying LLM models, they also facilitate gathering and analyzing data, monitoring and supporting your agents, and automating substantial parts of your workflow.

Crucially, almost all of those critical tasks are facilitated through APIs, but they can be united in a good platform.

3 Common Misconceptions about Customer-Centric AI for CX Platforms

Now, let’s address some of the biggest myths surrounding the use of AI orchestration platforms.

Myth 1: Working with a customer-centric AI for CX Platform Will be a Hassle

Some CX leaders may worry that working with a platform will be too difficult. There are challenges, to be sure, but a well-designed platform with an intuitive user interface is easy to slip into a broader engineering project.

Such platforms are designed to support easy integration with existing systems, and they generally have ample documentation available to make this task as straightforward as possible.

Myth 2: AI Platforms Cost Too Much

Another concern CX leaders have is the cost of using an AI orchestration platform. Platform costs can add up over time, but this pales in comparison to the cost of building in-house solutions. Not to mention the potential costs associated with the risks that come with building AI in an environment that doesn’t protect you from things like hallucinations.

When you weigh all the factors impacting your decision to use AI in your contact center, the long-run return on using an AI orchestration platform is almost always better.

Myth 3: Customer-Centric AI Platforms are Just Too Insecure

The smart CX leader always has one eye on the overall security of their enterprise, so they may be worried about vulnerabilities introduced by using an AI platform.

This is a perfectly reasonable concern. If you’re trying to choose between a few different providers, it’s worth investigating the security measures they’ve implemented. Specifically, you want to figure out what data encryption and protection protocols they use, and how they think about compliance with industry standards and regulations.

At a minimum, the provider should be taking basic steps to make sure data transmitted to the platform isn’t exposed.

Is an AI Platform Right for Me?

With a platform focused on optimizing CX outcomes, you can quickly bring the awesome power and flexibility of generative AI into your contact center – without ever spinning up a server or fretting over what “backpropagation” means. To the best of our knowledge, this is the cheapest and fastest way to demo this API technology in your workflow to determine whether it warrants a deeper investment.

To parse out more generative AI facts from fiction, download our e-book on AI misconceptions and how to overcome them. If you’re concerned about hallucinations, data privacy, and similar issues, you won’t find a better one-stop read!

Request A Demo

What is an AI Assistant for Retail?

Over the past few months, we’ve had a lot to say about artificial intelligence, its new frontiers, and the ways in which it is changing the customer service industry.

A natural extension of this analysis is looking at the use of AI in retail. That is our mission today. We’ll look at how techniques like natural language processing and computer vision will impact retail, along with some of the benefits and challenges of this approach.

Let’s get going!

How is AI Used in Retail?

AI is poised to change retail, as it is changing many other industries. In the sections that follow, we’ll talk through three primary AI technologies that are driving these changes, namely natural language processing, computer vision, and machine learning more broadly.

Natural Language Processing

Natural language processing (NLP) refers to a branch of machine learning that attempts to work with spoken or written language algorithmically. Together with computer vision, it is one of the best-researched and most successful attempts to advance AI since the field was founded some seven decades ago.

Of course, these days the main NLP applications everyone has heard of are large language models like ChatGPT. This is not the only way AI assistants will change retail, but it is a big one, so that’s where we’ll start.

An obvious place to use LLMs in retail is with chatbots. There’s a lot of customer interaction that involves very specific questions that need to be handled by a human customer service agent, but a lot of it is fairly banal, consisting of things like “How do I return this item” or “Can you help me unlock my account.” For these sorts of issues, today’s chatbots are already powerful enough to help in most situations.

A related use case for AI in retail is asking questions about specific items. A customer might want to know what fabric an article of clothing is made out of or how it should be cleaned, for example. An out-of-the-box model like ChatGPT won’t be able to help much, but if you use a service like Quiq’s conversational CX platform, it’s possible to finetune an LLM on your specific documentation. Such a model will be able to help customers find the answers they need.

These use cases are all centered around text-based interactions, but algorithms are getting better and better at both speech recognition and speech synthesis. You’ve no doubt had the distinct (dis)pleasure of interacting with an automated system that sounded very artificial and that lacked the flexibility to actually help you very much; but someday soon, you may not be able to tell from a short conversation whether you’re talking to a human or a machine.

This may cause a certain amount of concern over technological unemployment. If chatbots and similar AI assistants are doing all this, what will be left for flesh-and-blood human workers? Frankly, it’s too early to say, but the evidence so far suggests that not only is AI not making us obsolete, it’s actually making workers more productive and less prone to burnout.

Computer Vision

Computer vision is the other major triumph of machine learning. CV algorithms have been created that can recognize faces, recognize thousands of different types of objects, and even help with steering autonomous vehicles.

How does any of this help with retail?

We already hinted at one use case in the previous paragraph, i.e. automatically identifying different items. This has major implications for inventory management, but when paired with technologies like virtual reality and augmented reality, it could completely transform the ways in which people shop.

Many platforms already offer the ability to see furniture and similar items in a customer’s actual living space, and there are efforts underway to build tools that automatically size customers so they know exactly which clothes to try on.

CV is also making it easier to gather and analyze different metrics crucial to a retail enterprise’s success. Algorithms can watch customer foot traffic to identify potential hotspots, meaning that these businesses can figure out which items to offer more of and which to cut altogether.

Machine Learning

As we stated earlier, both natural language processing and computer vision are types of machine learning. We gave them their own sections because they’re so big and important, but they’re not the only ways in which machine learning will impact retail.

Another way is with increasingly personalized recommendations. If you’ve ever taken the advice of Netflix or Spotify as to what entertainment you should consume next, then you’ve already made contact with a recommendation engine. But with more data and smarter algorithms, personalization will become much more, well, personalized.

In concrete terms, this means it will become easier and easier to analyze a customer’s past buying history to offer them tailor-made solutions to their problems. Retail is all about consumer satisfaction, so this is poised to be a major development.
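As a rough illustration of how past buying history can drive a suggestion, the sketch below counts which items were bought together and recommends the most frequent companions. The order data, item names, and the `recommend` function are all hypothetical, and real recommendation engines draw on far richer signals than co-purchase counts.

```python
from collections import Counter
from itertools import permutations

# Hypothetical purchase histories: one set of items per past order.
ORDERS = [
    {"concert t-shirt", "hat"},
    {"concert t-shirt", "hat", "poster"},
    {"concert t-shirt", "poster"},
    {"hat", "sunglasses"},
]

def build_co_purchase_counts(orders):
    """Count how often each ordered pair of items appears in the same order."""
    counts = {}
    for order in orders:
        for item, other in permutations(order, 2):
            counts.setdefault(item, Counter())[other] += 1
    return counts

def recommend(basket, counts, k=2):
    """Suggest up to k items most often bought alongside the basket's items."""
    scores = Counter()
    for item in basket:
        for other, n in counts.get(item, {}).items():
            if other not in basket:
                scores[other] += n
    # Sort by count (descending), then by name so ties are deterministic.
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in ranked[:k]]

counts = build_co_purchase_counts(ORDERS)
print(recommend({"concert t-shirt"}, counts))
```

Counting co-purchases is about the simplest approach that works; it keeps the logic easy to inspect, at the cost of ignoring things like recency, price, or browsing behavior.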

Machine learning has long been used for inventory management, demand forecasting, etc., and the role it plays in these efforts will only grow with time. Having more data will mean being able to make more fine-grained predictions. You’ll be able to start printing Taylor Swift t-shirts and setting up targeted ads as soon as people in your area begin buying tickets to her show next month, for example.

Where are AI Assistants Used in Retail?

So far, we’ve spoken in broad terms about the ways in which AI assistants will be used in retail. In these sections, we’ll get more specific and discuss some of the particular locations where these assistants can be deployed.

In Kiosks

Many retail establishments already have kiosks in place that let you swap change for dollars or skip the trip to the DMV. With AI, these will become far more adaptable and useful, able to help customers with a greater variety of transactions.

In Retail Apps

Mobile applications are an obvious place to use recommendations or LLM-based chatbots to help make a sale or get customers what they need.

In Smart Speakers

You’ve probably heard of Alexa-powered smart speakers, which can play music for you or automate certain household tasks. It isn’t hard to imagine their use in retail, especially as they get better. They’ll be able to help customers choose clothing, handle returns, or do any of a number of related tasks.

In Smart Mirrors

For more or less the same reason, AI-powered smart mirrors could have a major impact on retail. As computer vision improves it’ll be better able to suggest clothing that looks good on different heights and builds, for example.

What are the Benefits of Using AI in Retail?

The main reason that AI is being used more frequently in retail is that there are so many advantages to this approach. In the next few sections, we’ll talk about some of the specific benefits retail establishments can expect to enjoy from their use of AI.

Better Customer Experience and Engagement

These days, there are tons of ways to get access to the goods and services you need. What tends to separate one retail establishment from another is customer experience and customer engagement. AI can help with both.

We’ve already mentioned how much more personalized AI can make the customer experience, but you might also consider the impact of round-the-clock availability that AI makes possible.

Customer service agents will need to eat and sleep sometimes, but AI never will, which means that it’ll always be available to help a customer solve their problems.

More Selling Opportunities

Cross-selling and upselling are both terms that are probably familiar to you, and they represent substantial opportunities for retail outfits to boost their revenue.

With personalized recommendations, sentiment analysis, and similar machine-learning techniques, it will become much faster and easier to identify additional items that a customer might be interested in.

If a customer has already bought Taylor Swift tickets and a t-shirt, for example, perhaps they’d also like a fetching hat that goes along with their outfit. And if you’ve installed the smart mirrors we talked about earlier, AI will even be able to help them find the right size.

Leaner, More Efficient Operations

Inventory management is a never-ending concern in retail. It’s also one place where algorithmic solutions have been used for a long time. We think this trend will only continue, with operations becoming leaner and more responsive to changing market conditions.

All of this ultimately hinges on the use of AI. Better algorithms and more comprehensive data will make it possible to predict what people will want and when, meaning you don’t have to sit on inventory you don’t need and are less likely to run out of anything that’s selling well.

What are the Challenges of Using AI in Retail?

That being said, there are many challenges to using Artificial Intelligence in retail. We’ll cover a few of these now so you can decide how much effort you want to put into using AI.

AI Can Still Be Difficult to Use

To be sure, firing up ChatGPT and asking it to recommend an outfit for a concert doesn’t take very long. But this is a far cry from implementing a full-bore AI solution into your website or mobile applications. Serious technical expertise is required to train, finetune, deploy, and monitor advanced AI, whether that’s an LLM, a computer-vision system, or anything else, and you’ll need to decide whether you think you’ll get enough return to justify the investment.

Expense

And speaking of investment, it remains pretty expensive to utilize AI at any non-trivial scale. If you decide you want to hire an in-house engineering team to build a bespoke model, you’ll have to have a substantial budget to pay for the training and the engineers’ salaries. These salaries are still something you’ll have to account for even if you choose to build on top of an existing solution, because finetuning a model is far from easy.

One solution is to utilize an offering like Quiq. We have already created the custom infrastructure required to utilize AI in a retail setting, meaning you wouldn’t need a serious engineering force to get going with AI.

Bias, Abuse, and Toxicity

A perennial concern with using AI is that a model will generate output that is insulting, harmful, or biased in some way. For obvious reasons this is bad for retail establishments, so you’ll want to make sure that you both carefully finetune this behavior out of your models and continually monitor them in case their behavior changes in the future. Quiq also eliminates this risk.

AI and the Future of Retail

Artificial intelligence has long been expected to change many aspects of our lives, and in the past few years, it has begun delivering on that promise. From ultra-precise recommendations to full-fledged chatbots that help resolve complex issues, retail stands to benefit greatly from this ongoing revolution.

If you want to get in on the action but don’t know where to start, set up a time to check out the Quiq platform. We make it easy to utilize both customer-facing and agent-facing solutions, so you can build an AI-positive business without worrying about the engineering.


What are the Biggest Questions About AI?

The term “artificial intelligence” was coined at the famous Dartmouth Conference in 1956, put on by luminaries like John McCarthy, Marvin Minsky, and Claude Shannon, among others.

These organizers wanted to create machines that “use language, form abstractions and concepts, solve kinds of problems now reserved for humans, and improve themselves.” They went on to claim that “…a significant advance can be made in one or more of these problems if a carefully selected group of scientists work on it together for a summer.”

Nearly seven decades later, it’s fair to say that this has not come to pass; brilliant as they were, it would seem as though McCarthy et al. underestimated how difficult it would be to scale the heights of the human intellect.

Nevertheless, remarkable advances have been made over the past decade, so much so that they’ve ignited a firestorm of controversy around this technology. People are questioning the ways in which it can be used negatively, and whether it might ultimately pose an extinction risk to humanity; they’re probing fundamental issues around whether machines can be conscious, exercise free will, and think in the way a living organism does; they’re rethinking the basis of intelligence, concept formation, and what it means to be human.

These are deep waters to be sure, and we’re not going to swim them all today. But as contact center managers and others begin the process of thinking about using AI, it’s worth being at least aware of what this broader conversation is about. It will likely come up in meetings, in the press, or in Slack channels in exchanges between employees.

And that’s the subject of our piece today. We’re going to start by asking what artificial intelligence is and how it’s being used, before turning to address some of the concerns about its long-term potential. Our goal is not to answer all these concerns, but to make you aware of what people are thinking and saying.

What is Artificial Intelligence?

Artificial intelligence is famous for having had many, many definitions. There are those, for example, who believe that in order to be intelligent computers must think like humans, and those who reply that we didn’t make airplanes by designing them to fly like birds.

For our part, we prefer to sidestep the question somewhat by utilizing the approach taken in one of the leading textbooks in the field, Stuart Russell and Peter Norvig’s “Artificial Intelligence: A Modern Approach”.

They propose a multi-part system for thinking about different approaches to AI. One set of approaches is human-centric and focuses on designing machines that either think like humans – i.e., engage in analogous cognitive and perceptual processes – or act like humans – i.e. by behaving in a way that’s indistinguishable from a human, regardless of what’s happening under the hood (think: the Turing Test).

The other set of approaches is ideal-centric and focuses on designing machines that either think in a totally rational way – conformant with the rules of Bayesian epistemology, for example – or behave in a totally rational way – utilizing logic and probability, but also acting instinctively to remove itself from danger, without going through any lengthy calculations.

What we have here, in other words, is a framework. Using the framework not only gives us a way to think about almost every AI project in existence, it also saves us from needing to spend all weekend coming up with a clever new definition of AI.

Joking aside, we think this is a productive lens through which to view the whole debate, and we offer it here for your information.

What is Artificial Intelligence Good For?

Given all the hype around ChatGPT, this might seem like a quaint question. But not that long ago, many people were asking it in earnest. The basic insights upon which large language models like ChatGPT are built go back to the 1960s, but it wasn’t until 1) vast quantities of data became available, and 2) compute cycles became extremely cheap that much of their potential was realized.

Today, large language models are changing (or poised to change) many different fields. Our audience is focused on contact centers, so that’s what we’ll focus on as well.

There are a number of ways that generative AI is changing contact centers. Because of its remarkable abilities with natural language, it’s able to dramatically speed up agents in their work by answering questions and formatting replies. These same abilities allow it to handle other important tasks, like summarizing articles and documentation and parsing the sentiment in customer messages to enable semi-automated prioritization of their requests.

Though we’re still in the early days, the evidence so far suggests that platforms built on large language models, like Quiq’s conversational CX platform, will do a lot to increase the efficiency of contact center agents.

Will AI be Dangerous?

One thing that’s burst into public imagination recently has been the debate around the risks of artificial intelligence, which fall into two broad categories.

The first category is what we’ll call “social and political risks”. These are the risks that large language models will make it dramatically easier to manufacture propaganda at scale, and perhaps tailor it to specific audiences or even individuals. When combined with the astonishing progress in deepfakes, it’s not hard to see how there could be real issues in the future. Most people (including us) are poorly equipped to figure out when a video is fake, and if the underlying technology gets much better, there may come a day when it’s simply not possible to tell.

Political operatives are already quite skilled at cherry-picking quotes and stitching together soundbites into a damning portrait of a candidate – imagine what’ll be possible when they don’t even need to bother.

But the bigger (and more speculative) danger is around really advanced artificial intelligence. Because this case is harder to understand, it’s what we’ll spend the rest of this section on.

Artificial Superintelligence and Existential Risk

As we understand it, the basic case for existential risk from artificial intelligence goes something like this:

“Someday soon, humanity will build or grow an artificial general intelligence (AGI). It’s going to want things, which means that it’ll be steering the world in the direction of achieving its ambitions. Because it’s smart, it’ll do this quite well, and because it’s a very alien sort of mind, it’ll be making moves that are hard for us to predict or understand. Unless we solve some major technological problems around how to design reward structures and goal architectures in advanced agentive systems, what it wants will almost certainly conflict in subtle ways with what we want. If all this happens, we’ll find ourselves in conflict with an opponent unlike any we’ve faced in the history of our species, and it’s not at all clear we’ll prevail.”

This is heady stuff, so let’s unpack it bit by bit. The opening sentence, “…humanity will build or grow an artificial general intelligence”, was chosen carefully. If you understand how LLMs and deep learning systems are trained, the process is more akin to growing an enormous structure than it is to building one.

This has a few implications. First, their internal workings remain almost completely inscrutable. Though researchers in fields like mechanistic interpretability are going a long way toward unpacking how neural networks function, the truth is, we’ve still got a long way to go.

What this means is that we’ve built one of the most powerful artifacts in the history of Earth, and no one is really sure how it works.

Another implication is that no one has any good theoretical or empirical reason to bound the capabilities and behavior of future systems. The leap from GPT-2 to GPT-3.5 was astonishing, as was the leap from GPT-3.5 to GPT-4. The basic approach so far has been to throw more data and more compute at the training algorithms; it’s possible that this paradigm will begin to level off soon, but it’s also possible that it won’t. If the gap between GPT-4 and GPT-5 is as big as the gap between GPT-3.5 and GPT-4, and if the gap between GPT-5 and GPT-6 is just as big, it’s not hard to see that the consequences could be staggering.

As things stand, it’s anyone’s guess how this will play out. But that’s not necessarily a comforting thought.

Next, let’s talk about pointing a system at a task. Does ChatGPT want anything? The short answer is: as far as we can tell, it doesn’t. ChatGPT isn’t an agent in the sense of trying to achieve something in the world, but work on agentive systems is ongoing. Remember that 10 years ago most neural networks were basically toys, and today we have ChatGPT. If breakthroughs in agency follow a similar pace (and they very well may not), then we could have systems able to pursue open-ended courses of action in the real world in relatively short order.

Another sobering possibility is that this capacity will simply emerge from the training of huge deep learning systems. This is, after all, the way human agency emerged in the first place. Through the relentless grind of natural selection, our ancestors went from chipping flint arrowheads to industrialization, quantum computing, and synthetic biology.

To be clear, this is far from a foregone conclusion, as the algorithms used to train large language models are quite different from natural selection. Still, we want to relay this line of argumentation, because it comes up a lot in these discussions.

Finally, we’ll address one more important claim, “…what it wants will almost certainly conflict in subtle ways with what we want.” Why think this is true? Aren’t these systems that we design and, if so, can’t we just tell them what we want them to go after?

Unfortunately, it’s not so simple. Whether you’re talking about reinforcement learning or something more exotic like evolutionary programming, the simple fact is that our algorithms often find remarkable mechanisms by which to maximize their reward in ways we didn’t intend.

There are thousands of examples of this (ask any reinforcement-learning engineer you know), but a famous one comes from the classic CoastRunners video game. The engineers who built the system tried to set up the algorithm’s rewards so that it would try to race a boat as well as it could. What it actually did, however, was maximize its reward by spinning in a circle to hit a set of green blocks over and over again.
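To see why a reward-maximizing algorithm might prefer spinning in circles, consider a toy calculation. It assumes the reward is paid for hitting targets rather than for finishing the race, and every number below is invented for illustration rather than taken from the actual game.

```python
# Toy numbers, assumed for illustration only (not taken from the real game).
EPISODE_STEPS = 100        # length of one episode
POINTS_PER_TARGET = 10     # reward for hitting one target
TARGETS_ON_COURSE = 5      # targets a boat passes while finishing the race
STEPS_PER_LOOP = 4         # steps needed to circle back to a respawning target

def reward_for_finishing():
    """Race to the finish line, hitting each target on the course once."""
    return TARGETS_ON_COURSE * POINTS_PER_TARGET

def reward_for_looping():
    """Never finish; circle one respawning target for the whole episode."""
    hits = EPISODE_STEPS // STEPS_PER_LOOP
    return hits * POINTS_PER_TARGET

print("finish the race:", reward_for_finishing())
print("spin in circles:", reward_for_looping())
```

Under these assumptions the looping strategy earns several times the reward of actually racing, so an algorithm that only sees the reward signal has every reason to choose it.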


Now, this may seem almost silly – do we really have anything to fear from an algorithm too stupid to understand the concept of a “race”?

But this would be missing the thrust of the argument. If you had access to a superintelligent AI and asked it to maximize human happiness, what happened next would depend almost entirely on what it understood “happiness” to mean.

If it were properly designed, it would work in tandem with us to usher in a utopia. But if it understood it to mean “maximize the number of smiles”, it would be incentivized to start paying people to get plastic surgery to fix their faces into permanent smiles (or something similarly unintuitive).

Does AI Pose an Existential Risk?

Above, we’ve briefly outlined the case that sufficiently advanced AI could pose a serious risk to humanity by being powerful, unpredictable, and prone to pursuing goals that weren’t-quite-what-we-meant.

So, does this hold water? Honestly, it’s too early to tell. The argument has hundreds of moving parts, some well-established and others much more speculative. Our purpose here isn’t to come down on one side of this debate or the other, but to let you know (in broad strokes) what people are saying.

At any rate, we are confident that the current version of ChatGPT doesn’t pose any existential risks. On the contrary, it could end up being one of the greatest advancements in productivity ever seen in contact centers. And that’s what we’d like to discuss in the next section.

Will AI Take All the Jobs?

The concern that someday a new technology will render human labor obsolete is hardly new. It was heard when mechanized weaving machines were created, when computers emerged, when the internet emerged, and when ChatGPT came onto the scene.

We’re not economists and we’re not qualified to take a definitive stand, but early evidence suggests that large language models are not only not resulting in layoffs, they’re making agents much more productive.

Erik Brynjolfsson, Danielle Li, and Lindsey R. Raymond, three economists at Stanford and MIT, looked at the ways in which generative AI was being used in a large contact center. They found that it was actually doing a good job of internalizing the ways in which senior agents were doing their jobs, which allowed more junior agents to climb the learning curve more quickly and perform at a much higher level. This had the knock-on effect of making them feel less stressed about their work, thus reducing turnover.

Now, this doesn’t rule out the possibility that GPT-10 will be the big job killer. But so far, large language models are shaping up to be like every prior technological advance, i.e., increasing employment rather than reducing it.

What is the Future of AI?

The rise of AI is raising stock valuations, raising deep philosophical questions, and raising expectations and fears about the future. We don’t know for sure how all this will play out, but we do know contact centers, and we know that they stand to benefit greatly from the current iteration of large language models.

These tools are helping agents answer more queries per hour, do so more thoroughly, and make for a better customer experience in the process.

If you want to get in on the action, set up a demo of our technology today.


What is Sentiment Analysis? – Ultimate Guide

A person only reaches out to a contact center when they’re having an issue. They can’t get a product to work the way they need it to, for example, or they’ve been locked out of their account.

The chances are high that they’re frustrated, angry, or otherwise in an emotionally-fraught state, and this is something contact center agents must understand and contend with.

The term “sentiment analysis” refers to the field of machine learning which focuses on developing algorithmic ways of detecting emotions in natural-language text, such as the messages exchanged between a customer and a contact center agent.

Making it easier to detect, classify, and prioritize messages on the basis of their sentiment is just one of many ways that technology is revolutionizing contact centers, and it’s the subject we’ll be addressing today.

Let’s get started!

What is Sentiment Analysis?

Sentiment analysis involves using various approaches to natural language processing to identify the overall “sentiment” of a piece of text.

Take these three examples:

  1. “This restaurant is amazing. The wait staff were friendly, the food was top-notch, and we had a magnificent view of the famous New York skyline. Highly recommended.”
  2. “Root canals are never fun, but it certainly doesn’t help when you have to deal with a dentist as unprofessional and rude as Dr. Thomas.”
  3. “Toronto’s forecast for today is a high of 75 and a low of 61 degrees.”

Humans excel at detecting emotions, and it’s probably not hard for you to see that the first example is positive, the second is negative, and the third is neutral (depending on how you like your weather).

There’s a greater challenge, however, in getting machines to make accurate classifications of this kind of data. How exactly that’s accomplished is the subject of the next section, but before we get to that, let’s talk about a few flavors of sentiment analysis.

What Types of Sentiment Analysis Are There?

It’s worth understanding the different approaches to sentiment analysis if you’re considering using it in your contact center.

Above, we provided an example of positive, negative, and neutral text. What we’re doing there is detecting the polarity of the text, and as you may have guessed, it’s possible to make much more fine-grained delineations of textual data.

Rather than simply detecting whether text is positive or negative, for example, we might instead use these categories: very positive, positive, neutral, negative, and very negative.

This would give us a better understanding of the message we’re looking at, and how it should be handled.
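One way to picture the five-category scheme is as a set of cutoffs on a numeric sentiment score. The sketch below assumes a score between -1.0 and 1.0 has already been produced by some model; the `polarity_label` function and its thresholds are hypothetical and would need tuning on real data.

```python
def polarity_label(score):
    """Map a sentiment score in [-1.0, 1.0] to one of five polarity labels."""
    if not -1.0 <= score <= 1.0:
        raise ValueError("score must be between -1.0 and 1.0")
    if score >= 0.6:
        return "very positive"
    if score >= 0.2:
        return "positive"
    if score > -0.2:
        return "neutral"
    if score > -0.6:
        return "negative"
    return "very negative"

for score in (0.9, 0.3, 0.0, -0.4, -0.8):
    print(score, "->", polarity_label(score))
```

In a contact center, labels like these could feed a simple routing rule, for instance sending "very negative" messages to the front of the queue.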

Instead of classifying text by its polarity, we might also use sentiment analysis to detect the emotions being communicated – rather than classifying a sentence as being “positive” or “negative”, in other words, we’d identify emotions like “anger” or “joy” contained in our textual data.

This is called “emotion detection” (appropriately enough), and it can be handled with long short-term memory (LSTM) or convolutional neural network (CNN) models.

Another, more granular approach to sentiment analysis is known as aspect-based sentiment analysis. It involves two basic steps: identifying “aspects” of a piece of text, then identifying the sentiment attached to each aspect.

Take the sentence “I love the zoo, but I hate the lines and the monkeys make fun of me.” It’s hard to assign an overall sentiment to the sentence – it’s generally positive, but there’s kind of a lot going on.

If we break out the “zoo”, “lines”, and “monkeys” aspects, however, we can see that positive sentiment is attached to the zoo, and negative sentiment is attached to the lines and the abusive monkeys.
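The two steps of aspect-based analysis can be sketched in a few lines. This is a deliberately crude illustration: the aspect list, the phrase lexicon, and the clause-splitting heuristic are all assumptions made for this one sentence, whereas real systems learn aspects and sentiment from data.

```python
import re

# Hypothetical aspect list and phrase lexicon, just large enough for the example.
ASPECTS = ("zoo", "lines", "monkeys")
LEXICON = {"love": 1, "hate": -1, "make fun of": -1}

def split_clauses(sentence):
    """Split on commas and the conjunctions 'but' / 'and' (a crude heuristic)."""
    parts = re.split(r",|\bbut\b|\band\b", sentence.lower())
    return [part.strip(" .") for part in parts if part.strip(" .")]

def clause_score(clause):
    """Sum the scores of every lexicon phrase found in the clause."""
    return sum(
        score
        for phrase, score in LEXICON.items()
        if re.search(r"\b" + re.escape(phrase) + r"\b", clause)
    )

def aspect_sentiment(sentence):
    """Attach each clause's sentiment label to the aspects that clause mentions."""
    results = {}
    for clause in split_clauses(sentence):
        score = clause_score(clause)
        label = "positive" if score > 0 else "negative" if score < 0 else "neutral"
        for aspect in ASPECTS:
            if re.search(r"\b" + re.escape(aspect) + r"\b", clause):
                results[aspect] = label
    return results

print(aspect_sentiment(
    "I love the zoo, but I hate the lines and the monkeys make fun of me."
))
```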

Why is Sentiment Analysis Important?

It’s easy to see how aspect-based sentiment analysis would inform marketing efforts. With a good enough model, you’d be able to see precisely which parts of your offering your clients appreciate, and which parts they don’t. This would give you valuable information in crafting a strategy going forward.

This is true of sentiment analysis more broadly, and of emotion detection too. You need to know what people are thinking, saying, and feeling about you and your company if you’re going to meet their needs well enough to make a profit.

Once upon a time, the only way to get these data was with focus groups and surveys. Those are still utilized, of course. But in the social media era, people are also not shy about sharing their opinions online, in forums, and similar outlets.

These oceans of words form an invaluable resource if you know how to mine them. When done correctly, sentiment analysis offers just the right set of tools for doing this at scale.

Challenges with Sentiment Analysis

Sentiment analysis confers many advantages, but it is not without its challenges. Most of these issues boil down to handling subtleties or ambiguities in language.

Consider a sentence like “This is a remarkable product, but still not worth it at that price.” Calling a product “remarkable” is a glowing endorsement, tempered somewhat by the claim that its price is set too high. Most basic sentiment classifiers would probably call this “positive”, but as you can see, there are important nuances.

Another issue is sarcasm.

Suppose we showed you a sentence like “This movie was just great, I loved spending three hours of my Sunday afternoon following a story that could’ve been told in twenty minutes.”

A sentiment analysis algorithm is likely going to pick up on “great” and “loved” when calling this sentence positive.

But, as humans, we know that these are backhanded compliments meant to communicate precisely the opposite message.

Machine-learning systems will also tend to struggle with idioms that we all find easy to parse, such as “Setting up my home security system was a piece of cake.” This is positive because “piece of cake” means something like “couldn’t have been easier”, but an algorithm may or may not pick up on that.

Finally, we’ll mention the fact that much of the text in product reviews will contain useful information that doesn’t fit easily into a “sentiment” bucket. Take a sentence like “The new iPhone is smaller than the new Android.” This is just a bare statement of physical facts, and whether it counts as positive or negative depends a lot on what a given customer is looking for.

There are various ways of trying to ameliorate these issues, most of which are outside the scope of this article. For now, we’ll just note that sentiment analysis needs to be approached carefully if you want to glean an accurate picture of how people feel about your offering from their textual reviews. So long as you’re diligent about inspecting the data you show the system and are cautious in how you interpret the results, you’ll probably be fine.


How Does Sentiment Analysis Work?

Now that we’ve laid out a definition of sentiment analysis, talked through a few examples, and made it clear why it’s so important, let’s discuss the nuts and bolts of how it works.

Sentiment analysis begins where all data science and machine learning projects begin: with data. Because sentiment analysis is based on textual data, you’ll need to utilize various techniques for preprocessing NLP data. Specifically, you’ll need to:

  • Tokenize the data by breaking sentences up into individual units an algorithm can process;
  • Use either stemming or lemmatization to turn words into their root form, e.g. by turning “ran” into “run”;
  • Filter out stop words like “the” or “as”, because they don’t add much to the text data.
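
The three preprocessing steps above can be sketched in a few lines of Python. This is only a minimal illustration using the standard library: the stop-word set and the tiny lemma table are made-up stand-ins for the much larger resources a dedicated NLP library would provide, and the function names are our own.

```python
import re

# Hypothetical, deliberately tiny resources for illustration only.
STOP_WORDS = {"the", "as", "a", "an", "and", "to", "of", "is", "was"}
LEMMAS = {"ran": "run", "running": "run", "loved": "love", "movies": "movie"}

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def lemmatize(token):
    """Map a token to its root form if we know one, otherwise keep it."""
    return LEMMAS.get(token, token)

def preprocess(text):
    """Tokenize, lemmatize, then drop stop words."""
    tokens = [lemmatize(t) for t in tokenize(text)]
    return [t for t in tokens if t not in STOP_WORDS]

print(preprocess("She ran to the store as the movies ended."))
# prints ['she', 'run', 'store', 'movie', 'ended']
```

Note that a lookup table like this behaves like a lemmatizer; a stemmer would instead chop off suffixes by rule and would not map an irregular form like “ran” to “run”.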

Once that’s done, there are two basic approaches to sentiment analysis. The first is known as “rule-based” analysis. It involves taking your preprocessed textual data and comparing it against a pre-defined lexicon of words that have been tagged for sentiment.

If the word “happy” appears in your text it’ll be labeled “positive”, for example, and if the word “difficult” appears in your text it’ll be labeled “negative.”

(Rules-based sentiment analysis is more nuanced than what we’ve indicated here, but this is the basic idea.)
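
To make the basic idea concrete, here is a minimal sketch of a lexicon lookup. The lexicon below is a made-up handful of words (real ones contain thousands of scored entries), and the scoring rule, a simple sum, is an assumption chosen for brevity.

```python
import re

# Hypothetical sentiment lexicon; real lexicons are far larger.
LEXICON = {"happy": 1, "great": 1, "loved": 1, "remarkable": 1,
           "difficult": -1, "terrible": -1, "hated": -1}

def rule_based_sentiment(text):
    """Sum the lexicon scores of the words in the text and map to a label."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = sum(LEXICON.get(t, 0) for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(rule_based_sentiment("I am happy with this purchase."))  # positive
print(rule_based_sentiment("Setup was difficult."))            # negative

# The sarcastic movie review from earlier is scored on "great" and "loved" alone.
sarcastic = ("This movie was just great, I loved spending three hours of my "
             "Sunday afternoon following a story that could've been told in "
             "twenty minutes.")
print(rule_based_sentiment(sarcastic))                         # positive
```

The last call shows the weakness discussed above: the lookup has no way of knowing the praise is sarcastic.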

The second approach is based on machine learning. A sentiment analysis algorithm will be shown many examples of labeled sentiment data, from which it will learn a pattern that can be applied to new data the algorithm has never seen before.
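
As a rough illustration of learning from labeled examples, the sketch below implements a tiny naive Bayes classifier in plain Python. Naive Bayes is just one simple choice among many, the four training sentences are invented, and a real system would need far more data and a proper library.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def train(examples):
    """Count how often each word appears under each label."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    return word_counts, label_counts

def predict(text, word_counts, label_counts):
    """Return the label with the highest log-probability (add-one smoothing)."""
    vocab = {w for counts in word_counts.values() for w in counts}
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in tokenize(text):
            count = word_counts[label][token] + 1
            score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Made-up training data, far too small for real use.
TRAINING_DATA = [
    ("I love this product, it works great", "positive"),
    ("Fantastic support and a great experience", "positive"),
    ("Terrible quality, I want a refund", "negative"),
    ("The setup was awful and support was slow", "negative"),
]

model = train(TRAINING_DATA)
print(predict("great product", *model))   # positive
print(predict("awful quality", *model))   # negative
```

Neither test phrase appears verbatim in the training data; the classifier generalizes from the word counts it has seen.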

Of course, there are tradeoffs to both approaches. The rules-based approach is relatively straightforward, but is unlikely to be able to handle the sorts of subtleties that a really good machine-learning system can parse.

Machine learning is more powerful, but it’ll only be as good as the training data it has been given; what’s more, if you’ve built some monstrous deep neural network, it might fail in mysterious ways or otherwise be hard to understand.

Supercharge Your Contact Center with Generative AI

Like used car salesmen or college history teachers, contact center managers need to understand the ways in which technology will change their business.

Machine learning is one such profoundly impactful technology, and it can be used to automatically sort incoming messages by sentiment or priority and generally make your agents more effective.

Realizing this potential could be as difficult as hiring a team of expensive engineers and doing everything in-house, or as easy as getting in touch with us to see how we can integrate the Quiq conversational AI platform into your company.

If you want to get started quickly without spending a fortune, you won’t find a better option than Quiq.


4 Benefits of Using Generative AI to Improve Customer Experiences

Generative AI has captured the popular imagination and is already changing the way contact centers work.

One area in which it has enormous potential is also one that tends to be top of mind for contact center managers: customer experience.

In this piece, we’re going to briefly outline what generative AI is, then spend the rest of our time talking about how generative AI benefits can improve customer experience with personalized responses, endless real-time support, and much more.

What is Generative AI?

As you may have puzzled out from the name, “generative AI” refers to a constellation of different deep learning models used to dynamically generate output. This distinguishes them from other classes of models, which might be used to predict returns on Bitcoin, make product recommendations, or translate between languages.

The most famous example of generative AI is, of course, the large language model ChatGPT. After being trained on staggering amounts of textual data, it’s now able to generate extremely compelling output, much of which is hard to distinguish from actual human-generated writing.

Its success has inspired a panoply of competitor models from leading players in the space, including companies like Anthropic, Meta, and Google.

As it turns out, the basic approach underlying generative AI can be utilized in many other domains as well. After natural language, probably the second most popular way to use generative AI is to make images. DALL-E, MidJourney, and Stable Diffusion have proven remarkably adept at producing realistic images from simple prompts, and just this past week, Fable Studios unveiled their “Showrunner” AI, able to generate an entire episode of South Park.

But even this is barely scratching the surface, as researchers are also training generative models to create music, design new proteins and materials, and even carry out complex chains of tasks.

What is Customer Experience?

In the broadest possible terms, “customer experience” refers to the subjective impressions that your potential and current customers have as they interact with your company.

These impressions can be impacted by almost anything, including the colors and font of your website, how easy it is to find e.g. contact information, and how polite your contact center agents are in resolving a customer issue.

Customer experience will also be impacted by which segment a given customer falls into. Power users of your product might appreciate a bevy of new features, whereas casual users might find them disorienting.

Contact center managers must bear all of this in mind as they consider how best to leverage generative AI. In the quest to adopt a shiny new technology everyone is excited about, it can be easy to lose track of what matters most: how your actual customers feel about you.

Be sure to track metrics related to customer experience and customer satisfaction as you begin deploying large language models into your contact centers.

How is Generative AI For Customer Experience Being Used?

There are many ways in which generative AI is impacting customer experience in places like contact centers, which we’ll detail in the sections below.

Personalized Customer Interactions

Machine learning has a long track record of personalizing content. Netflix, to take a famous example, will uncover patterns in the shows you like to watch, and will use algorithms to suggest content that checks similar boxes.

Generative AI, and tools like the Quiq conversational AI platform that utilize it, are taking this approach to a whole new level.

Once upon a time, it was only a human being that could read a customer’s profile and carefully incorporate the relevant information into a reply. Today, a properly fine-tuned generative language model can do this almost instantaneously, and at scale.

From the perspective of a contact center manager who is concerned with customer experience, this is an enormous development. Besides the fact that prior generations of language models simply weren’t flexible enough to have personalized customer interactions, their language also tended to have an “artificial” feel. While today’s models can’t yet replace the all-elusive human touch, they can do a lot to make your agents far more effective in adapting their conversations to the appropriate context.

Better Understanding Your Customers and Their Journeys

Marketers, designers, and customer experience professionals have always been data enthusiasts. Long before we had modern cloud computing and electronic databases, detailed information on potential clients, customer segments, and market trends used to be printed out on dead trees, where it was guarded closely. With better data comes more targeted advertising, a more granular appreciation for how customers use your product and why they stop using it, and their broader motivations.

There are a few different ways in which generative AI can be used in this capacity. One of the more promising is by generating customer journeys that can be studied and mined for insight.

When you begin thinking about ways to improve your product, you need to get into your customers’ heads. You need to know the problems they’re solving, the tools they’ve already tried, and their major pain points. These are all things that some clever prompt engineering can elicit from ChatGPT.

We took a shot at generating such content for a fictional network-monitoring enterprise SaaS tool, and this was the result:

 

While these responses are fairly generic [1], notice that they do single out a number of really important details. These machine-generated journal entries bemoan how unintuitive a lot of monitoring tools are, how they’re not customizable, how they’re exceedingly difficult to set up, and how their endless false alarms are stretching the security teams thin.

It’s important to note that ChatGPT is not soon going to obviate your need to talk to real, flesh-and-blood users. Still, when combined with actual testimony, machine-generated entries like these can be a valuable aid in prioritizing your contact center’s work and alerting you to potential product issues you should be prepared to address.

Round-the-clock Customer Service

As science fiction movies never tire of pointing out, the big downside of fighting a robot army is that machines never need to eat, sleep, or rest. We’re not sure how long we have until the LLMs will rise up and wage war on humanity, but in the meantime, these are properties that you can put to use in your contact center.

With the power of generative AI, you can answer basic queries and resolve simple issues pretty much whenever they happen (which will probably be all the time), leaving your carbon-based contact center agents to answer the harder questions when they punch the clock in the morning after a good night’s sleep.

Enhancing Multilingual Support

Machine translation was one of the earliest use cases for neural networks and machine learning in general, and it continues to be an important function today. While ChatGPT was noticeably very good at multilingual translation right from the start, you may be surprised to know that it actually outperforms alternatives like Google Translate.

If your product doesn’t currently have a diverse global user base speaking many languages, it hopefully will soon, and that means you should start thinking about multilingual support. Not only will this boost table-stakes metrics like average handling time and resolutions per hour, it’ll also contribute to the more ineffable “customer satisfaction.” Nothing says “we care about making your experience with us a good one” like patiently walking a customer through a thorny technical issue in their native tongue.

Things to Watch Out For

Of course, for all the benefits that come from using generative AI for customer experience, it’s not all upside. There are downsides and issues that you’ll want to be aware of.

A big one is the tendency of large language models to hallucinate information. If you ask it for a list of articles to read about fungal computing (which is a real thing whose existence we discovered yesterday), it’s likely to generate a list that contains a mix of real and fake articles.

And because it’ll do so with great confidence and no formatting errors, you might be inclined to simply take its list at face value without double-checking it.

Remember, LLMs are tools, not replacements for your agents. Your agents need to be working with generative AI, checking its output, and incorporating it when and where appropriate.

There’s a wider danger that you will fail to use generative AI in the way that’s best suited to your organization. If you’re running a bespoke LLM trained on your company’s data, for example, you should constantly be feeding it new interactions as part of its fine-tuning, so that it gets better over time.

And speaking of getting better, sometimes machine learning models don’t get better over time. Owing to factors like changes in the underlying data, model performance can sometimes get worse over time. You’ll need a way of assessing the quality of the text generated by a large language model, along with a way of consistently monitoring it.

What are the Benefits of Generative AI for Customer Experience?

People are so excited about the potential of using generative AI for customer experience because there’s so much upside. Once you’ve got your model infrastructure set up, you’ll be able to answer customer questions at all times of the day or night, in any of a dozen languages, and with a personalization that was once only possible with an army of contact center agents.

But if you’re a contact center manager with a lot to think about, you probably don’t want to spend a bunch of time hiring an engineering team to get everything running smoothly. And, with Quiq, you don’t have to – you can leverage generative AI to supercharge your customer experience while leaving the technical details to us!

Schedule a demo to find out how we can bring this bleeding-edge technology into your contact center, without worrying about the nuts and bolts.

Footnotes
[1] It’s worth pointing out that we spent no time crafting the prompt, which was really basic: “I’m a product manager at a company building an enterprise SAAS tool that makes it easier to monitor system breaches and issues. Could you write me 2-3 journal entries from my target customer? I want to know more about the problems they’re trying to solve, their pain points, and why the products they’ve already tried are not working well.” With a little effort, you could probably get more specific complaints and more usable material.

Understanding the Risk of ChatGPT: What you Should Know

OpenAI’s ChatGPT burst onto the scene less than a year ago and has already seen use in marketing, education, software development, and at least a dozen other industries.

Of particular interest to us is how ChatGPT is being used in contact centers. Though it’s already revolutionizing contact centers by making junior agents vastly more productive and easing the burnout contributing to turnover, there are nevertheless many issues that a contact center manager needs to look out for.

That will be our focus today.

What are the Risks of Using ChatGPT?

In the following few sections, we’ll detail some of the risks of using ChatGPT. That way, you can deploy ChatGPT or another large language model with the confidence born of knowing what the job entails.

Hallucinations and Confabulations

By far the most well-known failure mode of ChatGPT is its tendency to simply invent new information. Stories abound of the model making up citations, peer-reviewed papers, researchers, URLs, and more. To take a recent well-publicized example, ChatGPT accused law professor Jonathan Turley of having behaved inappropriately with some of his students during a trip to Alaska.

The only problem was that Turley had never been to Alaska with any of his students, and the alleged Washington Post story which ChatGPT claimed had reported these facts had also been created out of whole cloth.

This is certainly a problem in general, but it’s especially worrying for contact center managers who may increasingly come to rely on ChatGPT to answer questions or to help resolve customer issues.

To those not steeped in the underlying technical details, it can be hard to grok why a language model will hallucinate in this way. The answer is: it’s an artifact of how large language models train.

ChatGPT learns how to output tokens from being trained on huge amounts of human-generated textual data. It will, for example, see the first sentences in a paragraph, and then try to output the text that completes the paragraph. The example below is the opening lines of J.D. Salinger’s The Catcher in the Rye. The model would see the first part of the passage and attempt to generate the rest itself:

“If you really want to hear about it, the first thing you’ll probably want to know is where I was born, and what my lousy childhood was like, and how my parents were occupied and all before they had me, and all that David Copperfield kind of crap, but I don’t feel like going into it, if you want to know the truth.”

Over many training runs, a large language model will get better and better at this kind of autocompletion work, until eventually it gets to the level it’s at today.

But ChatGPT has no native fact-checking abilities – it sees text and outputs what it thinks is the most likely sequence of additional words. Since it sees URLs, papers, citations, etc., during its training, it will sometimes include those in the text it generates, whether or not they’re appropriate (or even real).
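
A toy model makes this failure mode easier to see. The sketch below is emphatically not how ChatGPT is built (it is a neural network over tokens, not a table of word counts), and the three-sentence corpus is invented. It only shows how a system that picks the most likely next word will produce confident, plausible-looking text with no notion of whether it is true.

```python
from collections import Counter, defaultdict

def train_bigrams(corpus):
    """For each word, count which words follow it in the corpus."""
    followers = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            followers[current][nxt] += 1
    return followers

def complete(prompt, followers, max_words=5):
    """Greedily append the most frequent follower of the last word."""
    words = prompt.lower().split()
    for _ in range(max_words):
        options = followers.get(words[-1])
        if not options:
            break  # the model has never seen this word; stop generating
        words.append(options.most_common(1)[0][0])
    return " ".join(words)

# Made-up corpus for illustration.
CORPUS = [
    "the study was published in nature",
    "the study was published in science",
    "the study was published in nature last year",
]

model = train_bigrams(CORPUS)
print(complete("a new report was", model, max_words=3))
# prints: a new report was published in nature
```

The model knows nothing about any “new report”, yet it happily names a journal for it, simply because that is the continuation it has seen most often.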

Privacy

Another ongoing risk of using ChatGPT is the fact that it could potentially expose sensitive or private information. As things stand, OpenAI, the creators of ChatGPT, offer no robust privacy guarantees for any information placed into a prompt.

If you are trying to do something like named entity recognition or summarization on real people’s data, there’s a chance that it might be seen by someone at OpenAI as part of a review process. Alternatively, it might be incorporated into future training runs. Either way, the results could be disastrous.

But this is not all the information collected by OpenAI when you use ChatGPT. Your timezone, browser type and IP address, cookies, account information, and any communication you have with OpenAI’s support team are all collected, among other things.

In the information age we’ve become used to knowing that big companies are mining and profiting off the data we generate, but given how powerful ChatGPT is, and how ubiquitous it’s becoming, it’s worth being extra careful with the information you give its creators. If you feed it private customer data and someone finds out, that will be damaging to your brand.

Bias in Model Output

By now, it’s pretty common knowledge that machine learning models can be biased.

If you feed a large language model a huge amount of text data in which doctors are usually men and nurses are usually women, for example, the model will associate “doctor” with “maleness” and “nurse” with “femaleness.”

This is generally an artifact of the data the models were trained on, and is not due to any malfeasance on the part of the engineers. This does not, however, make it any less problematic.

There are some clever data manipulation techniques that are able to go a long way toward minimizing or even eliminating these biases, though they’re beyond the scope of this article. What contact center managers need to do is be aware of this problem, and establish monitoring and quality-control checkpoints in their workflow to identify and correct biased output in their language models.

Issues Around Intellectual Property

Earlier, we briefly described the training process for a large language model like ChatGPT (you can find much more detail here.) One thing to note is that the model doesn’t provide any sort of citations for its output, nor any details as to how it was generated.

This has raised a number of thorny questions around copyright. If a model has ingested large amounts of information from the internet, including articles, books, forum posts, and much more, is there a sense in which it has violated someone’s copyright? What about if it’s an image-generation model trained on a database of Getty Images?

By and large, we tend to think this is the sort of issue that isn’t likely to plague contact center managers too much. It’s more likely to be a problem for, say, songwriters who might be inadvertently drawing on the work of other artists.

Nevertheless, a piece on the potential risks of ChatGPT wouldn’t be complete without a section on this emerging problem, and it’s certainly something that you should be monitoring in the background in your capacity as a manager.

Failure to Disclose the Use of LLMs

Finally, there has been a growing tendency to make it plain that LLMs have been used in drafting an article or a contract, if indeed they were part of the process. To the best of our knowledge, there are not yet any laws in place mandating that this has to be done, but it might be wise to include a disclaimer somewhere if large language models are being used consistently in your workflow. [1]

That having been said, it’s also important to exercise proactive judgment in deciding whether an LLM is appropriate for a given task in the first place. In early 2023, Peabody College at Vanderbilt University landed in hot water when it disclosed that it had used ChatGPT to draft an email about a mass shooting that had taken place at Michigan State.

People may not care much about whether their search recommendations were generated by a machine, but it would appear that some things are still best expressed by a human heart.

Again, this is unlikely to be something that a contact center manager faces much in her day-to-day life, but incidents like these are worth understanding as you decide how and when to use advanced language models.


Mitigating the Risks of ChatGPT

From the moment it was released, it was clear that ChatGPT and other large language models were going to change the way contact centers run. They’re already helping agents answer more queries, utilize knowledge spread throughout the center, and automate substantial portions of work that were once the purview of human beings.

Still, challenges remain. ChatGPT will plainly make things up, and can be biased or harmful in its text. Private information fed into its interface will be visible to OpenAI, and there’s also the wider danger of copyright infringement.

Many of these issues don’t have simple solutions, and will instead require a contact center manager to exercise both caution and continual diligence. But one place where she can make her life much easier is by using a powerful, out-of-the-box solution like the Quiq conversational AI platform.

While you’re worrying about the myriad risks of using ChatGPT, you don’t want to be contending with a million little technical details as well, so schedule a demo with us to find out how our technology can bring cutting-edge language models to your contact center, without the headache.

Footnotes
[1] NOTE: This is not legal advice.


The Ongoing Management of an LLM Assistant

Technologies like large language models (LLMs) are amazing at rapidly generating polite text that helps solve a problem or answer a question, so they’re a great fit for the work done at contact centers.

But this doesn’t mean that using them is trivial or easy. There are many challenges associated with the ongoing management of an LLM assistant, including hallucinations and the emergence of bad behavior – and that’s not even mentioning the engineering prowess required to fine-tune and monitor these systems.

All of this must be borne in mind by contact center managers, and our aim today is to facilitate this process.

We’ll provide broad context by talking about some of the basic ways in which large language models are being used in business, discuss setting up an LLM assistant, and then enumerate some of the specific steps that need to be taken in using them properly.

Let’s go!

How Are LLMs Being Used in Science and Business?

First, let’s adumbrate some of the ways in which large language models are being utilized on the ground.

The most obvious way is by acting as a generative AI assistant. One of the things that so stunned early users of ChatGPT was its remarkable breadth in capability. It could be used to draft blog posts and web copy, translate between languages, and write or explain code.

This alone makes it an amazing tool, but it has since become obvious that it’s useful for quite a lot more.

One thing that businesses have been experimenting with is fine-tuning large language models like ChatGPT over their own documentation, turning it into a simple interface by which you can ask questions about your materials.

It’s hard to quantify precisely how much time contact center agents, engineers, or other people spend hunting around for the answer to a question, but it’s surely quite a lot. What if instead you could just, y’know, ask for what you want, in the same way that you would ask a human being?

Well, ChatGPT is a long way from being a full person, but when properly trained it can come close where question-answering is concerned.

Stepping back a little bit, LLMs can be prompt engineered into a number of useful behaviors, all of which redound to the benefit of the contact centers which use them. Imagine having an infinitely patient Socratic tutor that could help new agents get up to speed on your product and process, or crafting it into a powerful tool for brainstorming new product designs.

There have also been some promising attempts to extend the functionality of LLMs by making them more agentic – that is, by embedding them in systems that allow them to carry out more open-ended projects. AutoGPT, for example, pairs an LLM with a separate bot that hits the LLM with a chain of queries in the pursuit of some goal.

AssistGPT goes even further in the quest to augment LLMs by integrating them with a set of tools that allow them to achieve objectives involving images and audio in addition to text.

How to Set Up An LLM Assistant

Next, let’s turn to a discussion of how to set up an LLM assistant. Covering this topic fully is well beyond the scope of this article, but we can make some broad comments that will nevertheless be useful for contact center managers.

First, there’s the question of which large language model you should use. In the beginning, ChatGPT was pretty much the only foundation model on offer. Today, however, that situation has changed, and there are now foundation models from Anthropic, Meta, and many other companies.

One of the biggest early decisions you’ll have to make is whether you want to try and use an open-source model (for which the code and the model weights are freely available) or a closed-source model (for which they are not).

If you go the closed-source route you’ll almost certainly be hitting the model over an API, feeding it your queries and getting its responses back. This is orders of magnitude simpler than provisioning an open-source model, but it means that you’ll also be beholden to the whims of some other company’s engineering team. They may update the model in unexpected ways, or simply go bankrupt, and you’ll be left with no recourse.

Using an open-source alternative, of course, means grabbing the other horn of the dilemma. You’ll have visibility into how the model works and will be free to modify it as you see fit, but this won’t be worth much unless you’re willing to devote engineering hours to the task.

Then, there’s the question of fine-tuning large language models. While ChatGPT and LLMs more generally are quite good on their own, having them answer questions about your product or respond in particular ways means modifying their behavior somehow.

Broadly speaking, there are two ways of doing this, which we’ve mentioned throughout: proper fine-tuning, and prompt engineering. Let’s dig into the differences.

Fine-tuning means showing the model many (i.e. several hundred) examples of the behaviors you want to see, which changes its internal weights and biases it towards those behaviors in the future.

Prompt engineering, on the other hand, refers to carefully structuring your prompts to elicit the desired behavior. These LLMs can be surprisingly sensitive to little details in the instructions they’re provided, and prompt engineers know how to phrase their requests in just the right way to get what they need.

There is also some middle ground between these approaches. “One-shot learning” is a form of prompt engineering in which the prompt contains a single example of the desired behavior, while “few-shot learning” refers to including a handful of examples, often between three and five.
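
The difference between one-shot and few-shot prompts is easiest to see side by side. The helper below is a hypothetical sketch: the `build_prompt` function, the “Message:”/“Sentiment:” layout, and the example messages are all our own inventions, not a format required by any particular model.

```python
def build_prompt(instruction, examples, query):
    """Assemble a prompt with zero or more worked examples before the query."""
    parts = [instruction, ""]
    for customer_message, label in examples:
        parts.append(f"Message: {customer_message}")
        parts.append(f"Sentiment: {label}")
        parts.append("")
    parts.append(f"Message: {query}")
    parts.append("Sentiment:")
    return "\n".join(parts)

INSTRUCTION = "Classify the sentiment of each customer message as positive or negative."

# Made-up examples of the desired behavior.
EXAMPLES = [
    ("My order arrived two days early, thank you!", "positive"),
    ("I have been on hold for an hour.", "negative"),
    ("The agent fixed my issue in minutes.", "positive"),
]

# One-shot: a single worked example. Few-shot: all three.
one_shot = build_prompt(INSTRUCTION, EXAMPLES[:1], "The app keeps crashing.")
few_shot = build_prompt(INSTRUCTION, EXAMPLES, "The app keeps crashing.")
print(few_shot)
```

Either string would then be sent to the model as its prompt; no model weights change, which is what separates this from fine-tuning.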

Contact center managers thinking about using LLMs will need to think about these implementation details. If you plan on only lightly using ChatGPT in your contact center, a basic course on prompt engineering might be all you need. If you plan on making it an integral part of your organization, however, that most likely means a fine-tuning pipeline and serious technical investment.

The Ongoing Management of an LLM

Having said all this, we can now turn to the day-to-day details of managing an LLM assistant.

Monitoring the Performance of an LLM

First, you’ll need to continuously monitor the model. As hard as it may be to believe given how perfect ChatGPT’s output often is, there isn’t a person somewhere typing the responses. ChatGPT is very prone to hallucinations, in which it simply makes up information, and LLMs more generally can sometimes fall into using harmful or abusive language if they’re prompted incorrectly.

This can be damaging to your brand, so it’s important that you keep an eye on the language created by the LLMs your contact center is using.

And of course, not even LLMs can obviate the need to track the all-important key performance indicators. So far, there’s been one major study on generative AI in contact centers that found they increased productivity and reduced turnover, but you’ll still want to measure customer satisfaction, average handle time, etc.

There’s always a temptation to jump on a shiny new technology (remember the blockchain?), but you should only be using LLMs if they actually make your contact center more productive, and the only way you can assess that is by tracking your figures.

Iterative Fine-Tuning and Training

We’ve already had a few things to say about fine-tuning and the related discipline of prompt engineering, and here we’ll build on those preliminary comments.

The big thing to bear in mind is that fine-tuning a large language model is not a one-and-done kind of endeavor. You’ll find that your model’s behavior will drift over time (the technical term is “model degradation”), and this means you will likely have to periodically re-train it.
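
One simple way to notice this kind of drift is to compare recent quality scores against an earlier baseline. The sketch below assumes you already have some periodic score between 0 and 1 (for instance, the share of responses reviewers rated acceptable); the scores, window sizes, and tolerance are invented for illustration.

```python
from statistics import mean

def detect_drift(scores, baseline_size=5, window_size=5, tolerance=0.05):
    """Return True if the recent average falls below the baseline by more than tolerance."""
    if len(scores) < baseline_size + window_size:
        return False  # not enough data to compare
    baseline = mean(scores[:baseline_size])
    recent = mean(scores[-window_size:])
    return (baseline - recent) > tolerance

# Hypothetical weekly quality scores.
WEEKLY_SCORES = [0.92, 0.91, 0.93, 0.90, 0.92, 0.89, 0.85, 0.80, 0.78, 0.75]

print(detect_drift(WEEKLY_SCORES))   # True: recent weeks are well below the baseline
print(detect_drift([0.9] * 10))      # False: performance is stable
```

A check like this only tells you that something changed; deciding whether to re-train still takes human judgment.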

It’s also common to offer the model “feedback”, e.g. by ranking its responses or indicating when you did or did not like a particular output. You’ve probably heard of reinforcement learning from human feedback, which is one version of this process, but there are also others you can use.

Quality Assurance and Oversight

A related point is that your LLMs will need consistent oversight. They’re not going to voluntarily improve on their own (they’re algorithms with no personal initiative to speak of), so you’ll need to check in routinely to make sure they’re performing well and that your agents are using them responsibly.

There are many parts to this, including checks on the model’s outputs and an audit process that allows you to track down any issues. If you suddenly see a decline in performance, for example, you’ll need to quickly figure out whether it’s isolated to one agent or part of a larger pattern. If it’s the former, was it a random aberration, or did the agent go “off script” in a way that caused the model to behave poorly?
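
The “one agent or a larger pattern” question can be given a first-pass answer by grouping quality scores by agent. This is a minimal sketch under assumed inputs: the record format, agent names, scores, and the 0.7 threshold are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

def low_scoring_agents(records, threshold):
    """Group quality scores by agent and return agents whose average is below threshold."""
    by_agent = defaultdict(list)
    for agent, score in records:
        by_agent[agent].append(score)
    return sorted(a for a, s in by_agent.items() if mean(s) < threshold)

def diagnose(records, threshold=0.7):
    """Classify a performance decline as absent, isolated to one agent, or widespread."""
    flagged = low_scoring_agents(records, threshold)
    agents = {agent for agent, _ in records}
    if not flagged:
        return "no issue"
    if len(flagged) == 1 and len(agents) > 1:
        return f"isolated to {flagged[0]}"
    return "wider pattern"

# Hypothetical audit log of (agent, quality score for one LLM-assisted conversation).
AUDIT_LOG = [
    ("ana", 0.90), ("ana", 0.85),
    ("ben", 0.50), ("ben", 0.60),
    ("cho", 0.88),
]

print(diagnose(AUDIT_LOG))   # isolated to ben
```

From there, the follow-up questions in the paragraph above (random aberration, or an agent going “off script”?) still need a person to review the conversations themselves.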

Take another scenario, in which an end-user was shown inappropriate text generated by an LLM. In this situation, you’ll need to take a deeper look at your process. If there were agents interacting with this model, ask them why they failed to spot the problematic text and stop it being shown to a customer. Or, if it came from a mostly-automated part of your tech stack, you need to uncover the reasons for which your filters failed to catch it, and perhaps think about keeping humans more in the loop.

The Future of LLM Assistants

Though the future is far from certain, we tend to think that LLMs have left Pandora’s box for good. They’re incredibly powerful tools which are poised to transform how contact centers and other enterprises operate, and experiments so far have been very promising; for all these reasons, we expect that LLMs will become a steadily more important part of the economy going forward.

That said, the ongoing management of an LLM assistant is far from trivial. You need to be aware at all times of how your model is performing and how your agents are using it. Though it can make your contact center vastly more productive, it can also lead to problems if you’re not careful.

That’s where the Quiq platform comes in. Our conversational AI is some of the best that can be found anywhere, able to facilitate customer interactions, automate text-message follow-ups, and much more. If you’re excited by the possibilities of generative AI but daunted by the prospect of figuring out how TPUs and GPUs are different, schedule a demo with us today.

How Do You Train Your Agents in a ChatGPT World?

There’s long been an interest in using AI for educational purposes. Technologist Danny Hillis has spent decades dreaming of a digital “Aristotle” that would teach everyone in the way that the original Greek wunderkind once taught Alexander the Great, while modern companies have leveraged computer vision, machine learning, and various other tools to help students master complex concepts in a variety of fields.

Still, almost nothing has sparked the kind of enthusiasm for AI in education that ChatGPT and large language models more generally have given rise to. From the first, its human-level prose, knack for distilling information, and wide-ranging abilities made it clear that it would be extremely well-suited for learning.

But that still leaves the question of how. How should a contact center manager prepare for AI, and how should she change the way she trains her agents?

In our view, this question can be understood in two different, related ways:

  1. How can ChatGPT be used to help agents master skills related to their jobs?
  2. How can they be trained to use ChatGPT in their day-to-day work?

In this piece, we’ll take up both of these issues. We’ll first provide a general overview of the ways in which ChatGPT can be used for both education and training, then turn to the question of the myriad ways in which contact center agents can be taught to use this powerful new technology.

How is ChatGPT Used in Education and Training?

First, let’s get into some of the early ways in which ChatGPT is changing education and training.

NOTE: Our comments here are going to be fairly broad, covering some areas that may not be immediately applicable to the work contact center agents do. The main reason for this is that it’s very difficult to forecast how AI is going to change contact center work.

Our section on “creating study plans and curricula”, for example, might not be relevant to today’s contact center agents. But it could become important down the road if AI gives rise to more autonomous workflows in the future, in which case we expect that agents would be given more freedom to use AI and similar tools to learn the job on their own.

We pride ourselves on being forward-looking and forward-thinking here at Quiq, and we structure our content to reflect this.

Making a Socratic Tutor for Learning New Subjects

The Greek philosopher Socrates famously pioneered the instructional methodology which bears his name. Mostly, the Socratic method boils down to continuously asking targeted questions until areas of confusion emerge, at which point they’re vigorously investigated, usually in a small group setting.

A well-known illustration of this process is found in Plato’s Republic, which starts with an attempt to define “justice” and then expands into a much broader conversation about the best way to run a city and structure a social order.

ChatGPT can’t replace all of this on its own, of course, but with the right prompt engineering, it does a pretty good job. This method works best when paired with a primary source, such as a textbook, which will allow you to double-check ChatGPT’s questions and answers.

Having it Explain Code or Technical Subjects

A related area in which people are successfully using ChatGPT is in having it walk you through a tricky bit of code or a technical concept like “inertia”.

The more basic and fundamental, the better. In our experience so far, ChatGPT has almost never failed in correctly explaining simple Python, Pandas, or Java. It did falter when asked to produce code that translates between different orbital reference frames, however, and it had no idea what to do when we asked it about a fairly recent advance in the frontiers of battery chemistry.

There are a few different reasons that we advise caution if you’re a contact center agent trying to understand some part of your product’s codebase. For one thing, if the product is written in a less-common language ChatGPT might not be able to help much.

But even more importantly, you need to be extremely careful about what you put into it. There have already been major incidents in which proprietary code and company secrets were leaked when developers pasted them into the ChatGPT interface, which is visible to the OpenAI team.

Conversely, if you’re managing teams of contact center agents, you should begin establishing a policy on the appropriate uses of ChatGPT in your contact center. If your product is open-source there’s (probably) nothing to worry about, but otherwise, you need to proactively instruct your agents on what they can and cannot use the tool to accomplish.

Rewriting Explanations for Different Skill Levels

Wired has a popular YouTube series called “5 levels”, where experts in quantum computing or the blockchain will explain their subject at five different skill levels: “child”, “teen”, “college student”, “grad student”, and a fellow “expert.”

One thing that makes this compelling to beginners and pros alike is seeing the same idea explored across such varying contexts – seeing what gets emphasized or left out, or what emerges as you gradually climb up the ladder of complexity and sophistication.

This, too, is a place where ChatGPT shines. You can use it to provide explanations of concepts at different skill levels, which will ultimately improve your understanding of them.

For a contact center manager, this means that you can gradually introduce ideas to your agents, starting simply and then fleshing them out as the agents become more comfortable.
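
As a rough sketch of how this might be scripted, the Python snippet below builds one prompt per skill level for a given concept. The level names follow the Wired series described above; the prompt wording, the 150-word limit, and the `build_prompts` helper are our own assumptions, and you would still need to send each prompt to whichever model you use.

```python
LEVELS = ["child", "teen", "college student", "grad student", "expert"]

def build_prompts(concept, levels=LEVELS):
    """Return a dict mapping each skill level to a prompt asking for an explanation."""
    prompts = {}
    for level in levels:
        prompts[level] = (
            f"Explain {concept} for the following audience: {level}. "
            "Use vocabulary and examples suited to that audience, "
            "and keep the explanation under 150 words."
        )
    return prompts

prompts = build_prompts("our product's return policy")
for level, prompt in prompts.items():
    print(f"[{level}] {prompt}")
```

Keeping the levels in a single list makes it easy to trim them down, for instance to “new hire” and “senior agent”, if five tiers is more than your training program needs.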

Creating Study Plans and Curricula

Stepping back a little bit, ChatGPT has been used to create entire curricula and even daily study plans for studying Spanish, computer science, medicine, and various other fields.

As we noted at the outset, we expect it will be a little while before contact center agents are using ChatGPT for this purpose, as most centers likely have robust training materials they like to use.

Nevertheless, we can project a future in which these materials are much more bare-bones, perhaps consisting of some general notes along with prompts that an agent-in-training can use to ask questions of a model trained on the company’s documentation, test themselves as they go, and gradually build skill.

Training Agents to Use ChatGPT

Now that we’ve covered some of the ways in which present and future contact center agents might use ChatGPT to boost their own on-the-job learning, let’s turn to the other issue we want to tackle today: how do you train agents to use ChatGPT?

Getting Set Up With ChatGPT (and its Plugins)

First, let’s talk about how you can start using ChatGPT.

This section may end up seeming a bit anticlimactic because, honestly, it’s pretty straightforward. Today, you can get access to ChatGPT by going to the signup page. There’s a free version and a paid version that’ll set you back a whopping $20/month (which is a pretty small price to pay for access to one of the most powerful artifacts the human race has ever produced, in our opinion).

As things stand, the free tier gives you access to GPT-3.5, while the paid version gives you the choice to switch to GPT-4 if you want the more powerful foundational model.

A paid account also gives you access to the growing ecosystem of ChatGPT plugins. You access the ChatGPT plugins by switching over to the GPT-4 option.

There are plugins that allow ChatGPT to browse the web, let you directly edit diagrams or talk with PDF documents, or let you offload certain kinds of computations to the Wolfram platform.

Contact center agents may or may not find any of these useful right now, but we predict there will be a lot more development in this space going forward, so it’s something managers should know about.

Best Practices for Combining Human and AI Efforts

People have long been fascinated and terrified by automation, but so far, machines have only ever augmented human labor. Knowing when and how to offload work to ChatGPT requires knowing what it’s good for.

Large language models learn how to predict the next token from their training data, and are therefore very good at developing rough drafts, outlines, and more routine prose. You’ll generally find it necessary to edit their output fairly heavily in order to account for context and so that it fits stylistically with the rest of your content.

As a manager, you’ll need to start thinking about a standard policy for using ChatGPT. Any factual claims made by the model, especially any references or citations, need to be checked very carefully.

Scenario-Based Training

In this same vein, you’ll want to distinguish between different scenarios in which your agents will end up using generative AI. There are different considerations in using Quiq Compose or Quiq Suggest to format helpful replies, for example, than in using generative AI to translate between different languages.

Managers will probably want to sit down and brainstorm different scenarios and develop training materials for each one.

Ethical and Privacy Considerations

The rise of generative AI has sparked a much broader conversation about privacy, copyright, and intellectual property.

Much of this isn’t particularly relevant to contact center managers, but one thing you definitely should be paying attention to is privacy. Your agents should never put real customer data into ChatGPT; they should use aliases and fake data whenever they’re trying to resolve a particular issue.
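
To make this concrete, here is a minimal sketch of scrubbing a message before it is pasted into a chat tool. It only catches email addresses and phone-like numbers using two regular expressions that we wrote for this example; names, addresses, and account details would slip through, so treat it as a starting point rather than a complete safeguard.

```python
import re

# Illustrative patterns only; real-world personal data takes many more forms.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

def redact(text):
    """Replace email addresses and phone-like numbers with placeholder tokens."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    return text

message = "Customer jane.doe@example.com called from +1 (555) 010-9999 about order 12345."
print(redact(message))
# Customer [EMAIL] called from [PHONE] about order 12345.
```

Note that the short order number is left alone because the phone pattern requires a longer run of digits; whether order numbers count as sensitive is a policy decision for your own contact center.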

To quote fictional chemist and family man Walter White, we advise you to tread lightly here. Data breaches are a huge and ongoing problem, and they can do substantial damage to your brand.

ChatGPT and What it Means for Training Contact Center Agents

ChatGPT and related technologies are poised to change education and training. They can be used to help get agents up to speed or to work more efficiently, and they, in turn, require a certain amount of instruction to use safely.

These are all things that contact center managers need to worry about, but one thing you shouldn’t spend your time worrying about is the underlying technology. The Quiq conversational AI platform allows you to leverage the power of language models for contact centers, without looking at any code more complex than an API call. If the possibilities of this new frontier intrigue you, schedule a demo with us today!

How Can AI Make Agents More Efficient?

From the invention of writing to quantum computing, emerging technologies have always had a profound impact on the way we work. New tools mean new products and services, new organizational structures, whole new markets, and sometimes even new methods of thought.

These days, the big news is coming out of artificial intelligence. Specifically, the release of ChatGPT has made it possible for everyone to try out an advanced AI application for the first time, and it has ignited a firestorm of speculation as to how industries ranging from medicine to copywriting might be transformed.

In this piece, we’re going to try to cut through the hype to give contact center managers some much-needed clarity. We’ll discuss what AI is useful for, how it will change how contact center agents function daily, and what tools they should investigate to get the most out of AI.

What Is AI Useful For?

Artificial intelligence is a pretty broad category, encompassing everything from the most basic linear regressions to the remarkable sophistication of deep reinforcement learning agents.

This is too much territory to cover in a single blog post, but we can nevertheless make some useful general comments.

The way we see it, there are essentially two ways that AI is useful: it can either completely replace a human for certain tasks, allowing them to shift their focus to higher-value work, or it can augment their process, allowing them to reach insights or achieve objectives that would’ve taken much longer otherwise.

Take the example of ChatGPT, a large language model trained on huge quantities of human-generated text that is able to write poetry, generate math proofs, create functioning code, and much more.

For certain tasks – like generating blog post titles or short email blasts – ChatGPT is good enough to supplant humans altogether. But if you’re trying to learn a complex subject like organic chemistry, it’s best to treat ChatGPT more like a conversational partner. You can ask it questions or use it to test your understanding of a concept, but you have to be careful with its output because it might be hallucinating or otherwise getting important facts wrong. [1]

Since ChatGPT and large language models more generally are what everyone is focused on at the moment, it’s what we’ll be discussing throughout this essay.

How is AI Changing How Contact Center Agents Work?

As soon as ChatGPT was released it spawned an unending stream of hot takes, from “this is going to completely automate the entire economy” to “this is going to be a huge flop that no one finds particularly useful.”

Recently, a study by Erik Brynjolfsson, Danielle Li, and Lindsey R. Raymond called “Generative AI at Work” examined how LLMs are being used in contact centers. They found that both these perspectives were wrong: generative AI was not completely automating contact centers but was proving enormously helpful in making contact centers more efficient.

Specifically, LLMs were able to capture some of the conversational patterns and general tacit knowledge held by more senior agents and transfer it to more junior agents. The result was more productivity among these less experienced workers, less overall turnover, and a better customer experience.

To help flesh this picture out, we’ll now turn to examining some specific ways this works.

Large Language Models are Helping Agents Work Faster

There are a few ways that LLMs are helping agents get their jobs done more quickly and efficiently.

One is by helping them cut down on typing by providing contextually appropriate responses to customer questions, which is exactly what Quiq Compose does.

Quiq Compose learns from interactions between contact center agents and customers. It can take a barebones outline of a reply (“Nope, you waited too long to return the product…”) and flesh it out into a full, coherent, grammatical response (“I’m so sorry to hear that the product isn’t working as intended…”).

Quiq Suggest also learns from multiple agent-customer interactions, but it offers real-time suggestions. As your contact center agents begin typing their responses, the underlying model offers a robust form of autocomplete to help them craft replies more quickly. This lets agents spend up to 30% less time hunting around for information and tweaking their language to be both polite and informative.

What’s more, because Quiq Suggest leverages lightweight “edge” language models trained on a specific company’s data, it’s able to run very quickly.

Another way you can reduce agent handling time is by simply cutting down on the amount of text a given agent has to process. In the course of resolving an issue, there will usually be some extraneous text, like “Thanks!” or “Have a good day!” When Quiq’s conversational AI platform sees these unimportant messages, it automatically filters them and tacks them on to the end of the transcript.
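
The filtering step described above could be approximated as follows. This is a simplified sketch, not Quiq’s actual implementation: the phrase list and the exact-match rule are assumptions, and a production system would likely rely on something more robust than a fixed set of strings.

```python
import string

# Hypothetical list of low-information phrases, chosen for illustration.
TRIVIAL_PHRASES = {"thanks", "thank you", "ok", "okay", "have a good day", "bye"}

def is_trivial(message):
    """True if the message, lowercased and stripped of punctuation, is a known filler phrase."""
    normalized = message.lower().translate(str.maketrans("", "", string.punctuation))
    return " ".join(normalized.split()) in TRIVIAL_PHRASES

def reorder_transcript(messages):
    """Keep substantive messages in their original order and move trivial ones to the end."""
    substantive = [m for m in messages if not is_trivial(m)]
    trivial = [m for m in messages if is_trivial(m)]
    return substantive + trivial

transcript = ["My order arrived damaged.", "Thanks!", "Can I get a refund?", "Have a good day!"]
print(reorder_transcript(transcript))
# ['My order arrived damaged.', 'Can I get a refund?', 'Thanks!', 'Have a good day!']
```

Moving the filler to the end, rather than deleting it, keeps the full record available while letting the agent read the substantive messages first.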

Finally, a lot of friction and information loss can occur when a conversation is transferred between agents, or from an AI to a human agent. This is where conversation summarization comes in handy. By automatically summarizing the interaction so far, these transfers can take less time and energy, which also contributes to lower agent burnout and higher customer satisfaction.

Large Language Models Can Provide 24/7 Customer Support

There’s a fundamental asymmetry in running a great contact center, inasmuch as problems can occur around the clock but your agents need to sleep, rest, and play frisbee golf.

Unless, of course, some of your agents aren’t human. One of the great advantages of computers and algorithms is that they have none of the human frailties that prevent us all from working every hour of the day. They have no need for sleep, bathroom breaks, or recreation.

If you’re using a powerful conversational AI platform like Quiq, you can have AI agents deployed every hour, day or night, answering questions, completing tasks, and resolving problems.

Of course, the technology is not yet good enough to handle everything a contact center agent would handle, and some issues will have to be postponed until the humans punch the clock. Still, with the right tools, your operation can constantly be moving forward.

Large Language Models Can Help With Documentation

Writing documentation is one of those crucial, un-sexy tasks that businesses ignore at their own peril. Everyone wants to be coding up a blockchain or demo-ing a shiny new application to well-heeled investors, but someone needs to be sitting and writing up product specs, troubleshooting workflows, and all the other text that helps an organization function effectively.

This, too, is something that AI can help with. Whether it’s brainstorming an outline, identifying common sticking points, or even writing the document wholesale, more and more technical organizations are exploring LLMs to speed up their documentation efforts.

Just remember that LLMs like ChatGPT are extremely prone to hallucinations, so carefully fact-check everything they produce before you add it to your official documentation.

Large Language Models Can Help With Marketing

A final place where AI is proving incredibly useful is in marketing. Whether or not your agents have any input into your marketing depends on how you run your contact center, but this piece wouldn’t be complete without at least briefly touching upon marketing.

One obvious way that this can work is by having ChatGPT generate headlines, subject lines, Tweets, or even SEO-optimized blog posts.

But this is not the only way AI can be used in marketing. One very clever use of the technology that we’ve encountered is having ChatGPT generate customer journeys or customer diary entries. If your product is targeting men in their 40s who aren’t crushing life the way they used to, for example, it can create a month’s worth of forum posts from your target buyers discussing their lack of drive and motivation. This, in turn, will furnish targeted language you can use in your copy.

But bear in mind that marketing is one of those things that’s just incredibly subtle. It takes all of 30 seconds to come up with a few headlines for an email, but the difference between an okay headline and an extraordinary one can be a single word. Here, as elsewhere, it’s wise to have the final word remain with the humans.

Working more Quiq-ly

The world is changing, and contact centers are changing along with it. If you expect to retain a competitive edge and a top-notch contact center, you’ll need to utilize the latest technologies.

One way you could do this is by paying an expensive engineering team to build your own LLMs and AI tooling. But a much easier way is to integrate our Quiq conversational AI platform into your contact center. Whether it’s automatic summarization, filtering trivial messages, or using Quiq Suggest and Quiq Compose to cut down on average handle time, we have a product that will streamline your operation. Schedule a demo with us today to see how we can help you!

Footnotes
[1] You could argue that both of these examples boil down to the same thing. That is, even when you treat ChatGPT as a sounding board you’re really just replacing a human being that could’ve performed the same function. This is a plausible point of view, but we still think it’s useful to distinguish between “ChatGPT acting like a total replacement for a human for certain boilerplate tasks” and “ChatGPT augmenting a human’s workflow by acting like an idea generator or conversational partner.” Reasonable people could disagree on this, and your mileage may vary.
