Transforming Business with a Voice AI Platform

The CX Leaders’ Guide to Crushing the Competition

As customer expectations continue to rise, businesses are under increasing pressure to provide fast, efficient, and personalized service around the clock. For CX leaders, keeping up with these demands can be challenging—especially when scaling operations while maintaining high-quality support. This is where a Voice AI platform can give you the edge you need to transform your customer service strategy.

Voice AI is a powerful technology that is rapidly changing how businesses interact with their customers. From automating routine tasks to delivering personalized experiences at scale, Voice AI is truly changing the game in customer service.

This article explores how Voice AI works, why it’s more effective than traditional human agents in many areas, and the numerous benefits it offers businesses today. Whether you’re just beginning to explore Voice AI or looking to enhance your existing systems, this guide will provide valuable insights into how this technology can revolutionize your approach to customer service and help you stay ahead of the competition.

What is Voice AI and how does it work?

Voice AI refers to technology that uses artificial intelligence to understand, process, and respond to human speech. It enables machines to engage in natural, spoken conversations with users, typically using tools like automatic speech recognition (ASR), natural language processing (NLP), and text-to-speech (TTS) to convert spoken language into text, interpret intent, and generate verbal responses. Voice AI is commonly used in applications like virtual assistants, customer service, and AI agents.

A Voice AI platform is a comprehensive system that enables businesses to build, deploy, and manage voice-based interactions powered by artificial intelligence. It integrates all the technologies mentioned above into a single solution, providing tools to design conversational flows, customize voice AI agents, handle customer queries, and gather analytics to improve performance. Voice AI platforms are now being used to power virtual assistants, customer service operations, and contact centers, improving the customer experience and changing the way customers interact with businesses.

Voice AI enables machines to understand, interpret, and respond to human speech in a natural, conversational way. Unlike the early days of chatbots and voice systems, today’s Voice AI agents are powered by advanced technologies like Large Language Models (LLMs), such as GPT and many others, which allow them to comprehend and respond to human language with impressive accuracy.

LLMs are trained on massive amounts of data, allowing them to recognize patterns in speech, understand different languages and dialects, and predict what a user is asking—even if the question is phrased unexpectedly. This is a major leap from traditional, rules-based systems, which required rigid, predefined scripts to function. Now, Voice AI agents can understand intent and context, making conversations feel much more natural and intuitive.

When a customer speaks to a Voice AI agent, the process begins with Automatic Speech Recognition (ASR), which converts spoken language into text. Once the words are captured, Natural Language Processing (NLP) kicks in to interpret the meaning behind the words, analyzing context and intent to determine the best response. From there, the system generates a response using the LLM’s understanding of language, and finally, the text is converted back into speech with Text-to-Speech (TTS) technology. All this happens in real time through a conversational voice AI platform, allowing the customer to interact with the AI as if they were speaking to a human agent.
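To make this pipeline concrete, here is a minimal Python sketch of a single conversational turn. It's a sketch only: transcribe, llm_complete, and synthesize are hypothetical placeholders for the ASR, LLM, and TTS components, not any specific vendor's API.

def handle_voice_turn(audio_chunk, conversation_history):
    # 1. ASR: convert the caller's speech into text (hypothetical helper)
    user_text = transcribe(audio_chunk)
    conversation_history.append({"role": "user", "content": user_text})

    # 2. NLP/LLM: interpret intent in context and generate a reply (hypothetical helper)
    reply_text = llm_complete(conversation_history)
    conversation_history.append({"role": "assistant", "content": reply_text})

    # 3. TTS: convert the reply text back into audio for the caller (hypothetical helper)
    return synthesize(reply_text)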

For CX leaders, this changes everything. Voice AI agents don’t just hear words—they understand them. That understanding lets them handle a wide range of customer inquiries, from simple requests like checking order status to more complex tasks such as troubleshooting issues or answering product questions. And because these AI agents constantly learn from each interaction, they continue to improve over time, becoming more efficient and effective with each use.

How Voice AI compares to human agents

One of the most common questions CX leaders ask is how Voice AI compares to human agents. While it’s true that AI agents can’t fully replace the human touch in all interactions, they offer several key advantages that are transforming customer service for the better.

1. Consistency and speed

Unlike human agents, who may vary in their responses or make mistakes, Voice AI agents provide consistent answers every time—as long as they have appropriate guardrails in place to prevent hallucinations. Without these safeguards, AI can generate misinformation and fail to meet customer needs. Properly trained AI agents, equipped with the necessary guardrails, can instantly access vast amounts of information and handle inquiries quickly, making them ideal for routine or frequently asked questions. This ensures that customers receive fast, accurate responses without the need to wait on hold or navigate through multiple layers of human interaction.

2. 24/7 availability

One of the biggest benefits of Voice AI is that it’s available around the clock. Whether it’s the middle of the night or during peak business hours, AI agents are always on, ready to assist customers whenever they need help. This is especially useful for global businesses that operate in different time zones, ensuring customers can access support without delay, no matter where they are.

3. Scalability

Another advantage of Voice AI agents is their ability to scale. Unlike human agents, who can only handle one conversation at a time, AI agents can manage thousands of interactions simultaneously. This makes them particularly valuable during busy periods, such as holidays or product launches, when call volumes can surge. Instead of overwhelming your team, Voice AI ensures that all customers receive the same high level of service, even during high-demand times.

Where human agents excel

Of course, there are still areas where human agents outperform AI. Empathy and emotional intelligence are crucial in customer service, especially when dealing with complex or sensitive issues.

While AI agents can acknowledge a user’s expressed emotions, they are limited in their ability to interpret them, and they lack the personal touch that human agents excel at. Similarly, when faced with complex problem solving that requires out-of-the-box thinking, a human agent’s creativity and judgment are often more effective.

For these reasons, many businesses find that a hybrid approach works best—using Voice AI to handle routine or straightforward tasks, while allowing human agents to focus on more complex or emotionally charged interactions. This not only ensures that customers receive the right level of support, but also frees up human agents to do what they do best: solve problems and connect with customers on a personal level.

5 big business impacts of Voice AI

Now that we’ve covered how Voice AI works and compares to human agents, let’s review the specific benefits it offers businesses. From cost savings to enhanced customer experiences, Voice AI is transforming the way companies approach customer service.

1. Increased efficiency and reduced costs

One of the most significant advantages of using Voice AI is the ability to automate routine tasks. Instead of relying on human agents to handle simple inquiries—such as checking an order status, answering frequently asked questions, or processing payments—AI agents can manage these tasks automatically. This reduces the workload for human agents, allowing them to focus on more strategic or complex issues.

As a result, businesses can operate with smaller customer service teams, reducing labor costs and maintaining high levels of service. This not only saves money, but also improves efficiency, as AI agents can handle repetitive tasks faster and more accurately than humans.

2. Scalability and availability

As mentioned earlier, Voice AI offers unmatched scalability. Whether you’re dealing with a sudden spike in customer inquiries or operating across multiple time zones, Voice AI agents ensure no customer is left waiting. They can manage unlimited interactions simultaneously, providing the same level of service to every customer, regardless of demand.

In addition, because Voice AI operates 24/7, businesses no longer need to worry about staffing for off-hours or paying overtime for extended support. This around-the-clock availability ensures that customers can access help whenever they need it, improving overall satisfaction and reducing wait times.

3. Enhanced customer experience

One of the most important factors in customer service is the customer experience, and Voice AI has the potential to dramatically improve it. By offering quick, consistent responses, Voice AI agents eliminate the frustration that comes with long wait times and inconsistent answers from human agents.

Voice AI can also deliver personalized interactions by integrating with CRM systems and accessing customer data. This allows the AI to address customers by name, understand their preferences, and provide tailored solutions based on their history with the company. The result is a more engaging and satisfying customer experience that leaves customers feeling valued and understood.

4. Multilingual support

For businesses serving customers across languages and cultures, Voice AI offers another important advantage: multilingual support. Many Voice AI systems are equipped to handle multiple languages, making it easier for companies to support customers around the world. Voice AI can detect when a customer is speaking another language and respond in that language instantly. This extraordinarily personalized service can transform your customers’ interactions and prevent many of the issues that come with language barriers. This not only improves accessibility, but also enhances the overall customer experience, as customers can interact with the company in their preferred language.

5. Improved data collection and insights

In addition to its customer-facing benefits, Voice AI also provides businesses with valuable insights into customer behavior. Because AI agents record and analyze every interaction, companies can use this data to identify trends, spot recurring issues, and improve their products or services based on customer feedback.

By understanding what customers are asking for and where they’re encountering problems, businesses can refine their customer support strategy, improve AI performance, and even identify areas where additional human intervention may be needed. This data-driven approach allows CX leaders to make more informed decisions and continuously improve their customer service operations.

Voice AI for CX is just getting started

Looking ahead, the future of Voice AI is full of exciting possibilities. As technology continues to advance, the gap between human and AI interactions will continue to narrow, with Voice AI agents becoming even more capable of handling complex and nuanced conversations.

Future developments in emotional intelligence and predictive capabilities will allow AI agents to not only understand what customers say, but also anticipate what they need before they even ask.

For CX leaders, staying ahead of these developments is crucial. Businesses that invest in a Voice AI platform will be well-positioned to reap the benefits of these advancements as they continue to evolve. By embracing Voice AI as part of a hybrid approach, companies can create a customer service model that combines the efficiency and scalability of AI with the empathy and creativity of human agents.

Voice AI is a game-changer for CX leaders

The rise of Voice AI is revolutionizing how businesses approach customer service, offering scalable, cost-effective, and personalized solutions that were once only possible with large teams of human agents. From automating routine tasks to providing real-time, tailored experiences, Voice AI is reshaping customer interactions in ways that improve efficiency and satisfaction.

For CX leaders, the time to embrace Voice AI is now. As this technology continues to evolve, it will become an even more powerful tool for delivering superior customer experiences, optimizing operational processes, and driving long-term success. By integrating Voice AI agents into your customer service strategy, you can balance the speed and consistency of AI with the empathy and creativity of human agents—building a support system that meets the ever-growing demands of today’s customers.

Voice AI is not just a trend, it’s the future of customer service. Staying ahead of the curve now will ensure your business remains competitive in a world where customer expectations continue to rise, and the need for seamless, personalized interactions becomes even more critical.

Why Even the Best Conversational AI Chatbot Will Fail Your CX

As author, speaker, and customer experience expert Dan Gingiss wrote in his book The Experience Maker, “Most companies must realize that they are no longer competing against the guy down the street or the brand that sells similar products. Instead, they’re competing with every other experience a customer has.”

That’s why so many CX leaders were (cautiously!) optimistic when Generative AI (GenAI) hit the scene, promising to provide instant, round-the-clock responses and faster issue resolutions, automate personalization at scale, and free agents to focus on more complex issues. So much so that a whopping 80% of companies worldwide now have chatbots on their websites.

Yet despite all the hype and good intentions, a recent survey showed that consumers give their chatbot experiences an average rating of 6.4/10 — which isn’t a passing grade in school, and certainly won’t cut it in business.

So why have chatbots fallen so short of company and consumer expectations? The short answer is that they’re not AI agents. Chatbots rely on rigid, rule-based systems. They struggle to understand context and adapt to complex or nuanced questions. Even the best conversational AI chatbot doesn’t have what it takes to enable CX leaders to create seamless customer journeys. This is why they so often fail at driving outcomes like revenue and CSAT.

Let’s look at the most impactful differences between these two AI for CX solutions, including why even the best conversational AI chatbots are failing CX teams and their customers — and how AI agents are changing the game.

Chatbots: First-generation AI and Intent-based Responses

AI is advancing at lightning speed, so it should come as no surprise that many vendors are having trouble keeping up. The truth is that most AI for CX tools still offer chatbots built on first-generation AI, rather than AI agents that are powered by the latest and greatest Large Language Models (LLMs).

This first-generation AI is rule-based and uses Natural Language Processing (NLP) to attempt to match users’ questions to specific, pre-defined queries and responses. In other words, CX teams must create lists of different ways users might pose the same question or request, known as “intents.” The AI does its best to determine which intent a user’s message aligns with, and then sends what has been labeled the “correct” corresponding response.
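To see why this approach is brittle, consider the following deliberately naive sketch of an intent matcher. The intents, utterances, and word-overlap scoring are invented for illustration; real first-generation systems use more sophisticated NLP, but the structural limitation is the same: exactly one predefined intent wins.

# Hypothetical intent lists: every phrasing must be anticipated up front
INTENTS = {
    "unsubscribe_newsletter": ["unsubscribe me", "stop sending the newsletter"],
    "contact_sales": ["talk to sales", "have sales call me"],
}

RESPONSES = {
    "unsubscribe_newsletter": "You have been unsubscribed from our newsletter.",
    "contact_sales": "A sales representative will reach out shortly.",
}

def match_intent(message):
    # Naive scoring: count words shared with each predefined utterance
    words = set(message.lower().split())
    best_intent, best_score = None, 0
    for intent, utterances in INTENTS.items():
        for utterance in utterances:
            score = len(words & set(utterance.split()))
            if score > best_score:
                best_intent, best_score = intent, score
    return best_intent  # only ONE intent can win, even for multi-part questions

question = "can i unsubscribe from your newsletter and have sales contact me"
print(RESPONSES.get(match_intent(question), "Sorry, I didn't understand that."))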

This approach can cause many problems that ultimately add friction to the customer journey and create frustrating brand experiences, including:

  • Intent limitations: If a user asks a multi-part question (e.g. “Can I unsubscribe from your newsletter and have sales contact me?”), the bot will recognize and answer only one intent and ignore the other, leaving the request only partially answered.
  • Rigid paths: If a user asks a question that the bot knows requires additional information, it will start the user down a rigid, predefined path to collect that information. If the user provides additional relevant details (e.g. “I would still like to receive customer-only emails”), the bot will continue to push them down this specific path before providing an answer.
    On the other hand, if the user asks an unrelated follow-up question, the bot will zero in on this new “intent” and start the user down a new path, abandoning the previous flow without resolving their original inquiry.
  • Confusing intents: There are countless ways to phrase the same request, so the likelihood of a user’s inquiry not matching a predefined intent is high (e.g. “I want you to delete my contact info!”). In this case, the bot doesn’t know what to do and must escalate to a live agent — or worse, it misunderstands the user’s intent and sends the wrong response.
  • Conflicting intents: Because similar words and phrases can appear across unrelated issues, there is often contention across predefined intents (e.g. “I accidentally unsubscribed from your newsletter.”). Even the best conversational AI chatbot is likely to match the user’s intent with the wrong response and deliver an unrelated and seemingly nonsensical answer — an issue similar to hallucinations.

Some AI for CX vendors claim their chatbots use the most advanced GenAI. However, they are really using only a fraction of an LLM’s power to generate a response from a knowledge base, rather than crafting personalized answers to specific questions. But because they still use the same outdated, intent-based process to determine the user’s request, the LLM will still struggle to generate a sufficient, appropriate response — if the issue isn’t escalated to a live agent first, that is.

AI Agents: Cutting-edge Models with Reasoning Capabilities

Top AI for CX vendors use the latest and greatest LLMs to power every step of the customer interaction, not just at the end to generate a response. This results in a much more accurate, personalized, and empathetic experience, enabling them to provide clients with AI agents — not chatbots.

Rather than relying on rigid intent classification, AI agents use LLMs to comprehend language and genuinely understand a user’s request, much like humans do. They can also contextualize the question and enrich the conversation with additional attributes accessed from other CX systems, such as a person’s location or whether they are an existing customer (more on that in this guide).

This level of reasoning is achieved through business logic, which guides the conversation flow through a series of “pre-generation checks” that happen in the background in mere seconds. These require the LLM to first answer “questions about the question” before generating a response, including if the request is in scope, sensitive in nature, about a specific product or service, or requires additional information to answer effectively.

The same process happens after the LLM has generated a response (“post-generation checks”), where the LLM must answer “questions about the answer” to ensure that it’s accurate, in context, on brand, etc. Leveraging the reasoning power of LLMs coupled with this conversational framework enables the AI agent to outperform even the best conversational AI chatbots in many key areas.
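Conceptually, that framework reduces to something like the following sketch. The check wording and the ask_llm and escalate_to_human helpers are illustrative assumptions, not Quiq's actual implementation.

def answer_with_checks(question, ask_llm, escalate_to_human):
    # Pre-generation checks: "questions about the question"
    if ask_llm(f"Yes or no: is this request out of scope or sensitive? {question}") == "yes":
        return escalate_to_human(question)
    if ask_llm(f"Yes or no: is more information needed to answer this? {question}") == "yes":
        return "Could you share a few more details about the issue?"

    draft = ask_llm(f"Answer the customer's question: {question}")

    # Post-generation checks: "questions about the answer"
    for check in ("accurate", "in context", "on brand"):
        if ask_llm(f"Yes or no: is this draft {check}? {draft}") != "yes":
            return escalate_to_human(question)
    return draft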

Providing sufficient answers to multi-part questions

Unlike a chatbot, the agent is not trying to map a specific question to a single, canned answer. Instead, it’s able to interpret the entirety of the user’s question, identify all relevant knowledge, and combine it to generate a comprehensive response that directly answers the user’s inquiry.

Dynamically answering unrelated questions and factoring in new information

AI agents will prompt users to provide additional information as needed to effectively respond to their requests. However, if the user volunteers additional information, the agent will factor this into the context of the larger conversation, rather than continuing to force them down a step-by-step path like a chatbot does. This effectively bypasses the need for many disambiguating questions.

Similarly, if a user asks an unrelated follow-up question, the agent will respond to the question without losing sight of the original inquiry, providing answers and maintaining the flow of the conversation while still collecting the information it needs to solve the original issue.

Understanding nuances

Unlike chatbots, next-gen AI agents excel at comprehending human language and picking up on nuances in user questions. Rather than having to identify a user’s intent and match it with the correct, predefined response, they can recognize that similar requests can be phrased differently, and that dissimilar questions may contain many of the same words. This allows them to flexibly understand users’ questions and identify the right knowledge to generate an accurate response without requiring an exact match.

It’s also worth noting that first-generation AI vendors often force clients to build a new chatbot for every channel: voice, SMS, Facebook Messenger, etc. Not only does this mean a lot of duplicate work for internal teams on the back end, but it can also lead to disjointed brand experiences on the front end. In contrast, next-generation AI for CX vendors allows clients to build a single agent and run it across multiple channels for a more seamless customer journey.

Is Your “Best-in-Class” AI Chatbot Killing Your Customer Journey?

Some 80% of customers say the experience a company provides is equally as important as its products and services. However, according to Gartner, more than half of large organizations have failed to unify customer engagement channels and provide a streamlined experience across them.

As you now know, even the best conversational AI chatbot will exacerbate rather than improve this issue. Our latest guide deep dives into more ways your chatbot is harming CX, from offering multi-channel-only support to measuring the wrong things, as well as the steps you can take to provide consumers with a more seamless journey. You can give it a read here!

Evolving the Voice AI Chatbot: From Bots to Voice AI Agents & Their Impact on CX Leaders

Voice AI has come a long way from its humble beginnings, evolving into a powerful tool that’s reshaping customer service. In this blog, we’ll explore how Voice AI has grown to address its early limitations, delivering impactful changes that CX leaders can no longer ignore. Learn how these advancements create better customer experiences, and why staying informed is essential to staying competitive.

The Voice AI Journey

Customer expectations have evolved rapidly, demanding faster and more personalized service. Over the years, voice interactions have transformed from rigid, rules-based voice chatbot systems to today’s sophisticated AI-driven solutions. For CX leaders, Voice AI has emerged as a crucial tool for driving service quality, streamlining operations, and meeting customer needs more effectively.

Key Concepts

Before diving into this topic, readers, especially CX leaders, should be familiar with the following key terms to better understand the technology and its impact. The following is not a comprehensive list, but should provide the background to clarify terminology and identify the key aspects that have contributed to this evolution.

Speech-enabled systems vs. chatbots vs. AI agents

  • Speech-enabled systems: Speech-enabled systems are basic tools that convert spoken language into text, but do not include advanced features like contextual understanding or decision-making capabilities.
  • Chatbots: Chatbots are systems that interact with users through text, answering questions, and completing tasks using either set rules or AI to understand user inputs.
  • AI agents: AI agents are smart conversational systems that help with complex tasks, learn from interactions, and adjust their responses to offer more personalized and relevant assistance over time.

Rules-based (previous generation) vs. Large Language Models or LLMs (next generation)

  • Previous gen: Lacks adaptability, struggles with natural language nuances, and fails to offer a personalized experience.
  • Next-gen (LLM-based): Uses LLMs to understand intent, generate responses, and evolve based on context, improving accuracy and depth of interaction.

Agent Escalation: A process in which the Voice AI system hands off the conversation to a human agent, often seamlessly.

AI Agent: A software program that autonomously performs tasks, makes decisions, and interacts with users or systems using artificial intelligence. It can learn and adapt over time to improve its performance, commonly used in customer service, automation, and data analysis.

Depending on their purpose, AI agents can be customer-facing or assist human agents by providing intelligent support during interactions. They function based on algorithms, machine learning, and natural language processing to analyze inputs, predict outcomes, and respond in real-time.

Automated Speech Recognition (ASR): The technology that enables machines to understand and process human speech. It’s a core component of Voice AI systems, helping them identify spoken words accurately.

Context Awareness: Voice AI’s ability to remember previous interactions or conversations, allowing it to maintain a flow of dialogue and provide relevant, contextually appropriate responses.

Conversational AI: Conversational AI refers to technologies that allow machines to interact naturally with users through text or speech, using tools like LLMs, NLU, speech recognition, and context awareness.

Conversation Flow: The logical structure of a conversation, including how the Voice AI chatbot guides interactions, asks follow-up questions, and handles different branches of user input.

Generative AI: A type of artificial intelligence that creates new content, such as text, images, audio, or video, by learning patterns from existing data. It uses advanced models, like LLMs, to generate outputs that resemble human-made content. Generative AI is commonly used in creative fields, automation, and problem-solving, producing original results based on the data it has been trained on.

Intent Recognition: The process by which a Voice AI system identifies the user’s goal or purpose behind their speech input. Understanding intent is critical to delivering appropriate and relevant responses.

LLMs: LLMs are sophisticated machine learning systems trained on extensive text data, enabling them to understand context, generate nuanced responses, and adapt to the conversational flow dynamically.

Machine Learning (ML): A type of AI that allows systems to automatically learn and improve from experience without being explicitly programmed. ML helps voice AI chatbots adapt and improve their responses based on user interactions.

Multimodal: The ability of a system or platform to support multiple modes of communication, allowing customers and agents to interact seamlessly across various channels.

Multi-Turn Conversations: This refers to the ability of Voice AI systems to engage in extended dialogues with users across multiple steps. Unlike simple one-question, one-response setups, multi-turn conversations handle complex interactions.

Natural Language Processing (NLP): A branch of AI that helps computers understand and interpret human language. It is the key technology behind voice and text-based AI interactions.

Omnichannel Experience: A seamless customer experience that integrates multiple channels (such as voice, text, and chat) into one unified system, allowing customers to seamlessly transition between them.

Rules-based approach: This approach uses predefined scripts and decision trees to respond to user inputs. These systems are rigid, with limited conversational abilities, and struggle to handle complex or unexpected interactions, leading to a less flexible and often frustrating user experience.

Sentiment Analysis: A feature of AI that interprets the emotional tone of a user’s input. Sentiment analysis helps Voice AI determine the customer’s mood (e.g., frustrated or satisfied) and tailor responses accordingly.

Speech Recognition / Speech-to-Text (STT): Speech Recognition, or Speech-to-Text (STT), converts spoken language into text, allowing the system to process it. It’s a key step in making voice-based AI interactions possible.

Text-to-Speech (TTS): The opposite of STT, TTS refers to the process of converting text data into spoken language, allowing digital solutions to “speak” responses back to users in natural language.

Voice AI: Voice AI is a technology that uses artificial intelligence to understand and respond to spoken language, allowing machines to have more natural and intuitive conversations with people.

Voice User Interface (VUI): Voice User Interface (VUI) is the system that enables voice-based interactions between users and machines, determining how naturally and effectively users can communicate with Voice AI systems.

The humble beginnings of rules-based voice systems

Voice AI has been nearly 20 years in the making, starting with basic rules-based systems that followed predefined scripts. These early systems could automate simple tasks, but if customers asked anything outside the programmed flow, the system fell short. It couldn’t handle natural language or adapt to the unexpected, leading to frustration for both customers and CX teams.

For CX leaders, these systems posed more challenges than solutions. Robotic interactions often required human intervention, negating the efficiency benefits. It became clear that something more flexible and intelligent was needed to truly transform customer service.

The rise of AI and speech-enabled systems

As businesses encountered the limitations of rules-based systems, the next chapter in the evolution of Voice AI introduced speech-enabled systems. These systems were a step forward, as they allowed customers to interact more naturally with technology by transcribing spoken language into text. However, while they could accurately convert speech to text, which solved one issue, they still struggled with a critical challenge: they couldn’t grasp the underlying meaning or the sentiment behind the words.

This gap led to the emergence of the first generation of AI, which represented a significant improvement over simple chatbots. This intelligence made customer interactions more helpful, but these systems still fell short of providing the seamless, human-like conversations that CX leaders envisioned. While customers could speak to AI-powered systems, the experience was often inconsistent, especially when dealing with complex queries. The advancement of AI was another improvement, but it was still limited by the rules-based logic it evolved from.

The challenge stemmed from the inherent complexity of language. People express themselves in diverse ways, using different accents, phrasing, and expressions. Language rarely follows a single, rigid pattern, which made it difficult for early speech systems to interpret accurately.

These AI systems were a huge leap in progress and created hope for CX leaders. Intelligent systems that could adapt and respond to users’ speech were powerful, but not enough to fully transform the CX world.

The AI revolution: From rules-based to next-gen LLMs

The real breakthrough came with the rise of LLMs. Unlike rigid rules-based systems, LLMs use neural networks to understand context and intent, enabling truly natural, fluid, human-like conversations. Now, AI could respond intelligently, adapt to the flow of interaction, and provide accurate answers.

For CX leaders, this was a pivotal moment. No more frustrating dead ends or rigid scripts—Voice AI became a tool that could offer context-aware services, helping businesses cut costs while enhancing customer satisfaction. The ability to deliver meaningful, efficient service marked a turning point in customer engagement.

What makes Voice AI work today?

Today’s Voice AI systems combine several advanced technologies:

  • Speech-to-Text (STT): Converts spoken language into text with high accuracy.
  • AI Intelligence: Powered by NLU and LLMs, the AI deciphers customer intent and delivers contextually relevant responses.
  • Text-to-Speech (TTS): Translates the AI’s output back into natural-sounding speech for smooth and realistic communication.

These technologies work together to enable smarter, faster service, reduce the load on human agents, and provide an intuitive customer experience.

The transformation: What changed with next-gen Voice AI?

With advancements in NLP, ML, and omnichannel integration, Voice AI has evolved into a dynamic, intelligent system capable of delivering personalized, empathetic responses. Machine Learning ensures that the system learns from every interaction, continuously improving its performance. Omnichannel integration allows Voice AI to operate seamlessly across multiple platforms, providing a unified customer experience. This is crucial for the transformation of customer service.

Rather than simply enhancing voice interactions, omnichannel solutions select the best communication channel within the same interaction, ensuring customers receive a complete answer and any necessary documentation to resolve their issue, whether via email or SMS.

For CX leaders, this transformation enables them to offer real-time, personalized service, with fewer human touchpoints and greater customer satisfaction.

The four big benefits of next-gen Voice AI for CX leaders

The rise of next-gen Voice AI from previous-gen Voice AI chatbots offers CX leaders powerful benefits, transforming how they manage customer interactions. These advancements not only enhance the customer experience, but also streamline operations and improve business efficiency.

1. Enhanced customer experience

With faster, more accurate, and context-aware responses, Voice AI can handle complex queries with ease. Customers no longer face frustrating dead ends or robotic answers. Instead, they get intelligent, conversational interactions that leave them feeling heard and understood.

2. 24/7 availability

Voice AI is always on, providing customers with support at any time, day or night. Whether it’s handling routine inquiries or resolving issues, Voice AI ensures customers are never left waiting for help. This around-the-clock service not only boosts customer satisfaction, but also reduces the strain on human agents.

3. Operational efficiency

By automating high volumes of customer interactions, Voice AI significantly reduces human intervention, cutting costs. Agents can focus on more complex tasks, while Voice AI handles repetitive, time-consuming queries—making customer service teams more productive and focused.

4. Personalization at scale

By learning from each interaction, the system can continuously improve and deliver tailored responses to individual customers, offering a more personalized experience for every user. This level of personalization, once achievable only through human agents, is now possible on a much larger scale.

However, while machine learning plays a critical role in making these advancements possible, it is not a “magical” solution. The improvements happen over time, as the system processes more data and refines its understanding. This gradual, ongoing development is what ultimately leads to highly effective and powerful outcomes in the long run.

The future of Voice AI: Next-gen experience in action

Voice AI’s future is already here, and it’s evolving faster than ever. Today’s systems are almost indistinguishable from human interactions, with conversations flowing naturally and seamlessly. But the leap forward doesn’t stop at just sounding more human—Voice AI is becoming smarter and more intuitive, capable of anticipating customer needs before they even ask. With AI-driven predictions, Voice AI can now suggest solutions, recommend next steps, and provide highly relevant information, all in real time.

Imagine a world where Voice AI understands a customer’s speech and then anticipates what is needed next. Whether it’s guiding them through a purchase, solving a complex issue, or offering personalized recommendations, technology is moving toward a future where customer interactions are smooth, proactive, and entirely customer-centric.

For CX leaders, this opens up incredible opportunities to stay ahead of customer expectations. Those adopting next-gen Voice AI now are leading the charge in customer service innovation, offering cutting-edge experiences that set them apart from competitors. And as this technology continues to evolve, it will only get more powerful, more intuitive, and more essential for delivering world-class service.

The new CX frontier with Voice AI

As Voice AI continues to evolve from the simple Voice AI chatbot of yesteryear, we are entering a new frontier in customer experience. What started as a rigid, rules-based system has transformed into a dynamic, intelligent agent capable of revolutionizing how businesses engage with their customers. For CX leaders, this new era means greater personalization, enhanced efficiency, and the ability to meet customers where they are—whether it’s through voice, chat, or other digital channels.

We’ve made significant progress in this development, but it is far from over. Voice AI is still expanding, from deeper integrations with emerging technologies to more advanced predictive capabilities that can elevate customer experiences to new heights. The future holds more exciting developments, and staying ahead will require ongoing adaptation and a willingness to embrace change.

Omnichannel capabilities are just the beginning

One fascinating capability of Voice AI is its ability to seamlessly integrate across multiple platforms, making it a truly omnichannel experience. For example, imagine you’re on a phone call with an AI agent, but due to background noise, it becomes difficult to hear. You could effortlessly switch to texting, and the conversation would pick up exactly where it left off in your text messages, without losing any context.

Similarly, if you’re on a call and need to share a photo, you can text the image to the AI agent, which can interpret the content of the photo and respond to it—all while continuing the voice conversation.

Another example of this multi-modal functionality is when you’re on a call and need to spell out something complex, like your last name. Rather than struggle to spell it verbally, you can simply text your name, and the Voice AI system will incorporate the information without disrupting the flow of the interaction. These types of seamless transitions between different modes of communication (voice, text, images) are what make multi-modal Voice AI truly revolutionary.

Voice AI’s exciting road ahead

From the original Voice AI chatbot to today, Voice AI’s evolution has already transformed the customer experience—and the future promises continued innovation. From intelligent human-like conversations to predictive capabilities that anticipate needs, Voice AI is destined to change the way businesses interact with their customers in profound ways.

The exciting thing is that this is just the beginning.

The next wave of Voice AI advancements will open up new possibilities that we can only imagine. As a CX leader, the opportunity to harness this technology and stay ahead of customer expectations is within reach. It could be the most exciting time to be at the forefront of these changes.

At Quiq, we are here to guide you through this journey. If you’re curious about our Voice AI offering, we encourage you to watch our recent webinar on how we harness this incredible technology.

One thing is for sure, though: As the landscape continues to evolve, we’ll be right alongside you, helping you adapt, innovate, and lead in this new era of customer experience. Stay tuned, because the future of Voice AI is just getting started, and we’ll continue to share insights and strategies to ensure you stay ahead in this rapidly changing world.

National Furniture Retailer Reduces Escalations to Human Agents by 33%

A well-known furniture brand faced a significant challenge in enhancing their customer experience (CX) to stand out in a competitive market. By partnering with Quiq, they implemented a custom AI Agent to transform customer interactions across multiple platforms and create more seamless journeys. This strategic move resulted in a 33% reduction in support-related escalations to human agents.

On the other end of the spectrum, the implementation of Proactive AI and a Product Recommendation engine led to the largest sales day in the company’s history through increased chat sales, showcasing the power of AI in improving efficiency and driving revenue.

Let’s dive into the furniture retailer’s challenges, how Quiq solved them using next-generation AI, the results, and what’s next for this household name in furniture and home goods.

The challenges: CX friction and missed sales opportunities

A leading name in the furniture and home goods industry, this company has long been known for its commitment to quality and affordability. Operating in a sector that is often the first to signal economic shifts, the company recognized the need to differentiate itself through exceptional customer experience.

Before adopting Quiq’s solution, the company struggled with several CX challenges that impeded their ability to capitalize on customer interactions. To start, their original chatbot used basic natural language understanding (NLU) and failed to deliver seamless and satisfactory customer journeys.

Customers experienced friction, leading to escalations and redundant conversations. The team clearly needed a robust system that could streamline operations, reduce costs, and enhance customer engagement.

So, the furniture retailer sought a solution that could not only address these inefficiencies, but also support their sales organization by effectively capturing and routing leads.

The solution: Quiq’s next-gen AI

With a focus on enhancing every touchpoint of the customer journey, the furniture company’s CX team embarked on a mission to elevate their service offerings, making CX a primary differentiator. Their pursuit led them to Quiq, a trusted technology partner poised to bring their vision to life through advanced AI and automation capabilities.

Quiq partnered with the team to develop a custom AI Agent, leveraging the natural language capabilities of Large Language Models (LLMs) to help classify sales vs. support inquiries and route them accordingly. This innovative solution enables the company to offer a more sophisticated and engaging customer experience.

The AI Agent was designed to retrieve accurate information from various systems—including the company’s CRM, product catalog, and FAQ knowledge base—ensuring customers received timely, relevant, and accurate responses.

By integrating this AI Agent into webchat, SMS, and Apple Messages for Business, the company successfully created a seamless, consistent, and faster service experience.

The AI Agent also facilitated proactive customer engagement by using a new Product Recommendation engine. This feature not only guided customers through their purchase journey, but also contributed to a significant shift in sales performance.

The results are nothing short of incredible

The implementation of the custom AI Agent by Quiq has already delivered remarkable results. One of the most significant achievements was a 33% reduction in escalations to human agents. This reduction translated to substantial operational cost savings and allowed human agents to focus on complex or high-value interactions, enhancing overall service quality.

Moreover, the introduction of Proactive AI and the Product Recommendation engine led to unprecedented sales success. The furniture retailer experienced its largest sales day for Chat Sales in the company’s history, with an impressive 10% of total daily sales attributed to this channel for the first time.

This outcome underscored the potential of AI-powered solutions in driving business growth, optimizing efficiency, and elevating customer satisfaction.

Results recap:

  • 33% reduction in escalations to human agents.
  • 10% of total daily sales attributed to chat (largest for the channel in company history).
  • Tighter, smoother CX with Proactive AI and Product Recommendations woven into customer interactions.

What’s next?

The partnership between this furniture brand and Quiq exemplifies the transformative power of AI in redefining customer experience and achieving business success. By addressing challenges with a robust AI Agent, the company not only elevated its CX offerings, but also significantly boosted its sales performance. This case study highlights the critical role of AI in modern business operations and its impact on a company’s competitive edge.

Looking ahead, the company and Quiq are committed to continuing their collaboration to explore further AI enhancements and innovations. The team plans to implement Agent Assist, followed by Voice and Email AI to further bolster seamless customer experiences across channels. This ongoing partnership promises to keep the furniture retailer at the forefront of CX excellence and business growth.

What is LLM Function Calling and How Does it Work?

For all their amazing capabilities, LLMs have a fundamental weakness: they can’t actually do anything. They read a sequence of input tokens (the prompt) and produce a sequence of output tokens (one at a time) known as the completion. There are no side effects—just inputs and outputs. So something else, such as the application you’re building, has to take the LLM’s output and do something useful with it.

But how can we get an LLM to reliably generate output that conforms to our application’s requirements? Function calls, also known as tool usages, make it easier for your application to do something useful with an LLM’s output.

Note: LLM functions and tools generally refer to the same concept. ‘Tool’ is the term used by Anthropic/Claude, whereas OpenAI uses the term function as a specific type of tool. For purposes of this article, they are used interchangeably.

What Problem Does LLM Function Calling Solve?

To better understand the problem that function calls solve, let’s pretend we’re adding a new feature to an email client that allows the user to provide shorthand instructions for an email and use an LLM to generate the subject and body:

Our application might build up a prompt request like the following GPT-4o-Mini example. Note how we ask the LLM to return a specific format expected by our application:

import os
import requests

secret = os.environ["OPENAI_API_KEY"]  # API key supplied via environment variable

user = "Kyle McIntyre"
recipient = "Aunt Suzie (suzieq@mailinator.com)"
user_input = "Tell her I can't make it this Sunday, taking dog to vet. Ask how things are going, keep it folksy yet respectful."

prompt = f"""
Draft an email on behalf of the user, {user}, to {recipient}.

Here are the user’s instructions: {user_input}

Generate a subject and body. Format your response as JSON as follows:

{{
  "subject": <email subject>,
  "body": <email body>
}}

Your response:
"""

request = {
  "model": "gpt-4o-mini-2024-07-18",
  "messages": [
    {
      "role": "user",
      "content": prompt
    }
  ],
  "response_format": {
    "type": "json_object"
  }
}

response = requests.post(
    'https://api.openai.com/v1/chat/completions',
    headers={'Authorization': f'Bearer {secret}'},  # Bearer token header, not HTTP basic auth
    json=request,
)

Assume our application sends this prompt and receives a completion back. What do we know about the completion? In a word: nothing.

Although LLMs do their best to follow our instructions, there’s no guarantee that the output will adhere to our requested schema. Subject and body could be missing, incorrectly capitalized, or perhaps be of the wrong type. Additional properties we didn’t ask for might also be included. Prior to the advent of function calls, our only options at this point were to (see the sketch after this list):

  • Continually tweak our prompts in an effort to get more reliable outputs
  • Write very tolerant deserialization and coercion logic in our app to make the LLM’s output adhere to our expectation
  • Retry the prompt multiple times until we receive valid output
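To make the second and third options concrete, here is a rough sketch of the defensive parsing this forced on applications. The call_llm helper, retry count, and key normalization are arbitrary illustrative choices.

import json

def parse_email_completion(call_llm, prompt, max_retries=3):
    for _ in range(max_retries):
        completion = call_llm(prompt)  # hypothetical helper returning the raw completion string
        try:
            data = json.loads(completion)
        except json.JSONDecodeError:
            continue  # malformed JSON: retry the whole prompt
        if not isinstance(data, dict):
            continue  # got a list or scalar instead of an object: retry
        data = {str(k).lower(): v for k, v in data.items()}  # coerce key casing
        if isinstance(data.get("subject"), str) and isinstance(data.get("body"), str):
            return {"subject": data["subject"], "body": data["body"]}  # drop extra properties
    raise ValueError("LLM never produced a valid email payload")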

Function calls, and a related model feature known as “structured outputs”, make all of this much easier and more reliable.

Function Calls to the Rescue

Let’s code up the same example using a function call. In order to get an LLM to ‘use’ a tool, you must first define it. Typically this involves giving it a name and then defining the schema of the function’s arguments.

In the example below, we define a tool named “draft_email” that takes two required arguments, body and subject, both of which are strings:

user = "Kyle McIntyre"
recipient = "Aunt Suzie (suzieq@mailinator.com)"
user_input = "Tell her I can't make it this Sunday, taking dog to vet. Ask how things are going, keep it folksy yet respectful."

prompt = f"""
Use the available function to draft an email on behalf of the user, {user}, to {recipient}.

Here are the user’s instructions: {user_input}
"""

tool = {
  "type": "function",
  "function": {
    "name": "draft_email",
    "description": "Draft an email on behalf of the user",
    "parameters": {
      "type": "object",
      "properties": {
        "subject": {
          "type": "string",
          "description": "The email subject",
        },
        "body": {
          "type": "string",
          "description": "The email body",
        }
      },
      "required": ["subject", "body"]
    }
  },
}

request = {
  "model": "gpt-4o-mini-2024-07-18",
  "messages": [
    {
      "role": "user",
      "content": prompt
    }
  ],
  "tools": [tool]
}

response = requests.post(
    'https://api.openai.com/v1/chat/completions',
    headers={'Authorization': f'Bearer {secret}'},  # same Bearer token header as above
    json=request,
)

Defining the tool required some extra work on our part, but it also simplified our prompt. We’re no longer trying to describe the shape of our expected output and instead just say “use the available function”. More importantly, we can now trust that the LLM’s output will actually adhere to our specified schema!

Let’s look at the response message we received from GPT-4o-Mini:

{
  "role": "assistant",
  "content": null,
  "tool_calls": [
    {
      "type": "function",
      "function": {
        "name": "draft_email",
        "arguments": "{\"subject\":\"Regrets for This Sunday\",\"body\":\"Hi Aunt Suzie,\\n\\nI hope this email finds you well! I wanted to let you know that I can't make it this Sunday, as I need to take the dog to the vet. \\n\\nHow have things been going with you? I always love hearing about what\u2019s new in your life.\\n\\nTake care and talk to you soon!\\n\\nBest,\\nKyle McIntyre\"}"
      }
    }
  ],
  "refusal": null
}

What we received back is really a request from the LLM to ‘call’ our function. Our application still needs to honor the function call somehow.

But now, rather than having to treat the LLM’s output as an opaque string, we can trust that the arguments adhere to our application’s requirements. The ability to define a contract and trust that the LLM’s outputs will adhere to it makes function calls an invaluable tool when integrating an LLM into an application.
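In practice, honoring the call means pulling the arguments out of the response and routing them to real application code, roughly like this. The response shape follows the chat completions API used above; populate_email_editor is a hypothetical stand-in for whatever your email client does with the draft.

import json

message = response.json()["choices"][0]["message"]
for tool_call in message.get("tool_calls", []):
    if tool_call["function"]["name"] == "draft_email":
        # The arguments are guaranteed to match the draft_email schema
        args = json.loads(tool_call["function"]["arguments"])
        populate_email_editor(subject=args["subject"], body=args["body"])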

How Does Function Calling Work?

As we saw in the last section, in order to get an LLM to generate reliable outputs we have to define a function or tool for it to use. Specifically, we’re defining a schema that the output needs to adhere to. Function calls and tools work a bit differently across various LLM vendors, but they all require the declaration of a schema and most are based on the open JsonSchema standard.

So, how does an LLM ensure that its outputs adhere to the tool schema? How can stochastic token-by-token output generation be reconciled with strict adherence to a data schema?

The solution is quite elegant: LLMs still generate their outputs one token at a time when calling a function, but the model is only allowed to choose from the subset of tokens that would keep the output in compliance with the schema. This is done through dynamic token masking based on the schema’s definition. In this way the output is still generative and very intelligent, but guaranteed to adhere to the schema.
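A highly simplified sketch of one decoding step illustrates the idea. Real implementations compile the JSON schema into a grammar and derive the mask from it; allowed_next_tokens is a hypothetical stand-in for that machinery.

import math

def constrained_decode_step(logits, schema_state):
    # Only tokens that keep the output schema-compliant remain eligible
    allowed = allowed_next_tokens(schema_state)  # hypothetical, derived from the schema
    masked = [score if token_id in allowed else -math.inf
              for token_id, score in enumerate(logits)]
    # The choice is still generative (greedy here; sampling also works),
    # but it can only ever land on a schema-legal token
    return max(range(len(masked)), key=lambda i: masked[i])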

Function Calling Misnomers and Misconceptions

The name ‘function call’ is somewhat misleading because it sounds like the LLM is going to actually do something on your behalf (and thereby cause side effects). But it doesn’t. When the LLM decides to ‘call’ a function, that just means that it’s going to generate output that represents a request to call that function. It’s still the responsibility of your application to handle that request and do something with it—but now you can trust the shape of the payload.

For this reason, an LLM function doesn’t need to map directly to any true function or method in your application, or any real API. Instead, LLM functions can (and probably should) be defined to be more conceptual from the perspective of the LLM.

Use in Agentic Workflows

So, are function calls only useful for constraining output? While that is certainly their primary purpose, they can also be quite useful in building agentic workflows. Rather than presenting a model with a single tool definition, you can instead present it with multiple tools and ask the LLM to use the tools at its disposal to help solve a problem.

For example, you might provide the LLM with the following tools in a CX context (sketched as tool definitions after the list):

  • escalate() – Escalate the conversation to a human agent for further review
  • search(query) – Search a knowledgebase for helpful information
  • emailTranscript() – Email the customer a transcript of the conversation
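Using the same tool-definition format as the draft_email example above, those three tools might be declared like this (the descriptions and parameter schemas are illustrative):

cx_tools = [
  {
    "type": "function",
    "function": {
      "name": "escalate",
      "description": "Escalate the conversation to a human agent for further review",
      "parameters": {"type": "object", "properties": {}}
    }
  },
  {
    "type": "function",
    "function": {
      "name": "search",
      "description": "Search a knowledgebase for helpful information",
      "parameters": {
        "type": "object",
        "properties": {
          "query": {"type": "string", "description": "The search query"}
        },
        "required": ["query"]
      }
    }
  },
  {
    "type": "function",
    "function": {
      "name": "emailTranscript",
      "description": "Email the customer a transcript of the conversation",
      "parameters": {"type": "object", "properties": {}}
    }
  }
]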

When using function calls in an agentic workflow, the application typically interprets the function call and somehow uses it to update the information passed to the LLM in the next turn.

It’s also worth noting that conversational LLMs can call functions and generate output messages intended for the user all at the same time. If you were building an AI DJ, the LLM might call a function like play_track("Traveler", "Chris Stapleton") while simultaneously saying to the user: “I’m spinning up one of your favorite Country tunes now”.

Function Calling in Quiq’s AI Studio

Function calling is fully supported in Quiq’s AI Studio on capable LLMs. However, AI Studio goes further than basic function call support in three key ways:

  1. The expected output shape of any prompt (the completion schema) can be visually configured in the Prompt Editor
  2. Prompt outputs aren’t just used for transient function calls but become attached to the visual flow state for inspection later in the same prompt chain or conversation
  3. Completion schemas can be configured on LLMs – even those that don’t support function calls

If you’re interested in learning more about AI Studio, please request a trial.

Why LLM Observability Matters (and Strategies for Getting it Right)

When integrating Large Language Models (LLMs) into applications, you can’t afford to treat them like “black boxes.” As your LLM application scales and becomes more complex, the need to monitor, troubleshoot, and understand how the LLM impacts your application becomes critical. In this article, we’ll explore the observability strategies we’ve found useful here at Quiq.

Key Elements of an Effective LLM Observability Strategy

  1. Provide Access: Encourage business users to engage actively in testing and optimization.
  2. Encourage Exploration: Make it easy to explore the application under different scenarios.
  3. Create Transparency: Clearly show how the model interacts within your application, revealing decision-making processes, system interactions, and how outputs are verified.
  4. Handle Errors Gracefully: Proactively identify and handle deviations or errors.
  5. Track System Performance: Expose metrics like response times, token usage, and errors.

LLMs add a layer of unpredictability and complexity to an application. Your observability tooling should allow you to actively explore both known and unknown issues while fostering an environment where engineers and business users can collaborate to create a new kind of application.

5 Strategies for LLM Observability

We will discuss strategies from the perspective of a real-world event. An “event” is anything that triggers your application to process input and provide output back to the world.

A few examples of events include:

  • Chat user message input > Chat response
  • An email arriving into a ticketing system > Suggested reply
  • A case being closed > Case updated for topic or other classifications

You may have heard these events referred to as prompt chains, prompt pipelines, agentic workflows, or conversational turns. The key takeaway: an event will often require more than a single call to an LLM. Your LLM application’s job is to orchestrate LLM prompts, data requests, decisions, and actions. The following strategies will help you understand what’s happening inside your LLM application.

1. Tracing Execution Paths

Any given event may follow different execution paths. Tracing the execution path should allow you to understand what state was set, which knowledge was retrieved, which functions were called, and, more generally, how and why the LLM generated and verified the response. The ability to trace the execution path of an event provides invaluable visibility into your application’s behavior.

For example, if your application delivers a message that offers a live agent, was it because the topic was sensitive, the user was frustrated, or there was a gap in the knowledge resources? Tracing the execution path will help you pinpoint the prompt, knowledge, or logic that drove the response. This is the first step in monitoring and optimizing an AI application. Your LLM observability tooling should provide a full trace of the execution path that led to a response being delivered.
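One lightweight way to achieve this is to emit a structured trace event at each step of the flow, all keyed by a shared trace ID. The sketch below is an illustrative pattern, not a prescribed format; the step names and fields are made up:

```python
# Illustrative execution-path tracing: every step appends a structured
# event keyed by a shared trace ID, so a response can be explained later.

import json, time, uuid

def make_tracer(trace_id=None):
    trace_id = trace_id or str(uuid.uuid4())
    events = []

    def trace(step: str, **details):
        events.append({"trace_id": trace_id, "ts": time.time(),
                       "step": step, **details})

    return trace, events

trace, events = make_tracer()
trace("state_set", sentiment="frustrated")
trace("knowledge_retrieved", source="returns-policy", chunks=3)
trace("llm_call", prompt="answer_with_policy", model="model-x")  # names illustrative
trace("response_verified", score=0.91)
trace("offer_live_agent", reason="sentiment_frustrated")

print(json.dumps(events, indent=2))  # ship these to your log/analytics store
```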

2. Replay Mechanisms for Faster Debugging

In real-world applications, being able to reproduce and fix errors quickly is critical. Implementing an event replay mechanism, where past events can be replayed against the current system configuration, provides a fast feedback loop.

Replaying events also helps when modifying prompts, upgrading models, adding knowledge, or editing business rules. Changes to your LLM application should be made in a controlled environment where you can replay events and confirm the desired effect without introducing new issues.
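In outline, a replay mechanism needs only two pieces: captured event inputs, and a way to run them through the current configuration. A minimal sketch follows, with `handle_event` as a hypothetical stand-in for your application’s event handler:

```python
# Minimal replay sketch: captured events are re-run against the current
# application configuration so changes can be validated quickly.

import json

def capture_event(store: list, event: dict):
    store.append(json.dumps(event))  # persist the raw input, verbatim

def replay_all(store: list, handle_event, config):
    results = []
    for raw in store:
        event = json.loads(raw)
        results.append(handle_event(event, config))  # current config, old input
    return results

# Usage: after editing a prompt or swapping models, replay and diff.
# before = replay_all(store, handle_event, old_config)
# after  = replay_all(store, handle_event, new_config)
```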

3. State Management & Monitoring

Another key aspect of LLM observability is capturing how your application’s field values or state change during an event, as well as across related events, such as a conversation. Understanding the state of different variables can help you better understand and recreate the results of your LLM application.

Many use cases will also make use of memory. You should strive to manage this memory consistently and use caching for order or product info to reduce unnecessary network calls. In addition to data caches, multi-turn conversations may react differently based on the memory state. Suppose a user types “I need help” and you have implemented a next-best-action classifier with the following options:

  • Clarify the inquiry
  • Find Information
  • Escalate to live agent

The action taken may depend on whether “I need help” is the 1st or 5th message of the conversation. The response could also depend on whether the inquiry type is something you want your live agents handling.

The key takeaway: LLMs introduce a new kind of intelligence, but you’ll still need to manage state and domain-specific logic to ensure your application is aware of its context. Clear visibility into the state of your application, and your ability to reproduce it, are vital parts of your observability strategy.
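To make that concrete, here is a sketch of domain-specific logic layered over a hypothetical classifier, where the chosen action depends on conversation state (the inquiry types and the five-turn threshold are made up for illustration):

```python
# Illustrative next-best-action logic: the same classifier options are
# weighed against conversation state (all thresholds are made up).

def next_best_action(message: str, turn_number: int, inquiry_type: str,
                     classify) -> str:
    candidate = classify(message)  # hypothetical LLM classifier:
                                   # "clarify" | "find_info" | "escalate"
    if candidate == "escalate" and inquiry_type in {"password_reset", "order_status"}:
        return "find_info"   # we don't want live agents handling these
    if candidate == "clarify" and turn_number >= 5:
        return "escalate"    # five turns in, "I need help" means escalate
    return candidate
```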

4. Claims Verification

A critical challenge with LLMs is ensuring the validity of the information they generate. Fabricated answers are commonly called hallucinations: statements invented by the LLM, usually because they make semantic sense in context.

A claims verification process provides confidence that a response is grounded, attributable, and verified by approved evidence from known knowledge or API resources. A dedicated verification model should provide a confidence score, with handling in place to align answers that fail verification. The verification process should track metrics such as the maximum, minimum, and average scores, and attribute answers to one or more resources.

For example:

  • On Verified: Define actions to take when a claim is verified. This could involve attributing the answer to one or many articles or API responses and then delivering a response to the end user.
  • On Unverified: Set workflows for unverified claims, such as retrying a prompt pipeline, aligning a corrective response, or escalating the issue to a human agent.

By integrating a claims verification model and process into your LLM application, you gain the ability to prevent hallucinations and attribute responses to known resources. This clear and traceable attribution will equip you with the information you need to field questions from stakeholders and provide insight into how you can improve your knowledge.
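Wired together, the verified/unverified handling might look like the following sketch, where `verify_claims` is a hypothetical verification-model call and the 0.8 threshold is an assumption:

```python
# Illustrative claims-verification gate. A dedicated verification model
# scores each claim against approved evidence; handling branches on score.

THRESHOLD = 0.8  # made-up confidence cutoff

def deliver_or_align(draft: str, evidence: list[str], verify_claims, retry):
    scores = verify_claims(draft, evidence)   # hypothetical NLI model call
    summary = {"min": min(scores), "max": max(scores),
               "avg": sum(scores) / len(scores)}
    if summary["min"] >= THRESHOLD:
        # On Verified: attribute the answer to its evidence and deliver.
        return {"response": draft, "sources": evidence, "metrics": summary}
    # On Unverified: retry the prompt pipeline, align a corrective
    # response, or escalate to a human agent.
    return retry(draft, evidence, summary)
```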

5. Regression Tests

After optimizing prompts, upgrading models, or introducing new knowledge, you’ll want to ensure that these changes don’t introduce new problems. Earlier, we talked about replaying events; that replay capability should be the basis for creating your test cases. You should be able to save any event as a regression test, and your test sets should run individually or in batch as part of a continuous integration pipeline.
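As a sketch of the save-as-test idea, reusing the replay pieces from earlier (the event fields and assertions are illustrative, not a prescribed schema):

```python
# Illustrative regression test built from a saved event: replay the
# captured input against the current config and assert on the outcome.

def test_order_inquiry_stays_contained(handle_event, current_config, saved_event):
    result = handle_event(saved_event, current_config)
    assert result["goal"] == "order_status_rendered"     # expected outcome
    assert result["escalated"] is False                  # no human handoff
    assert result["verification"]["min"] >= 0.8          # grounded answer

# Run individually during development, or in batch in a CI pipeline.
```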

Models are moving fast, and your LLM application will be under constant pressure to get faster, smarter, and cheaper. Test sets will give you the visibility and confidence you need to stay ahead of your competition.

Setting Performance Goals

While the above strategies are essential, it’s also important to evaluate how well your system is achieving its higher-level objectives. This is where performance goals come into play. Goals should be instrumented to track whether your application is successfully meeting the business objectives.

  • Goal Success: Measure how often your application achieves a defined objective, such as confirming an upcoming appointment, rendering an order status, or receiving positive user feedback.
  • Goal Failure: Track instances where the LLM fails to complete a task or requires human assistance.

Keep in mind that an event such as a live agent escalation could be considered a success for one type of inquiry and a failure in a different scenario. Goal instrumentation should provide a high degree of flexibility. By setting clear success and failure criteria for your application, you will be better positioned to evaluate its performance over time and identify areas for improvement.

Applying Segmentation to Home In

Segmentation is a powerful tool for diving deeper into your LLM application’s performance. By grouping conversations or events based on specific criteria, such as inquiry type, user type, or product category, you can focus your analysis on the areas that matter most to your application.

For instance, you may want to segment conversations to see if your application behaves differently on web versus mobile, or across sales versus service inquiries. You can also create more complex segments that filter interactions based on specific events, such as when an error occurred or when a specific topic category was in play. Segmentation allows you to tailor your observability efforts to the use cases and specific needs of your business.

Using Funnels for Conversion and Performance Insights

Funnels provide another layer of insight by showing how users progress through a series of steps within a customer journey or conversation. A funnel allows you to visualize drop-offs, identify where users disengage, and track how many complete the intended goal. For example, you can track the steps a customer takes when engaging with your LLM application, from initial inquiry to task completion, and analyze where drop-offs occur.

Funnels can be segmented just like other data, allowing you to drill down by platform, customer type, or interaction type. This helps you understand where improvements are needed and how adjustments to prompts or knowledge bases can enhance the overall experience.

By combining segmentation with funnel analysis, you get a comprehensive view of your LLM’s effectiveness and can pinpoint specific areas for optimization.

A/B Testing for Continuous Improvement

A/B testing is a vital tool for systematically improving LLM application performance by comparing different versions of prompts, responses, or workflows. This method allows you to experiment with variations of the same interaction and measure which version produces better results. For instance, you can test two different prompts to see which one leads to more successful goal completions or fewer errors.

By running A/B tests, you can refine your prompt design, optimize the LLM’s decision-making logic, and improve overall user experience. The results of these tests give you data-backed insights, helping you implement changes with confidence that they’ll positively impact performance.

Additionally, A/B testing can be combined with funnel analysis, allowing you to track how changes affect customer behavior at each step of the journey. This ensures that your optimizations not only improve specific interactions but also lead to better conversion rates and task completions overall.

Final Thoughts on LLM Observability

LLM observability is not just a technical necessity but a strategic advantage. Whether you’re dealing with prompt optimization, function call validation, or auditing sensitive interactions, observability helps you maintain control over the outputs of your LLM application. By leveraging tools such as event debug-replay, regression tests, segmentation, funnel analysis, A/B testing, and claims verification, you will build trust that you have a safe and effective LLM application.

Curious about how Quiq approaches LLM observability? Get in touch with us.

Everything You Need to Know About LLM Integration

It’s hard to imagine an application, website or workflow that wouldn’t benefit in some way from the new electricity that is generative AI. But what does it look like to integrate an LLM into an application? Is it just a matter of hitting a REST API with some basic auth credentials, or is there more to it than that?

In this article, we’ll enumerate the things you should consider when planning an LLM integration.

Why Integrate an LLM?

At first glance, it might not seem like LLMs make sense for your application—and maybe they don’t. After all, is the ability to write a compelling poem about a lost Highland Cow named Bo actually useful in your context? Or perhaps you’re not working on anything that remotely resembles a chatbot. Do LLMs still make sense?

The important thing to know about ‘Generative AI’ is that it’s not just about generating creative content like poems or chat responses. Generative AI (LLMs) can be used to solve a bevy of other problems that roughly fall into three categories:

  1. Making decisions (classification)
  2. Transforming data
  3. Extracting information

Let’s use the example of an inbound email from a customer to your business. How might we use LLMs to streamline that experience?

  • Making Decisions
    • Is this email relevant to the business?
    • Is this email low, medium or high priority?
    • Does this email contain inappropriate content?
    • What person or department should this email be routed to?
  • Transforming data
    • Summarize the email for human handoff or record keeping
    • Redact offensive language from the email subject and body
  • Extracting information
    • Extract information such as a phone number, business name, or job title from the email body to be used by other systems
  • Generating Responses
    • Generate a personalized, contextually-aware auto-response informing the customer that help is on the way
    • Alternatively, deploy a more sophisticated LLM flow (likely involving RAG) to directly address the customer’s need

It’s easy to see how solving these tasks would increase user satisfaction while also improving operational efficiency. All of these use cases utilize ‘Generative AI’, but some feel more generative than others.

When we consider decision making, data transformation and information extraction in addition to the more stereotypical generative AI use cases, it becomes harder to imagine a system that wouldn’t benefit from an LLM integration. Why? Because nearly all systems have some amount of human-generated ‘natural’ data (like text) that is no longer opaque in the age of LLMs.

Prior to LLMs, it was possible to solve most of the tasks listed above, but it was far harder. Consider ‘is this email relevant to the business?’ What would it have taken to solve this before LLMs?

  • A dataset of example emails labeled true if they’re relevant to the business and false if not (the bigger the better)
  • A training pipeline to produce a custom machine learning model for this task
  • Specialized hardware or cloud resources for training & inferencing
  • Data scientists, data curators, and Ops people to make it all happen

LLMs can solve many of these problems with radically lower effort and complexity, and they will often do a better job. With traditional machine learning models, your model is, at best, as good as the data you give it. With generative AI you can coach and refine the LLM’s behavior until it matches what you desire – regardless of historical data.

For these reasons LLMs are being deployed everywhere—and consumers’ expectations continue to rise.

How Do You Feel About LLM Vendor Lock-In?

Once you’ve decided to pursue an LLM integration, the first issue to consider is whether you’re comfortable with vendor lock-in. The LLM market is moving at lightspeed with the constant release of new models featuring new capabilities like function calls, multimodal prompting, and of course increased intelligence at higher speeds. Simultaneously, costs are plummeting. For this reason, it’s likely that your preferred LLM vendor today may not be your preferred vendor tomorrow.

Even at a fixed point in time, you may need more than a single LLM vendor.

In our recent experience, there are certain classification problems that Anthropic’s Claude handles better than comparable models from OpenAI. Similarly, we often prefer OpenAI models for truly generative tasks like generating responses. All of these LLM tasks might be in support of the same integration, so you may want to look at the project not so much as integrating a single LLM or vendor, but rather as integrating a suite of tools.

If your use case is simple and low volume, a single vendor is probably fine. But if you plan to do anything moderately complex or high-scale, you should plan on integrating multiple LLM vendors so you have access to the right models at the best price.

Resiliency & Scalability are Earned—Not Given

Making API calls to an LLM is trivial. Ensuring that your LLM integration is resilient and scalable requires more elbow grease. In fact, LLM API integrations pose unique challenges:

  • Challenge: They are pretty slow. Solution: If your application is high-scale and you’re doing synchronous (threaded) network calls, it won’t scale well, since most threads will be blocked on LLM calls. Consider switching to async I/O. You’ll also want to support running multiple prompts in parallel to reduce visible latency for the user.
  • Challenge: They are throttled by requests per minute and tokens per minute. Solution: Estimate your LLM usage in terms of requests and tokens per minute, and work with your provider(s) to ensure sufficient bandwidth for peak load.
  • Challenge: They are (still) kinda flaky, with unpredictable response times and unresponsive connections. Solution: Employ retry schemes in response to timeouts, 500s, 429s (rate limits), and the like.
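As a condensed sketch of the async and retry remediations, here is one way they might fit together; the `post_llm_request` coroutine is a hypothetical stand-in for your vendor’s async API, and the status codes and backoff parameters are illustrative:

```python
# Illustrative async LLM call with retry/backoff for timeouts, 500s, and
# 429s. `post_llm_request` is a hypothetical stand-in for a vendor SDK.

import asyncio, random

RETRYABLE = {429, 500, 502, 503}

async def call_with_retries(post_llm_request, payload, attempts=4):
    for attempt in range(attempts):
        try:
            status, body = await post_llm_request(payload)
            if status == 200:
                return body
            if status not in RETRYABLE:
                raise RuntimeError(f"non-retryable status {status}")
        except asyncio.TimeoutError:
            pass  # treat timeouts like retryable failures
        # Exponential backoff with jitter before the next attempt.
        await asyncio.sleep((2 ** attempt) + random.random())
    raise RuntimeError("LLM call failed after retries")

async def run_parallel(post_llm_request, payloads):
    # Run multiple prompts concurrently to reduce visible latency.
    return await asyncio.gather(
        *(call_with_retries(post_llm_request, p) for p in payloads))
```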

The above remediations will help your application be scalable and resilient while your LLM service is up. But what if it’s down? If your LLM integration is on a critical execution path you’ll want to support automatic failover. Some LLMs are available from multiple providers:

  • OpenAI models are hosted by OpenAI itself as well as Azure
  • Anthropic models are hosted by Anthropic itself as well as AWS

Whether an LLM has a single provider or several, you can also provision the same logical LLM in multiple cloud regions to create a failover resource. Typically you’ll want provider failover built into your retry scheme. Our failover mechanisms are tripped regularly in production at Quiq, no doubt partly because of how rapidly the AI world is moving.
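Building provider failover into the retry scheme can be as simple as an ordered list of equivalent endpoints. The sketch below extends the retry helper above; the provider callables are hypothetical stand-ins for the multi-provider hosting options just described:

```python
# Illustrative provider failover: the same logical LLM is provisioned
# with multiple providers/regions, tried in order until one succeeds.

async def call_with_failover(payload, providers):
    """`providers` is an ordered list of async callables, e.g. the same
    model hosted by its vendor, a cloud reseller, or another region."""
    last_error = None
    for provider in providers:
        try:
            return await call_with_retries(provider, payload)  # retry helper above
        except RuntimeError as err:
            last_error = err   # this provider is down/exhausted; fail over
    raise RuntimeError(f"all providers failed: {last_error}")
```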

Are You Actually Building an Agentic Workflow?

Oftentimes you have a task that you know is well-suited for an LLM. For example, let’s say you’re planning to use an LLM to analyze the sentiment of product reviews. On the surface, this seems like a simple task that will require one LLM call that passes in the product review and asks the LLM to decide the sentiment. Will a single prompt suffice? What if we also want to determine if a given review contains profanity or personal information? What if we want to ask three LLMs and average their results?

Many tasks require multiple prompts, prompt chaining, and possibly RAG (Retrieval Augmented Generation) to solve well. Just like humans, AI produces better results when a problem is broken down into pieces. Such solutions are variously known as AI Agents, Agentic Workflows, or Agent Networks, and they are why open-source tools like LangChain were originally developed.

In our experience, pretty much every prompt eventually grows up to be an Agentic Workflow, which has interesting implications for how it’s configured & monitored.

Be Ready for the Snowball Effect

Introducing LLMs can result in a technological snowball effect, particularly if you need to use Retrieval Augmented Generation (RAG). LLMs are trained on mostly public data that was available at a fixed point in the past. If you want an LLM to behave in light of up-to-date and/or proprietary data sources (which most non-trivial applications do) you’ll need to do RAG.

RAG refers to retrieving the up-to-date and/or proprietary data you want the LLM to use in its decision making and passing it to the LLM as part of your prompt.

Assuming you need to search a reference dataset like a knowledge base, product catalog or product manual, the retrieval part of RAG typically entails adding the following entities to your system:

1. An embedding model

An embedding model is roughly half of an LLM: it does a great job of reading and understanding information you pass it, but instead of generating a completion, it produces a numeric vector that encodes its understanding of the source material.

You’ll typically run the embedding model on all of the business data you want to search and retrieve for the LLM. Most LLM providers also offer embedding models, or you can access one via any major cloud.
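In outline, the indexing pass is a batch job over your business data. In this sketch, `embed` is a hypothetical client for whichever embedding model you choose:

```python
# Illustrative offline indexing pass: run the embedding model over every
# searchable business document and store the vectors alongside the text.

def build_index(documents: list[str], embed) -> list[dict]:
    """`embed(text) -> list[float]` is a hypothetical embedding client."""
    index = []
    for doc in documents:
        index.append({"text": doc, "vector": embed(doc)})
    return index

# Typically re-run whenever the knowledge base or product catalog changes.
```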

2. A vector database

Once you have embeddings for all of your business data, you need to store them somewhere that supports fast search over numeric vectors. Solutions like Pinecone and Milvus fill this need, but that means integrating a new vendor or hosting a new database internally.
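Under the hood, vector search is a nearest-neighbor lookup. Here is a toy in-memory version using cosine similarity over the index built above; a real vector database adds approximate indexing, filtering, and scale:

```python
# Toy vector search: cosine similarity over the in-memory index built
# above. Vector databases do this at scale with approximate indexes.

import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(y * y for y in b)))

def retrieve(query: str, index: list[dict], embed, k: int = 3) -> list[str]:
    qvec = embed(query)  # same hypothetical embedding client as above
    ranked = sorted(index, key=lambda row: cosine(qvec, row["vector"]),
                    reverse=True)
    return [row["text"] for row in ranked[:k]]  # goes into the LLM prompt
```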

After implementing embeddings and a vector search solution, you can now retrieve information to include in the prompts you send to your LLM(s). But how can you trust that the LLM’s response is grounded in the information you provided and not something based on stale information or purely made up?

There are specialized deep learning models that exist solely to verify that an LLM’s generative claims are grounded in the facts you provide. This practice is variously referred to as hallucination detection, claim verification, or natural language inference (NLI). We believe NLI models are an essential part of a trustworthy RAG pipeline, but managed cloud solutions are scarce, and you may need to host one yourself on GPU-enabled hardware.

Is a Black Box Sustainable?

If you bake your LLM integration directly into your app, you will effectively end up with a black box that can only be understood and improved by engineers. This could make sense if you have a decent-sized software shop and engineers are the only people likely to monitor or maintain the integration.

However, your best software engineers may not be your best (or most willing) prompt engineers, and you may wish to involve other personas like product and experience designers since an LLM’s output is often part of your application’s presentation layer & brand.

For these reasons, prompts will quickly need to move from code to configuration – no big deal. However, as an LLM integration matures it will likely become an Agentic Workflow involving:

  • More prompts, prompt parallelization & chaining
  • More prompt engineering
  • RAG and other orchestration

Moving these concerns into configuration is significantly more complex but necessary on larger projects. In addition, people will inevitably want to observe and understand the behavior of the integration to some degree.

For this reason it might make sense to embrace a visual framework for developing Agentic Workflows from the get-go. By doing so, you open up the project to collaboration with non-engineers while promoting observability into the integration. If you don’t go this route, be prepared to continually build out configurability and observability tools on the side.

Quiq’s AI Automations Take Care of LLM Integration Headaches For You

Hopefully we’ve given you a sense of what it takes to build an enterprise LLM integration. Now it’s time for the plug: the considerations outlined above are exactly why we built AI Studio, and particularly our AI Automations product.

With AI Automations, you can create a serverless API that handles all the complexities of a fully orchestrated AI flow, including support for multiple LLMs, chaining, RAG, resiliency, observability, and more. With AI Automations, your LLM integration can go back to being ‘just an API call with basic auth’.

Want to learn more? Dive into AI Studio or reach out to our team.

Request A Demo

How a Leading Office Supply Retailer Answered 35% More Store Associate Questions with Generative AI

In an era where artificial intelligence is rapidly transforming various industries, the retail sector is no exception. One leading national office supply retailer has taken a bold step forward, harnessing the power of generative AI to revolutionize their in-store experience and empower their associates.

This innovative approach has not only enhanced customer satisfaction but has also led to remarkable improvements in employee efficiency. In fact, the company has experienced a 35% increase in containment rates (with a 6-month average containment rate of 65%) vs. its legacy solution.

We’re excited to share the details of this groundbreaking initiative. Keep reading as we examine the company’s vision, their strategic approach to implementation, and the key objectives that drove their AI adoption. We’ll also discuss their GenAI assistant’s primary capabilities and how it’s improving both customer experiences and employee satisfaction. By the end, you’ll see how much potential lies in applying this use case to additional employees—not just in-store associates—as well as customers. There’s so much to unlock. Ready? Let’s dive in.

The Vision: Empowering Associates with GenAI

This company is dedicated to helping businesses of all sizes become more productive, connected, and inspired. Their team recognized the immense potential of GenAI early on. The vision? To create a GenAI-powered assistant that could enhance the capabilities of their store associates, leading to improved customer service, increased productivity, and higher job satisfaction.

Key objectives of the GenAI initiative:

  • Simplify store associate experience
  • Streamline access to information for associates
  • Improve customer service efficiency
  • Boost associate confidence and job satisfaction
  • Increase overall store associate productivity

Charting the Course to Building a GenAI-Powered Assistant

By partnering with Quiq, the national office supply retailer launched its employee-facing GenAI assistant in just 6 weeks. Here’s what the launch process looked like in 9 primary steps:

  1. Discover AI enhancement opportunities
  2. Pull content from current systems
  3. Run a proof of concept with the Quiq team
  4. Run testing through all categories of content
  5. Gain approval to pilot with a top associate group
  6. Refine content based on associate feedback for chain rollout
  7. Run additional testing through all categories
  8. Start chain deployment to a larger district of stores
  9. Maintain content accuracy and refine based on updates

Examining the Office Supplier’s Phased Approach to Adoption

Pre-launch, the teams worked together to ensure all content was updated and accurate. Then they launched a phased testing approach, going through several rounds of iterative testing. After that, the retailer shared the GenAI assistant with a top internal associate team to test it and try to break it. Finally, the internal team used that top associate group to build excitement before launch.

At launch, the office supplier created a standalone page dedicated to the assistant and launched a SharePoint site to share updates with the internal team. They also facilitated internal learning sessions and adapted quickly when feedback volume was low. Last but not least, the team made it fun by branding the assistant with a playful, on-brand name and personality.

Post-launch, the retailer includes the AI assistant in all communications to associates, with tips on what to search for in the assistant. They also leverage the assistant’s proactive messaging capabilities to build excitement for new launches, promotions, and best practices.

Primary Capabilities and Focus

Launching the GenAI assistant has been transformative because it is trained on all things related to the office supply retailer, which has simplified and accelerated access to information. That means associates can help customers faster, answering questions accurately the first time, every time, regardless of tenure. Ultimately, AI is empowering associates to do even better work, including enhanced cross-selling and upselling with proactive messages.

Proactive messaging to associates helps keep rotating sales goals top of mind so they can weave additional revenue opportunities into customer interactions. For example, if the design services team has unexpected bandwidth, the AI assistant can send a message letting associates know, inspiring them to highlight design and print services to customers who may be interested. It also provides a fun countdown to important launches, like back-to-school season, and “fun facts” that help build up useful knowledge over time. It’s like bite-size bits of training.

GenAI Transforms the In-Store Experience in 4 Critical Ways

Implementing the GenAI assistant has had a profound impact on in-store operations. By providing associates with instant access to accurate information, it has:

  1. Enhanced Customer Service: Associates can now provide faster, more accurate responses to customer questions.
  2. Increased Efficiency: The time it takes to find information has been significantly reduced, allowing associates to serve more customers.
  3. Boosted Confidence: With a reliable AI assistant at their fingertips, associates feel more empowered in their roles. Plus, new associates can be as effective as experienced ones with the assistant by their side.
  4. Improved Job Satisfaction: The reduced stress of information retrieval has led to higher job satisfaction among associates. Not to mention, the GenAI assistant is there to converse and empathize with employees who experience stressful situations with customers.

Results + What’s Next?

As a result of launching its GenAI assistant with Quiq, our national office supply retailer customer has realized a:

  • 68% self-service resolution rate, allowing associates to get immediate answers to their questions 2 out of 3 times
  • 4.82 out of 5 associate satisfaction score with the AI assistant

And as for next steps, the team is excited to:

  • Launch an assisted selling path
  • Expand to additional departments within stores
  • Add more devices in store for easier accessibility
  • Integrate with internal systems to be able to answer even more types of questions with real-time access to orders and other information

The Lesson: Humans and AI Can Work Together to Play Their Strongest Roles

The office supply retailer’s successful implementation of GenAI serves as a powerful example of how the technology can transform retail operations by helping human employees work more efficiently. By focusing on empowering associates with AI, the company has not only improved customer service but also enhanced employee satisfaction and productivity.

Interested in Diving Deeper into GenAI?

Download Two Truths and a Lie: Breaking Down the Major GenAI Misconceptions Holding CX Leaders Back. This comprehensive guide illuminates the path through the intricate landscape of generative AI in CX. We cut through the fog of misconceptions, offering crystal-clear, practical advice to empower your decision-making.