
Does GenAI Leak Your Sensitive Data? Exposing Common AI Misconceptions (Part Three)

This is the final post in a three-part series clarifying the biggest misconceptions holding CX leaders like you back from integrating GenAI into their CX strategies. Our goal? To assuage your fears and help you start getting real about adding an AI Assistant to your contact center — all in a fun “two truths and a lie” format.

There are few faux pas as damaging and embarrassing for brands as sensitive data getting into the wrong hands. So it makes sense that data security concerns are a major deterrent for CX leaders thinking about getting started with GenAI.

In the first post of our AI Misconceptions series, we discussed why your data is definitely good enough to make GenAI work for your business. Next, we explored the different types of hallucinations that CX leaders should be aware of, and how they are 100% preventable with the right guardrails in place.

Now, let’s wrap up our series by exposing the truth about GenAI potentially leaking your company or customer data.

Misconception #3: “GenAI inadvertently leaks sensitive data.”

As we discussed in part one, AI needs training data to work. One way to collect that data is from the questions users ask. For example, if a large language model (LLM) is asked to summarize a paragraph of text, that text could be stored and used to train future models.

Unfortunately, there have been some famous examples of companies’ sensitive information becoming part of datasets used to train LLMs — take Samsung, for instance. Because of this, CX leaders often fear that using GenAI will result in their company’s proprietary data being disclosed when users interact with these models.

Truth #1: Public GenAI tools use conversation data to train their models.

Tools like OpenAI’s ChatGPT and Google Gemini (formerly Bard) are public-facing and often free, in large part because user interactions provide valuable training data. This means that any information users enter while using these tools is fair game to be used for training future models.

This is precisely how the Samsung data leak happened. The company’s semiconductor division allowed its engineers to use ChatGPT to check their source code. Not only did multiple employees copy/paste confidential code into ChatGPT, but one team member even used the tool to transcribe a recording of an internal-only meeting!

Truth #2: Properly licensed GenAI is safe.

People often confuse ChatGPT, the application or web portal, with the LLM behind it. While the free version of ChatGPT collects conversation data, OpenAI offers an enterprise LLM that does not. Other LLM providers offer similar enterprise licenses that specify that all interactions with the LLM and any data provided will not be stored or used for training purposes.
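To make the distinction concrete, here's a minimal sketch of calling a commercially licensed LLM through its API, using the OpenAI Python SDK purely as an illustration (the model name and prompts are placeholders). Note that the no-training-on-your-data guarantee comes from the provider's API and enterprise terms, not from anything you configure in the code itself.

```python
# Minimal sketch: calling a commercially licensed LLM through its API, using
# the OpenAI Python SDK as an illustrative example. The "no training on your
# data" guarantee lives in the provider's API/enterprise terms, not in a flag
# you set here. Model name and prompts are placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder; use whichever model your agreement covers
    messages=[
        {"role": "system", "content": "You are a customer support assistant."},
        {"role": "user", "content": "What is your return policy for electronics?"},
    ],
)

print(response.choices[0].message.content)
```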

Providers offering these enterprise licenses also typically maintain SOC 2 (System and Organization Controls 2) compliance. This means they undergo regular audits from third parties to prove that they have the processes and procedures in place to protect companies’ proprietary data and customers’ personally identifiable information (PII).

The Lie: Enterprises must use only internally developed models to protect their data.

Given these concerns over data leaks and hallucinations, some organizations believe that the only safe way to use GenAI is to build their own AI models. Case in point: Samsung is now “considering building its own internal AI chatbot to prevent future embarrassing mishaps.”

However, it’s simply not feasible for companies whose core business is not AI to build AI that is as powerful as commercially available LLMs — even if the company is as big and successful as Samsung. Not to mention the opportunity cost and risk of having your technical resources tied up in AI instead of continuing to innovate on your core business.

It’s estimated that training the LLM behind ChatGPT cost upwards of $4 million. It also required specialized supercomputers and access to a data set equivalent to nearly the entire Internet. And don’t forget about maintenance: AI startup Hugging Face recently revealed that retraining its Bloom LLM cost around $10 million.


Using a commercially available LLM provides enterprises with the most powerful AI available without breaking the bank, and it's perfectly safe when properly licensed. However, it's also important to remember that building a successful AI Assistant requires much more than developing basic question/answer functionality.

Finding a Conversational CX Platform that harnesses an enterprise-licensed LLM, empowers teams to build complex conversation flows, and makes it easy to monitor and measure Assistant performance is a CX leader's safest bet. Not to mention, your engineering team will thank you for giving them the control and visibility they want, without the risk and overhead of building it themselves!

Feel Secure About GenAI Data Security

Companies that use free, public-facing GenAI tools should be aware that any information employees enter can (and most likely will) be used for future model-training purposes.

However, properly licensed GenAI will not collect your data or use it to train the model. Building your own GenAI tools for security purposes is completely unnecessary, and very expensive!

Want to read more or revisit the first two misconceptions in our series? Check out our full guide, Two Truths and a Lie: Breaking Down the Major GenAI Misconceptions Holding CX Leaders Back.

Will GenAI Hallucinate and Hurt Your Brand? Exposing Common AI Misconceptions (Part Two)

This is the second post in a three-part series clarifying the biggest misconceptions holding CX leaders like you back from integrating GenAI into their CX strategies. Our goal? To assuage your fears and help you start getting real about adding an AI Assistant to your contact center — all in a fun “two truths and a lie” format.

Did you know that the Golden Gate Bridge was transported for the second time across Egypt in October of 2016?

Or that the world record for crossing the English Channel entirely on foot is held by Christof Wandratsch of Germany, who completed the crossing in 14 hours and 51 minutes on August 14, 2020?

Probably not, because GenAI made these “facts” up. They’re called hallucinations, and AI hallucination misconceptions are holding a lot of CX leaders back from getting started with GenAI.

In the first post of our AI Misconceptions series, we discussed why your data is definitely good enough to make GenAI work for your business. In fact, you actually need a lot less data to get started with an AI Assistant than you probably think.

Now, we’re debunking AI hallucination myths and separating some of the biggest AI hallucination facts from fiction. Could adding an AI Assistant to your contact center put your brand at risk? Let’s find out.

Misconception #2: “GenAI will hallucinate and hurt my brand.”

While the example hallucinations provided above are harmless and even a little funny, this isn’t always the case. Unfortunately, there are many examples of times chatbots have cussed out customers or made racist or sexist remarks. This causes a lot of concern among CX leaders looking to use an AI Assistant to represent their brand.

Truth #1: Hallucinations are real (no pun intended).

Understanding AI hallucinations hinges on realizing that GenAI wants to provide answers — whether or not it has the right data. Hallucinations like those in the examples above occur for two common reasons.

AI-Induced Hallucinations Explained:

  1. The large language model (LLM) simply does not have the correct information it needs to answer a given question. This is what causes GenAI to get overly creative and start making up stories that it presents as truth.
  2. The LLM has been given an overly broad and/or contradictory dataset. In other words, the model gets confused and begins to draw conclusions that are not directly supported in the data, much like a human would do if they were inundated with irrelevant and conflicting information on a particular topic.

Truth #2: There’s more than one type of hallucination.

Contrary to popular belief, hallucinations aren’t just incorrect answers: They can also be classified as correct answers to the wrong questions. And these types of hallucinations are actually more common and more difficult to control.

For example, imagine a company’s AI Assistant is asked to help troubleshoot a problem that a customer is having with their TV. The Assistant could give the customer correct troubleshooting instructions — but for the wrong television model. In this case, GenAI isn’t wrong, it just didn’t fully understand the context of the question.


The Lie: There’s no way to prevent your AI Assistant from hallucinating.

Many GenAI “bot” vendors attempt to fine-tune an LLM, connect clients’ knowledge bases, and then trust it to generate responses to their customers’ questions. This approach will always result in hallucinations. A common workaround is to pre-program “canned” responses to specific questions. However, this leads to unhelpful and unnatural-sounding answers even to basic questions, which then wind up being escalated to live agents.

In contrast, true AI Assistants powered by the latest Conversational CX Platforms leverage LLMs as a tool to understand and generate language — but there’s a lot more going on under the hood.

First of all, preventing hallucinations is not just a technical task. It requires a layer of business logic that controls the flow of the conversation by providing a framework for how the Assistant should respond to users’ questions.

This framework guides a user down a specific path that enables the Assistant to gather the information the LLM needs to give the right answer to the right question. This is very similar to how you would train a human agent to ask a specific series of questions before diagnosing an issue and offering a solution. Meanwhile, in addition to identifying the intent behind the customer's question, the LLM can be used to extract additional information from it.

Referred to as “pre-generation checks,” these filters are used to determine attributes such as whether the question was from an existing customer or prospect, which of the company’s products or services the question is about, and more. These checks happen in the background in mere seconds and can be used to select the right information to answer the question. Only once the Assistant understands the context of the client’s question and knows that it’s within scope of what it’s allowed to talk about does it ask the LLM to craft a response.
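To make that concrete, here's a minimal sketch of what pre-generation checks might look like in code. The allowed topics, attribute names, and the `llm_classify`, `generate_answer`, and `escalate` helpers are hypothetical stand-ins rather than any particular platform's API; the point is simply that the Assistant decides whether and how to respond before a single word is generated.

```python
# Hypothetical sketch of pre-generation checks: classify the question first,
# and only proceed to answer generation if it's in scope and fully understood.
from dataclasses import dataclass
from typing import Optional

ALLOWED_TOPICS = {"tv-troubleshooting", "billing", "returns"}  # example scope only

@dataclass
class PreGenResult:
    intent: str             # e.g. "troubleshoot-tv"
    customer_type: str      # "existing" or "prospect"
    product: Optional[str]  # which product the question is about, if known
    in_scope: bool

def pre_generation_checks(question, llm_classify) -> PreGenResult:
    """Classify the question before any answer is generated.

    `llm_classify` is a hypothetical helper that asks the LLM to extract
    structured attributes (intent, customer type, product, topic) as a dict.
    """
    attrs = llm_classify(question)
    return PreGenResult(
        intent=attrs["intent"],
        customer_type=attrs["customer_type"],
        product=attrs.get("product"),
        in_scope=attrs["topic"] in ALLOWED_TOPICS,
    )

def handle(question, llm_classify, generate_answer, escalate):
    checks = pre_generation_checks(question, llm_classify)
    if not checks.in_scope:
        return escalate(question)               # hand off instead of guessing
    if checks.product is None:
        return "Which TV model are you using?"  # gather missing context first
    return generate_answer(question, checks)    # only now ask the LLM to respond
```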

But the checks and balances don’t end there: The LLM is only allowed to generate responses using information from specific, trusted sources that have been pre-approved, and not from the dataset it was trained on.

In other words, humans are responsible for providing the LLM with a source of truth that it must “ground” its response in. In technical terms, this is called Retrieval Augmented Generation, or RAG — and if you want to get nerdy, you can read all about it here!
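For the technically curious, here's a minimal sketch of the RAG pattern described above. The `search_approved_sources` and `llm_complete` functions are hypothetical stand-ins for your retriever and enterprise-licensed LLM.

```python
# Minimal RAG sketch: retrieve from pre-approved sources, then instruct the
# LLM to answer ONLY from that retrieved context (the "grounding" step).
# `search_approved_sources` and `llm_complete` are hypothetical stand-ins.

def answer_with_rag(question: str, search_approved_sources, llm_complete) -> str:
    # 1. Retrieve: pull the most relevant passages from trusted, approved sources.
    passages = search_approved_sources(question, top_k=3)
    context = "\n\n".join(passages)

    # 2. Augment: place the retrieved passages into the prompt as the source of truth.
    prompt = (
        "Answer the customer's question using ONLY the context below.\n"
        "If the context does not contain the answer, say you don't know.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

    # 3. Generate: the LLM crafts the response, grounded in the approved context.
    return llm_complete(prompt)
```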

Last but not least, once a response has been crafted, a series of “post-generation checks” happens in the background before returning it to the user. You can check out the end-to-end process in the diagram below:

[Diagram: end-to-end RAG flow with pre- and post-generation checks]
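The exact post-generation checks aren't spelled out here, but as a rough sketch of the idea (the grounding and PII heuristics below are purely illustrative assumptions, not a vendor's actual checklist), such a check could look something like this:

```python
# Hypothetical post-generation checks: validate the drafted response before it
# is shown to the customer. The specific checks are crude, illustrative examples.
import re

def post_generation_checks(response: str, context: str) -> bool:
    # 1. Grounding check (crude heuristic): the response should share vocabulary
    #    with the approved context it was supposed to be grounded in.
    overlap = set(response.lower().split()) & set(context.lower().split())
    if len(overlap) < 5:
        return False  # looks like it strayed from the source of truth

    # 2. PII check: don't echo anything that looks like an email or card number.
    if re.search(r"[\w.+-]+@[\w-]+\.[\w.]+|\b\d{13,16}\b", response):
        return False

    # 3. Tone/scope checks could go here (e.g. another LLM call acting as a judge).
    return True

def deliver(response: str, context: str, escalate):
    return response if post_generation_checks(response, context) else escalate()
```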

Give Hallucination Concerns the Heave-Ho

To sum it up: Yes, hallucinations happen. In fact, there’s more than one type of hallucination that CX leaders should be aware of.

However, now that you understand the reality of AI hallucination, you know that it’s totally preventable. All you need are the proper checks, balances, and guardrails in place, both from a technical and a business logic standpoint.

Now that you’ve had your biggest misconceptions about AI hallucination debunked, keep an eye out for the next blog in our series, all about GenAI data leaks. Or, learn the truth about all three of CX leaders’ biggest GenAI misconceptions now when you download our guide, Two Truths and a Lie: Breaking Down the Major GenAI Misconceptions Holding CX Leaders Back.


Is Your CX Data Good Enough for GenAI? Exposing Common AI Misconceptions (Part One)

If you’re feeling unprepared for the impact of generative artificial intelligence (GenAI), you’re not alone. In fact, nearly 85% of CX leaders feel the same way. But the truth is that the transformative nature of this technology simply can’t be ignored — and neither can your boss, who asked you to look into it.

We’ve all heard horror stories of racist chatbots and massive data leaks ruining brands’ reputations. But we’ve also seen statistics around the massive time and cost savings brands can achieve by offloading customers’ frequently asked questions to AI Assistants. So which is it?

This is the first post in a three-part series clarifying the biggest misconceptions holding CX leaders like you back from integrating GenAI into their CX strategies. Our goal? To assuage your fears and help you start getting real about adding an AI Assistant to your contact center — all in a fun “two truths and a lie” format. Prepare to have your most common AI misconceptions debunked!

Misconception #1: “My data isn’t good enough for GenAI.”

Answering customer inquiries usually requires two types of data:

  1. Knowledge (e.g. an order return policy) and
  2. Information from internal systems (e.g. the specific details of an order), as sketched below.
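As a quick illustration of how those two types of data come together in a single answer (the `get_return_policy`, `lookup_order`, and `llm_complete` helpers below are hypothetical stand-ins), consider a customer asking whether they can return an order:

```python
# Hypothetical sketch: answering "Can I return order #1234?" requires both
# knowledge (the return policy) and internal system data (the order itself).

def answer_return_question(order_id, get_return_policy, lookup_order, llm_complete) -> str:
    policy = get_return_policy()      # knowledge, e.g. from the help center
    order = lookup_order(order_id)    # information from an internal system
    prompt = (
        f"Return policy:\n{policy}\n\n"
        f"Order details: purchased on {order['purchase_date']}, "
        f"item: {order['item']}\n\n"
        "Using only the policy and order details above, tell the customer "
        "whether this order can still be returned."
    )
    return llm_complete(prompt)
```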

It’s easy to get caught up in overthinking the impact of data quality on AI performance and wondering whether or not your knowledge is even good enough to make an AI Assistant useful for your customers.

Updating hundreds of help desk articles is no small task, let alone building an entire knowledge base from scratch. Many CX leaders worry about how much work it will take to clean up their data and whether their team has the resources to support a GenAI initiative. And for GenAI to be as effective as a human agent, it needs the same level of access to internal systems that your human agents have.

Truth #1: You have to have some amount of data.

Data is necessary to make AI work — there’s no way around it. You must provide some data for the model to access in order to generate answers. This is one of the most basic AI performance factors.

But we have good news: You need a lot less data than you think.

One of the most common myths about AI and data in CX is that you need enough data to answer every possible customer question. Instead, focus on ensuring you have the knowledge necessary to answer your most frequently asked questions. This small step forward will have a major impact for your team without requiring a ton of time and resources to get started.

Truth #2: Quality matters more than quantity.

Given the importance of relevant data, a few succinct paragraphs of accurate information are better than volumes of outdated or conflicting documentation. But even then, don't sweat the small stuff.

For example, did a product name change fail to make its way through half of your help desk articles? Are there unnecessary hyperlinks scattered throughout? Was it written for live agents versus customers?

No problem: the right Conversational CX Platform can easily address these data concerns without requiring additional support from your team.

The Lie: Your data has to be perfectly unified and specifically formatted to train an AI Assistant.

Don’t worry if your data isn’t well-organized or perfectly formatted. The reality is that most companies have services and support materials scattered across websites, knowledge bases, PDFs, .csvs, and dozens of other places — and that’s okay!

Today, the tools and technology exist to make aggregating this fragmented data a breeze, and then to cleanse and format it in a way that makes sense for a large language model (LLM) to use.

For example, if you have an agent training manual in Google Docs and a product manual in PDF, this information can be disassembled, reformatted, and rewritten by an AI-powered transformation process so that your Assistant can use it.
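As a rough sketch of that idea (the loader functions and chunk size below are placeholder assumptions, not a specific platform's pipeline), the aggregation step might look like this:

```python
# Hypothetical sketch: pull fragmented content from different formats, split it
# into LLM-friendly chunks, and hand it to the Assistant's knowledge store.
# The loader functions and chunk size are placeholder assumptions.

def chunk(text: str, max_chars: int = 1500) -> list[str]:
    """Split on paragraphs so each chunk stays coherent and fits an LLM prompt."""
    chunks, current = [], ""
    for para in text.split("\n\n"):
        if current and len(current) + len(para) > max_chars:
            chunks.append(current.strip())
            current = ""
        current += para + "\n\n"
    if current.strip():
        chunks.append(current.strip())
    return chunks

def build_knowledge_store(export_google_doc, extract_pdf_text) -> list[str]:
    """Aggregate sources, then chunk them for retrieval by the Assistant."""
    docs = [
        export_google_doc("Agent Training Manual"),  # e.g. exported as plain text
        extract_pdf_text("product_manual.pdf"),      # e.g. via a PDF parser
    ]
    all_chunks = []
    for doc in docs:
        all_chunks.extend(chunk(doc))
    return all_chunks  # ready to be indexed for retrieval by the Assistant
```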

What’s more, the data used by your AI Assistant should be consistent with the data you use to train your human agents. This means that not only is it not required to build a special repository of information for your AI Assistant to learn from, but it’s not recommended. The very best AI platforms take on the work of maintaining this continuity by automatically processing and formatting new information for your Assistant as it’s published, as well as removing any information that’s been deleted.

Put Those Data Doubts to Bed

Now you know that your data is definitely good enough for GenAI to work for your business. Yes, quality matters more than quantity, but it doesn’t have to be perfect.

The technology exists to unify and format your data so that it’s usable by an LLM. And providing knowledge around even a handful of frequently asked questions can give your team a major lift right out the gate.

Keep an eye out for the next blog in our series, all about GenAI hallucinations. Or, learn the truth about all three of CX leaders’ biggest GenAI misconceptions now when you download our guide, Two Truths and a Lie: Breaking Down the Major GenAI Misconceptions Holding CX Leaders Back.
