Semi-Supervised Learning Explained (With Examples)

From movie recommendations to chatbots as customer service reps, it seems like machine learning (ML) is absolutely everywhere. But one thing you may not realize is just how much data is required to train these advanced systems, and how much time and energy goes into formatting that data appropriately.

Machine learning engineers have developed many ways of trying to cut down on this bottleneck, and one of the techniques that have emerged from these efforts is semi-supervised learning.

Today, we’re going to discuss semi-supervised learning, how it works, and where it’s being applied.

What is Semi-Supervised Learning?

Semi-supervised learning (SSL) is an approach to machine learning (ML) that is appropriate for tasks where you have a large amount of data that you want to learn from, only a fraction of which is labeled.

Semi-supervised learning sits somewhere between supervised and unsupervised learning, and we’ll start by understanding these techniques because that will make it easier to grasp how semi-supervised learning works.

Supervised learning refers to any ML setup in which a model learns from labeled data. It’s called “supervised” because the model is effectively being trained by showing it many examples of the right answer.

Suppose you’re trying to build a neural network that can take a picture of different plant species and classify them. If you give it a picture of a rose it’ll output the “rose” label, if you give it a fern it’ll output the “fern” label, and so on.

The way to start training such a network is to assemble many labeled images of each kind of plant you’re interested in. You’ll need dozens or hundreds of such images, and they’ll each need to be labeled by a human.

Then, you’ll assemble these into a dataset and train your model on it. What the neural network will do is learn some kind of function that maps features in the image (the concentrations of different colors, say, or the shape of the stems and leaves) to a label (“rose”, “fern”).

One drawback to this approach is that it can be slow and extremely expensive, both in funds and in time. You could probably put together a labeled dataset of a few hundred plant images in a weekend, but what if you’re training something more complex, where the stakes are higher? A model trained to spot breast cancer from a scan will need thousands of images, perhaps tens of thousands. And not just anyone can identify a cancerous lump; you’ll need a skilled human to look at each scan and label it “cancerous” or “non-cancerous.”

Unsupervised learning, by contrast, requires no such labeled data. Instead, an unsupervised machine learning algorithm is able to ingest data, analyze its underlying structure, and categorize data points according to this learned structure.

Okay, so what does this mean? A fairly common unsupervised learning task is clustering a corpus of documents thematically, and let’s say you want to do this with a bunch of different national anthems (hey, we’re not going to judge you for how you like to spend your afternoons!).

A good, basic algorithm for a task like this is the k-means algorithm, so-called because it will sort documents into k categories. K-means begins by randomly initializing k “centroids” (which you can think of as essentially being the center value for a given category), then moving these centroids around in an attempt to reduce the distance between the centroids and the values in the clusters.

This process will often involve a lot of fiddling. Since you don’t actually know the optimal number of clusters (remember that this is an unsupervised task), you might have to try several different values of k before you get results that are sensible.

To sort our national anthems into clusters you’ll have to first pre-process the text in various ways, then you’ll run it through the k-means clustering algorithm. Once that is done, you can start examining the clusters for themes. You might find that one cluster features words like “beauty”, “heart” and “mother”, another features words like “free” and “fight”, another features words like “guard” and “honor”, etc.
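To make the k-means loop a little more concrete, here is a minimal sketch in plain Python. It assumes the documents have already been turned into 2-D points (the toy coordinates below are made up for illustration, not real anthem data), and the function name `kmeans` and its parameters are our own. A real project would more likely reach for a library implementation.

```python
import random

def kmeans(points, k, iterations=20, seed=0):
    """Cluster 2-D points into k groups; returns (centroids, assignments)."""
    rng = random.Random(seed)
    # Randomly initialize the centroids by picking k distinct data points.
    centroids = rng.sample(points, k)
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for i, (x, y) in enumerate(points):
            distances = [(x - cx) ** 2 + (y - cy) ** 2 for cx, cy in centroids]
            assignments[i] = distances.index(min(distances))
        # Update step: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # an empty cluster keeps its old centroid
                centroids[c] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return centroids, assignments

# Made-up toy data: two obvious groups standing in for vectorized documents.
points = [(1.0, 1.0), (1.5, 1.5), (2.0, 2.0), (8.0, 8.0), (8.5, 8.5), (9.0, 9.0)]
centroids, assignments = kmeans(points, k=2)
print(assignments)  # the first three points share one cluster id, the last three the other
```

Which cluster gets id 0 and which gets id 1 depends on the random initialization, which is one small taste of why interpreting clusters takes some care.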

As with supervised learning, unsupervised learning has drawbacks. With a clustering task like the one just described, it might take a lot of work and multiple false starts to find a value of k that gives good results. And it’s not always obvious what the clusters actually mean. Sometimes there will be clear features that distinguish one cluster from another, but other times they won’t correspond to anything that’s easily interpretable from a human perspective.

Semi-supervised learning, by contrast, combines elements of both of these approaches. You start by training a model on the subset of your data that is labeled, then apply it to the larger unlabeled part of your data. In theory, this should simultaneously give you a powerful predictive model that is able to generalize to data it hasn’t seen before while saving you from the toil of creating thousands of your own labels.

How Does Semi-Supervised Learning Work?

We’ve covered a lot of ground, so let’s review. Two of the most common forms of machine learning are supervised learning and unsupervised learning. The former tends to require a lot of labeled data to produce a useful model, while the latter can soak up a lot of hours in tinkering and yield clusters that are hard to understand. By training a model on a labeled subset of data and then applying it to the unlabeled data, you can save yourself tremendous amounts of effort.

But what’s actually happening under the hood?

Three main variants of semi-supervised learning are self-training, co-training, and graph-based label propagation, and we’ll discuss each of these in turn.

Self-training

Self-training is the simplest kind of semi-supervised learning, and it works like this.

A small subset of your data will have labels while the rest won’t have any, so you’ll begin by using supervised learning to train a model on the labeled data. With this model, you’ll go over the unlabeled data to generate pseudo-labels, so-called because they are machine-generated and not human-generated.

Now, you have a new dataset; a fraction of it has human-generated labels while the rest contains machine-generated pseudo-labels, but all the data points now have some kind of label and a model can be trained on them.
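The three steps above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the "model" is a toy nearest-centroid classifier over a single made-up numeric feature, and the names `train_centroids` and `predict` are hypothetical helpers, not part of any library. Practical self-training usually also keeps only the most confident pseudo-labels and repeats the loop several times.

```python
def train_centroids(examples):
    """'Train' a nearest-centroid classifier: average the feature value per label."""
    sums, counts = {}, {}
    for value, label in examples:
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def predict(model, value):
    """Return the label whose centroid is closest to the value."""
    return min(model, key=lambda label: abs(model[label] - value))

# Made-up data: one numeric feature per plant image.
labeled = [(5.0, "rose"), (6.0, "rose"), (0.0, "fern"), (1.0, "fern")]
unlabeled = [4.5, 5.5, 7.0, 0.5, 1.5, 2.0]

# Step 1: supervised training on the small human-labeled subset.
model = train_centroids(labeled)

# Step 2: use that model to generate pseudo-labels for the unlabeled data.
pseudo_labeled = [(value, predict(model, value)) for value in unlabeled]

# Step 3: train a new model on human labels and pseudo-labels together.
final_model = train_centroids(labeled + pseudo_labeled)

print([label for _, label in pseudo_labeled])
print(final_model)
```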

Co-training

Co-training has the same basic flavor as self-training, but it has more moving parts. With co-training you’re going to train two models on the labeled data, each on a different set of features (in the literature these are called “views”).

If we’re still working on that plant classifier from before, one model might be trained on the number of leaves or petals, while another might be trained on their color.

At any rate, now you have a pair of models trained on different views of the labeled data. These models will then generate pseudo-labels for all the unlabeled data points. When one of the models is very confident in its pseudo-label (i.e., when the probability it assigns to its prediction is very high), that pseudo-label will be used to update the prediction of the other model, and vice versa.

Let’s say both models come to an image of a rose. The first model thinks it’s a rose with 95% probability, while the other thinks it’s a tulip with a 68% probability. Since the first model seems really sure of itself, its label is used to change the label on the other model.

Think of it like studying a complex subject with a friend. Sometimes a given topic will make more sense to you, and you’ll have to explain it to your friend. Other times they’ll have a better handle on it, and you’ll have to learn from them.

In the end, you’ll both have made each other stronger, and you’ll get more done together than you would’ve done alone. Co-training attempts to utilize the same basic dynamic with ML models.
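Here is one way a single round of co-training might look in code. Treat it as a sketch under stated assumptions: the two views (a petal count and a "redness" score), the data, the 0.8 confidence threshold, and the distance-based confidence score are all invented for illustration, and the helper names are our own. We also assume that "teaching" the other model means adding the confident pseudo-label to that model's training set, and that there are exactly two classes.

```python
def train(examples):
    """Nearest-centroid model for one view: mean feature value per label."""
    grouped = {}
    for value, label in examples:
        grouped.setdefault(label, []).append(value)
    return {label: sum(values) / len(values) for label, values in grouped.items()}

def predict_with_confidence(model, value):
    """Return (label, confidence); a crude score in [0.5, 1] for two classes."""
    distances = {label: abs(center - value) for label, center in model.items()}
    best = min(distances, key=distances.get)
    total = sum(distances.values())
    confidence = 1.0 - distances[best] / total if total else 1.0
    return best, confidence

# Each example has two views: (petal count, redness). All values are made up.
labeled = [((5, 0.9), "rose"), ((6, 0.8), "rose"), ((0, 0.1), "fern"), ((0, 0.2), "fern")]
unlabeled = [(5, 0.55), (3, 0.85), (0, 0.45)]

# Model A only sees petal counts; model B only sees redness.
train_a = [(views[0], label) for views, label in labeled]
train_b = [(views[1], label) for views, label in labeled]
model_a, model_b = train(train_a), train(train_b)

THRESHOLD = 0.8  # hypothetical cut-off for "very confident"
for petals, redness in unlabeled:
    label_a, conf_a = predict_with_confidence(model_a, petals)
    label_b, conf_b = predict_with_confidence(model_b, redness)
    if conf_a >= THRESHOLD:   # A is sure, so A teaches B
        train_b.append((redness, label_a))
    if conf_b >= THRESHOLD:   # B is sure, so B teaches A
        train_a.append((petals, label_b))

# Retrain both models on their enlarged training sets.
model_a, model_b = train(train_a), train(train_b)
print(len(train_a), len(train_b))
```

In this toy run the petal-count model is confident about the first and third images and passes its labels to the redness model, while the redness model is confident about the second image and returns the favor.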

Graph-based semi-supervised learning

Another way to apply labels to unlabeled data is by utilizing a graph data structure. A graph is a set of nodes (in graph theory we call them “vertices”) which are linked together through “edges.” The cities on a map would be vertices, and the highways linking them would be edges.

If you put your labeled and unlabeled data on a graph, you can propagate the labels throughout by counting the number of pathways from a given unlabeled node to the labeled nodes.

Imagine that we’ve got our fern and rose images in a graph, together with a bunch of other unlabeled plant images. We can choose one of those unlabeled nodes and count up how many ways we can reach all the “rose” nodes and all the “fern” nodes. If there are more paths to a rose node than a fern node, we classify the unlabeled node as a “rose”, and vice versa. This gives us a powerful alternative means by which to algorithmically generate labels for unlabeled data.
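The path-counting idea can be sketched as follows. The graph, the node names, and the helpers `count_paths` and `classify` are all hypothetical. We also add one assumption of our own: paths are capped at two edges, because without a cap, long roundabout routes through the graph can outvote a node's immediate neighbors. Production label-propagation algorithms typically work differently (for example, with weighted edges and iterative updates), so read this as an illustration of the intuition only.

```python
def count_paths(graph, start, goal, max_edges, visited=frozenset()):
    """Count paths from start to goal that use at most max_edges edges
    and never revisit a node."""
    if start == goal:
        return 1
    if max_edges == 0:
        return 0
    visited = visited | {start}
    return sum(
        count_paths(graph, neighbor, goal, max_edges - 1, visited)
        for neighbor in graph[start]
        if neighbor not in visited
    )

def classify(graph, labels, node, max_edges=2):
    """Give the node the label whose labeled nodes it can reach in the most ways."""
    scores = {}
    for labeled_node, label in labels.items():
        paths = count_paths(graph, node, labeled_node, max_edges)
        scores[label] = scores.get(label, 0) + paths
    return max(scores, key=scores.get), scores

# A made-up graph: "a", "b" and "c" are unlabeled plant images.
graph = {
    "rose_1": ["a", "b"],
    "rose_2": ["a"],
    "fern_1": ["b", "c"],
    "fern_2": ["c"],
    "a": ["rose_1", "rose_2", "b"],
    "b": ["rose_1", "fern_1", "a"],
    "c": ["fern_1", "fern_2"],
}
labels = {"rose_1": "rose", "rose_2": "rose", "fern_1": "fern", "fern_2": "fern"}

print(classify(graph, labels, "a"))
print(classify(graph, labels, "c"))
```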

Semi-Supervised Learning Examples

The amount of data in the world is increasing at a staggering rate, while the number of human-hours available for labeling it all is increasing at a much less impressive clip. This presents a problem because there’s no end to the places where we want to apply machine learning.

Semi-supervised learning presents a possible solution to this dilemma, and in the next few sections, we’ll describe semi-supervised learning examples in real life.

  • Identifying cases of fraud: In finance, semi-supervised learning can be used to train systems for identifying cases of fraud or extortion. Rather than hand-labeling thousands of individual instances, engineers can start with a few labeled examples and proceed with one of the semi-supervised learning approaches described above.
  • Classifying content on the web: The internet is a big place, and new websites are put up all the time. In order to serve useful search results it’s necessary to classify huge amounts of this web content, which can be done with semi-supervised learning.
  • Analyzing audio and images: This is perhaps the most popular use of semi-supervised learning. When audio files or image files are generated they’re often not labeled, which makes it difficult to use them for machine learning. Beginning with a small subset of human-labeled data, however, this problem can be overcome.

How Is Semi-Supervised Learning Different From…?

With all the different approaches to machine learning, it can be easy to confuse them. To make sure you fully understand semi-supervised learning, let’s take a moment to distinguish it from similar techniques.

Semi-Supervised Learning vs Self-Supervised Learning

With semi-supervised learning you’re training a model on a subset of labeled data and then using this model to process the unlabeled data. Self-supervised learning is different in that the labels come from the data itself: you show an algorithm some fraction of the data (say, the first 80 words in a paragraph) and then have it predict the remainder (the other 20 words in the paragraph).

Self-supervised learning is how LLMs like GPT-4 are trained.

Semi-Supervised Learning vs Reinforcement Learning

One interesting subcategory of ML we haven’t discussed yet is reinforcement learning (RL). RL involves leveraging the mathematics of sequential decision theory (usually a Markov Decision Process) to train an agent to interact with its environment in a dynamic, open-ended way.

It bears little resemblance to semi-supervised learning, and the two should not be confused.

Semi-Supervised Learning vs Active Learning

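Active learning is closely related to semi-supervised learning, and the two are often used together. The big difference is that, with active learning, the algorithm sends the examples it is least confident about to a human for labeling or correction, rather than trusting its own pseudo-labels.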
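The selection step in active learning can be as simple as sorting predictions by confidence. The sketch below assumes each prediction is a dictionary with hypothetical `item`, `label`, and `confidence` fields, and that a human reviewer has a fixed `budget` of items they can check. All names and numbers are invented for illustration.

```python
def select_for_review(predictions, budget):
    """Return the `budget` predictions with the lowest confidence."""
    return sorted(predictions, key=lambda p: p["confidence"])[:budget]

# Made-up model outputs for four unlabeled images.
predictions = [
    {"item": "img_001", "label": "rose", "confidence": 0.97},
    {"item": "img_002", "label": "fern", "confidence": 0.55},
    {"item": "img_003", "label": "rose", "confidence": 0.62},
    {"item": "img_004", "label": "fern", "confidence": 0.91},
]

to_review = select_for_review(predictions, budget=2)
print([p["item"] for p in to_review])  # the two least certain predictions
```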

When Should You Use Semi-Supervised Learning?

Semi-supervised learning is a way of training ML models when you only have a small amount of labeled data. By training the model on just the labeled subset of data and using it in a clever way to label the rest, you can avoid the difficulty of having a human being label everything.

There are many situations in which semi-supervised learning can help you make use of more of your data. That’s why it has found widespread use in domains as diverse as document classification, fraud detection, and image identification.

If you’re considering ways of using advanced AI systems to take your business to the next level, check out our generative AI resource hub to go even deeper. This technology is changing everything, and if you don’t want to be left behind, set up a time to talk with us.

How Large Language Models Have Evolved

It seems as though large language models (LLMs) exploded into public awareness almost overnight. Relatively few people had heard of GPT-2, but I would venture to guess relatively few people haven’t heard of ChatGPT.

But like most things, language models have a history. And, in addition to being outrageously interesting, that history can help us reason about the progress in LLMs, as well as their likely future impacts.

Let’s get started!

A Brief History of Artificial Intelligence Development

The human fascination with building artificial beings capable of thought and action goes back a long way. Writing in roughly the 8th century BCE, Homer recounts tales of the god Hephaestus outsourcing repetitive manual tasks to automated bellows and working alongside robot-like “attendants” that were “…golden, and in appearance like living young women.”

No mere adornments, these handmaidens were described as having “intelligence in their hearts” and stirring “nimbly in support of their master” because “from the immortal gods they have learned how to do things.”

Centuries later, mathematicians in Alexandria would produce treatises on creating mechanical servants and various kinds of automata. Heron of Alexandria wrote a technical manual for producing a mechanical shrine and an automated theatre whose figurines could be activated to stage a full tragic play through an intricate system of cords and axles.

Nor is it only ancient Greece that tells such tales. Jewish legends speak of the Golem, a being made of clay and imbued with life and agency through the use of language. The word “abracadabra”, in fact, is often said to come from the Aramaic phrase avra k’davra, which translates to “I create as I speak.”

Through the ages, these old ideas have found new expression in stories such as “The Sorcerer’s Apprentice”, Mary Shelley’s “Frankenstein”, and Karel Čapek’s “R.U.R.”, a science fiction play that features the first recorded use of the word “robot”.

From Science Fiction to Science Fact

But they remained purely fiction until the first half of the 20th century, when advances in the theory of computation, as well as the development of primitive computers, began to offer a path toward actually building intelligent systems.

Arguably, the field of artificial intelligence really began in earnest with the 1950 publication of Alan Turing’s “Computing Machinery and Intelligence” – in which he proposed the famous “Turing test” – and with the 1956 Dartmouth conference on AI, organized by luminaries John McCarthy and Marvin Minsky.

People began taking AI seriously. Over the next ~50 years, there were numerous periods of hype and exuberance in which major advances were made, as well as long stretches, known as “AI winters”, in which funding dried up and little was accomplished.

Neural networks and the deep learning revolution are two advances that are particularly important for understanding how large language models have evolved over time, so it’s to these that we now turn.

Neural Networks And The Deep Learning Revolution

The groundwork for future LLM systems was laid by Walter Pitts and Warren McCulloch in the early 1940s. Inspired by the burgeoning study of the human brain, they wondered if it would be possible to build an artificial neuron that had the same basic properties as a biological one, i.e. it would activate and fire once a certain critical threshold had been crossed.
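The "fire once a threshold is crossed" idea is simple enough to write down directly. What follows is a minimal sketch in the spirit of that early artificial neuron, not a reproduction of the original formulation. The function name, the weights, and the thresholds are our own choices for illustration.

```python
def threshold_neuron(inputs, weights, threshold):
    """Fire (return 1) only if the weighted sum of inputs reaches the threshold."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0

# With equal weights, the threshold alone decides what the neuron computes.
for a in (0, 1):
    for b in (0, 1):
        and_out = threshold_neuron([a, b], [1, 1], threshold=2)  # needs both inputs
        or_out = threshold_neuron([a, b], [1, 1], threshold=1)   # needs either input
        print(a, b, and_out, or_out)
```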

They were successful, though several other breakthroughs would be required before artificial neurons could be arranged into systems that were capable of doing useful work. One such breakthrough was backpropagation, the basic algorithm that is still used to train deep learning systems. The ideas behind backpropagation date back to the 1960s, and the algorithm uses the errors in a model’s outputs to iteratively adjust its internal parameters.

It wasn’t until 1986, however, that David Rumelhart, Geoff Hinton, and Ronald Williams popularized the use of backpropagation in neural networks, and in 1989, this allowed Yann LeCun to train a convolutional neural network to recognize handwritten digits.

This was not the only architectural improvement that came out of this period. Especially noteworthy were the long short-term memory (LSTM) networks that were introduced in 1997 by Sepp Hochreiter and Jürgen Schmidhuber, which made it possible to learn patterns that stretch across long sequences, such as sentences.

With these advances, it was clear that neural networks could be trained to do useful work, and that they were poised to do so. All that was left was to gather the missing piece: data.

The Big Data Era

Neural networks and deep-learning applications tend to be extremely data-hungry, and access to quality training data has always been a major bottleneck. In 2009 Stanford’s Fei-Fei Li sought to change this by releasing ImageNet, a database of over 14 million labeled images that could be used for free by researchers. The increase in available data, together with substantial improvements in computer hardware like graphics processing units (GPUs), meant that at long last the promise of deep learning could begin to be fulfilled.

And it was. In 2011 IBM’s Watson system beat two Jeopardy! all-stars in a real game, and Apple launched Siri. In 2012 a convolutional neural network called “AlexNet” won the ImageNet image-recognition competition by a wide margin. Amazon’s Alexa followed in 2014, and from 2015 to 2017 DeepMind’s AlphaGo shocked the world by utterly dominating the best human Go players.

Substantial strides were also made in language models. In 2018 Google introduced its Bidirectional Encoder Representations from Transformers (BERT), a pre-trained model that could be fine-tuned for a wide array of language tasks, like question answering and sentiment analysis.

One Model To Rule Them All

It would be easy to miss the significance of AlexNet’s performance on the ImageNet competition or BERT’s usefulness across multiple tasks. For a long time, it was anyone’s guess as to whether it would be possible to train a single large model on a dataset and use it for a range of purposes, or whether it would be necessary to train a multitude of models for each application.

Since the early 2010s, it has become clear that large, general-purpose models are often the best way to go. This point has only been reinforced by the success of GPT-4 in everything from brainstorming scientific hypotheses to handling customer service tasks.

How Has Large Language Model Performance Improved?

Now that we’ve discussed this history, we’re well-placed to understand why LLMs and generative AI have ignited so much controversy. People have been mulling over the promise (and peril) of thinking machines for literally thousands of years. After all that time it looks like they might be here, at long last.

But what, exactly, has people so excited? What is it that advanced AI tools are doing that has captured the popular imagination? In the following sections, we’ll talk about the astonishing (and astonishingly rapid) improvements that have been seen in language models in just a few short years.

Getting To Human-Level

One of the more surprising things about LLMs such as ChatGPT is just how good they are at so many different things. LLMs are trained with a technique known as “self-supervised learning”. They take random samples of the text data they’re given, and they try to predict what words come next given the words that came before.

Suppose the model sees the famous opening line of Leo Tolstoy’s Anna Karenina: “Happy families are all alike; every unhappy family is unhappy in its own way.” What the model is trying to do is learn a function that will allow it to predict “in its own way” from “Happy families are all alike; every unhappy family is unhappy ___”.
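To see where the “labels” in self-supervised learning come from, it helps to write out the training examples by hand. The sketch below is a simplification: it splits on whitespace, whereas real LLMs work with sub-word tokens, and the function name `next_word_examples` is our own. It turns a snippet of text into (context, next word) pairs without any human labeling.

```python
def next_word_examples(text):
    """Turn raw text into (context, next_word) training pairs."""
    words = text.split()
    return [(" ".join(words[:i]), words[i]) for i in range(1, len(words))]

examples = next_word_examples("Happy families are all alike")
for context, target in examples:
    print(f"{context!r} -> {target!r}")
```

Every position in the text yields a free training example, which is why this approach scales so well to internet-sized datasets.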

The modern crop of LLMs can do this incredibly well, but what is remarkable is just how far this gets you. People are using generative AI to help them write poems, business plans, and code, create recipes based on the ingredients in their fridges, and answer customer questions.

Emergence in Language Models

Perhaps even more interesting, however, is the phenomenon of “emergence” in language models. When researchers tested LLMs on a wide variety of tasks meant to be especially challenging to these models – things like identifying a movie given a string of emojis or finding legal chess moves – they found that in about 5% of tasks, there is a sudden, sharp increase in ability on a given task once a model reaches a certain size.

At present, it’s not really clear how we should think about emergence. One hypothesis for emergence is that a big enough model is able to learn some general piece of knowledge not attainable by a smaller cousin, while another, more prosaic one is that it’s a relatively straightforward consequence of the model’s internal statistical machinery.

What’s more, it’s difficult to pin down the conditions required for emergence in language models. Though it generally appears to be a function of model size, there are cases in which the same abilities can be achieved with smaller models, or with models trained on very high-quality data, and emergence shows up at different scales for different models and tasks.

Whatever ends up being the case, it’s clear that this is a promising direction for future research. Much more work needs to be done to understand how precisely LLMs accomplish what they accomplish. This will not only bear on the question of emergence, but will also inform the ongoing efforts to make language models safer and less biased.

The GPT Series

The big recent news in AI has, of course, been ChatGPT. ChatGPT has proven useful in an astonishingly wide variety of use cases and is among the first powerful systems to have been made widely available to the public.

ChatGPT is part of a broader series of GPT models built by OpenAI. “GPT” stands for “generative pre-trained transformer”, and the first of its kind was developed back in 2018. New models and major updates have been released at a rapid clip ever since, culminating with GPT-4 coming out in March of 2023.

OpenAI’s CEO Sam Altman has claimed that there are no current plans to train a successor GPT-5 model, but there are other companies, like DeepMind, that could plausibly build a competitor.

What’s Next For Large Language Models?

Given their flexibility and power, LLMs are finding use across a wide variety of industries, from software engineering to medicine to customer service.

If your interest has been piqued and you’d like to talk to an expert at Quiq about incorporating LLMs into your business, reach out to us to schedule a demo!

Are Generative AI And Large Language Models The Same Thing?

The release of ChatGPT was one of the first times an extremely powerful AI system was broadly available, and it has ignited a firestorm of controversy and conversation.

Proponents believe current and future AI tools will revolutionize productivity in almost every domain.

Skeptics wonder whether advanced systems like GPT-4 will even end up being all that useful.

And a third group believes they’re the first sparks of artificial general intelligence and could be as transformative for life on Earth as the emergence of Homo sapiens.

Frankly, it’s enough to make a person’s head spin. One of the difficulties in making sense of this rapidly evolving space is the fact that many terms, like “generative AI” and “large language models” (LLMs), are thrown around very casually.

In this piece, our goal is to disambiguate these two terms by discussing the differences between generative AI and large language models. Whether you’re pondering deep questions about the nature of machine intelligence, or just trying to decide whether the time is right to use conversational AI in customer-facing applications, this context will help.

Let’s get going!

What Is Generative AI?

Of the two terms, “generative AI” is broader, referring to any machine learning model capable of dynamically creating output after it has been trained.

This ability to generate complex forms of output, like sonnets or code, is what distinguishes generative AI from linear regression, k-means clustering, or other types of machine learning.

Besides being much simpler, these models can only “generate” output in the sense that they can make a prediction on a new data point.

Once a linear regression model has been trained to predict test scores based on number of hours studied, for example, it can generate a new prediction when you feed it the hours a new student spent studying.

But you couldn’t use prompt engineering to have it help you brainstorm the way these two values are connected, which you can do with ChatGPT.
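For comparison, here is roughly what “generating” a prediction looks like for a simple linear regression model. This is a minimal sketch using ordinary least squares in plain Python. The study-hours data is made up, and `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up training data: hours studied and the resulting test score.
hours = [1, 2, 3, 4, 5]
scores = [52, 58, 65, 70, 75]

slope, intercept = fit_line(hours, scores)

# "Generating" output here just means plugging in a new student's hours.
predicted_score = intercept + slope * 6
print(round(predicted_score, 1))
```

The model's entire output is a single number, which is a long way from a sonnet or a working function.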

There are many types of generative AI, so let’s spend a few minutes discussing the major categories: image generation, music generation, code generation, and a few others.

How Is Generative AI Used To Make Images?

One of the first “wow” moments in generative AI came fairly recently, when it was discovered that tools like Midjourney, DALL-E, and Stable Diffusion could create absolutely stunning images based on simple prompts like:

“Old man in a book store, ambient dappled sunlight, sedate, calm, close-up portrait.”

Depending on the wording you use, these images might be whimsical and futuristic, they might look like paintings from world-class artists, or they might look so photo-realistic you’d be convinced they’re about to start talking.

Each of these tools is suited to specific applications. Midjourney seems to be best at capturing different artistic approaches and generating images that accurately capture an aesthetic. DALL-E tends to do better at depicting human figures, including faces and eyes. Stable Diffusion seems to do well at generating highly-detailed outputs, capturing subtleties like the way light reflects on a rain-soaked street.

(Note: these are all general impressions; it’s difficult to know how the tools will compare on any specific prompt.)

Broadly, this is known as “image synthesis”. And since we’re talking specifically about making images from text, this sub-domain is known as “text-to-image.”

A variant of this technique is text-to-video (alternatively: “text-to-4d”), which produces short clips or scenes based on text prompts. While text-to-video is still much more primitive than text-to-image, it will get better very quickly if recent progress in AI is any guide.

One interesting wrinkle in this story is that generative algorithms have generated something else along with images and animations: legal battles.

In early 2023, Getty Images filed a lawsuit against Stability AI, the company behind Stable Diffusion, alleging that it trained its algorithm on millions of images from the Getty collection without getting permission first or compensating Getty in any way.

This has raised many profound questions about data rights, privacy, and how (or whether) people should be paid when their work is used to train a model that might eventually automate them out of a job.

We’re still in the early days of grappling with these issues, but they’re sure to make for fascinating case law in the years ahead.

How Is Generative AI Used To Make Music?

Given how successful advanced models have been in generating text (more on that shortly), it’s only natural to wonder whether similar models could also prove useful in generating music.

This is especially true because, on the surface, text and music share many obvious similarities (both are sequential, for example). It would make sense, therefore, that the technical advances that have allowed coherent text production might also allow for coherent music production.

And they have! There are now a number of different tools, such as MusicLM, which are able to generate fairly high-quality audio tracks from prompts like:

“The main soundtrack of an arcade game. It is fast-paced and upbeat, with a catchy electric guitar riff. The music is repetitive and easy to remember, but with unexpected sounds, like cymbal crashes or drum rolls.”

As with using generative AI in images, creating artificial musical tracks in the style of popular artists has already sparked legal controversies. A particularly memorable example occurred just recently when a TikTok user supposedly created an AI-generated collaboration between Drake and The Weeknd, which then promptly went viral.

The track was removed from all major streaming services in response to backlash from artists and record labels, but it’s clear that this technology is going to change the way art is created in a major way.

How Is Generative AI Used For Coding?

It’s long been the dream of both programmers and non-programmers to simply be able to provide a computer with natural-language instructions (“build me a cool website”) and have the machine handle the rest. It would be hard to overstate the explosion in creativity and productivity this would initiate.

With the advent of code-generation models such as Replit’s Ghostwriter and GitHub Copilot, we’ve taken one more step towards that halcyon world.

As is the case with other generative models, code-generation tools are usually trained on massive amounts of data, after which point they’re able to take simple prompts and produce code from them.

You might ask one of these tools to write a function that converts between several different coordinate systems, create a web app that measures BMI, or translate from Python to JavaScript.

As things stand now, the code is often incomplete in small ways. A tool might produce a function that takes an argument that is never used, for example, or one that lacks a return statement. Still, it is remarkable what has already been accomplished.

There are now software developers who are using models like ChatGPT all day long to automate substantial portions of their work, to understand new codebases with which they’re unfamiliar, or to write comments and unit tests.

What Are Large Language Models?

Now that we’ve covered generative AI, let’s turn our attention to large language models (LLMs).

LLMs are a particular type of generative AI.

Unlike with MusicLM or DALL-E, LLMs are trained on textual data and then used to output new text, whether that be a sales email or an ongoing dialogue with a customer.

(A technical note: though people are mostly using GPT-4 for text generation, it is an example of a “multimodal” LLM because it has also been trained on images. According to OpenAI’s documentation, image input functionality is currently being tested, and is expected to roll out to the broader public soon.)

What Are Examples of Large Language Models?

By far the most well-known example of an LLM is OpenAI’s “GPT” series, the latest of which is GPT-4. The acronym “GPT” stands for “Generative Pre-Trained Transformer”, and it hints at many underlying details about the model.

GPT models are based on the transformer architecture, for example, and they are pre-trained on a huge corpus of textual data taken predominantly from the internet.

GPT, however, is not the only example of an LLM.

The BigScience Large Open-science Open-access Multilingual Language Model – known more commonly by its mercifully short nickname, “BLOOM” – was built by more than 1,000 AI researchers as an open-source alternative to GPT.

BLOOM is capable of generating text in almost 50 natural languages, and more than a dozen programming languages. Being open-sourced means that its code is freely available, and no doubt there will be many who experiment with it in the future.

In March, Google announced Bard, a generative language model built atop its Language Model for Dialogue Applications (LaMDA) transformer technology.

As with ChatGPT, Bard is able to work across a wide variety of different domains, offering help with planning baby showers, explaining scientific concepts to children, or helping you make lunch based on what you already have in your fridge.

How Are Large Language Models Trained?

A full discussion of how large language models are trained is beyond the scope of this piece, but it’s easy enough to get a high-level view of the process. In essence, an LLM like GPT-4 is fed a huge amount of textual data from the internet. It then samples this dataset and learns to predict what words will follow given what words it has already seen.

At first, its performance will be terrible, but over time it will learn that a sentence like “I sat down on the _____” probably ends with a word like “floor” or “chair”, and probably not a word like “cactus” (at least, we hope you’re not sitting down on a cactus!)
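To make the “predict the next word” idea concrete, here is a minimal sketch in Python. It is not how GPT-4 works internally; it is a toy “bigram” model that only counts which word tends to follow which. The four-sentence corpus and the function names are invented for this illustration.

```python
from collections import Counter, defaultdict

# Toy corpus, invented for illustration; real models train on vastly more text.
CORPUS = [
    "i sat down on the chair",
    "i sat down on the floor",
    "she sat down on the chair",
    "he stood up from the chair",
]

def train_bigram_counts(sentences):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for current_word, next_word in zip(words, words[1:]):
            counts[current_word][next_word] += 1
    return counts

def predict_next(counts, word):
    """Return the most frequently seen follower of `word`, or None if unseen."""
    followers = counts.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]

counts = train_bigram_counts(CORPUS)
print(predict_next(counts, "the"))     # the follower seen most often after "the"
print(predict_next(counts, "cactus"))  # a word the model has never seen
```

Because “chair” follows “the” more often than “floor” does in this tiny corpus, the model guesses “chair”. A real LLM looks at far more context than one preceding word, but the underlying task is the same.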

When a model has been trained for long enough on a large enough dataset, you get the remarkable performance seen with tools like ChatGPT.

Is ChatGPT A Large Language Model?

Speaking of ChatGPT, you might be wondering whether it’s a large language model. ChatGPT is a special-purpose application built on top of a model from OpenAI’s GPT-3.5 series, and that underlying model is a large language model. It was fine-tuned to be especially good at conversational dialogue, and the result is ChatGPT.

Are All Large Language Models Generative AI?

Yes. To the best of our knowledge, all existing large language models are generative AI. “Generative AI” is an umbrella term for algorithms that generate novel output, and the current set of models is built for that purpose.

Utilizing Generative AI In Your Business

Though truly powerful generative AI models are less than a year old, they’re already being integrated into numerous business applications. Quiq Compose, for example, is able to study past interactions with customers to better tailor its future conversations to their particular needs.

From generating fake viral rap songs to generating photos that are hard to distinguish from real life, these powerful tools have already proven that they can dramatically speed up marketing, software development, and many other crucial business functions.

If you’re an enterprise wondering how you can use advanced AI technologies for applications like customer service, schedule a demo to see what the Quiq platform can offer you!

A Deep Dive on Large Language Models—And What They Mean For You

The release of OpenAI’s ChatGPT in late 2022 has utterly transformed the conversation around artificial intelligence. Whether it’s generating functioning web apps with just a few prompts, writing Spanish-language children’s stories about the blockchain in the style of Dr. Seuss, or opining on the virtues and vices of major political figures, its ability to generate long strings of coherent, grammatically-correct text is shocking.

Seen in this light, it’s perhaps no surprise that ChatGPT has achieved such a staggering rate of growth. The application garnered a million users less than a week after its launch.

It’s believed that by January of 2023, this figure had climbed to 100 million monthly users, blowing past the adoption rates of TikTok (which needed nine months to get to this many monthly users) and Instagram (which took over two years.)

Naturally, many have become curious about the “large language model” (LLM) technology that makes ChatGPT and similar kinds of disruptive generative AI possible.

In this piece, we’re going to do a deep dive on LLMs, exploring how they’re trained, how they work internally, and how they might be deployed in your business. Our hope is that this will arm Quiq’s customers with the context they need to keep up with the ongoing AI revolution.

What Are Large Language Models?

LLMs are pieces of software with the ability to interact with and generate a wide variety of text. In this discussion, “text” is used very broadly to include not just existing natural language but also computer code.

A good way to begin exploring this subject is to analyze each of the terms in “large language model”, so let’s do that now.

LLMs Are Models.

In machine learning (ML), you can think of a model as being a function that maps inputs to outputs. Early in their education, for example, machine learning engineers usually figure out how to fit a linear regression model that does something like predict the final price of a house based on its square footage.

They’ll feed their model a bunch of data points that look like this:

House 1: 800 square feet, $120,000
House 2: 1000 square feet, $175,000
House 3: 1500 square feet, $225,000

And the model learns the relationship between square footage and price well enough to roughly predict the price of homes that weren’t in its training data.
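As a rough sketch of what “learning the relationship” means, the Python below fits a straight line to the three houses listed above using ordinary least squares. The function names are our own, and the 1,200-square-foot house is a made-up query; with only three data points, the estimate is purely illustrative.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one input feature: y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# The three houses from the example above.
square_feet = [800, 1000, 1500]
prices = [120_000, 175_000, 225_000]

slope, intercept = fit_line(square_feet, prices)

def predict_price(sqft):
    """Apply the fitted line to a house that was not in the training data."""
    return slope * sqft + intercept

print(f"Each extra square foot adds roughly ${slope:,.2f}")
print(f"Estimated price for 1,200 sq ft: ${predict_price(1200):,.0f}")
```

The model here is nothing more than two numbers, the slope and the intercept. Once they are learned, the model is just a function from square footage to price.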

We’ll have a lot more to say about how LLMs are trained in the next section. For now, just be aware that when you get down to it, LLMs are inconceivably vast functions that take the input you feed them and generate a corresponding output.

LLMs Are Large.

Speaking of vastness, LLMs are truly gigantic. As with terms like “big data”, there isn’t an exact, agreed-upon point at which a basic language model becomes a large language model. Still, they’re plenty big enough to deserve the extra “L” at the beginning of their name.

There are a few ways to measure the size of machine learning models, but one of the most common is by looking at their parameters.

In the linear regression model just discussed, there would be only one weight, for square footage (plus an intercept term that sets the baseline price). We could make our model better by also showing it the home’s zip code and the number of bathrooms it has, and then it would have three such weights.

It’s hard to say how big most real systems are because that information isn’t usually made public, but a linear regression model might have dozens of parameters, and a basic neural network could range from a few hundred thousand to a few tens of millions of parameters.
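To get a feel for how parameter counts grow, here is a small sketch that counts the weights and biases in a fully connected network. The layer sizes are assumptions picked for illustration (784 inputs, 256 hidden units, 10 outputs is a common textbook setup for small image classifiers), not a description of any particular production system.

```python
def count_dense_parameters(layer_sizes):
    """Count weights and biases in a fully connected network."""
    total = 0
    for inputs, outputs in zip(layer_sizes, layer_sizes[1:]):
        total += inputs * outputs  # one weight per input-output connection
        total += outputs           # one bias per output unit
    return total

# Linear regression on one feature: one weight plus one bias (the intercept).
print(count_dense_parameters([1, 1]))
# Linear regression on three features.
print(count_dense_parameters([3, 1]))
# A small neural network with a single hidden layer.
print(count_dense_parameters([784, 256, 10]))
```

Even this small network lands in the hundreds of thousands of parameters, which helps explain how models with many wide layers reach the billions.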

GPT-3 has 175 billion parameters, and Google’s Minerva model has 540 billion parameters. It isn’t known how many parameters GPT-4 has, but it’s almost certainly more.

(Note: we say “almost” certainly because better models don’t always have more parameters. They usually do, but it’s not an ironclad rule.)

LLMs Focus On Language.

ChatGPT and its cousins take text as input and produce text as output. This makes them distinct from some of the image-generation tools that are on the market today, such as DALL-E and Midjourney.

It’s worth noting, however, that this might be changing in the future. Though most of what people are using GPT-4 to do revolves around text, technically, the underlying model is multimodal. This means it can theoretically interact with image inputs as well. According to OpenAI’s documentation, support for this feature should arrive in the coming months.

How Are Large Language Models Trained?

Like all machine learning models, LLMs must be trained. We don’t actually know exactly how OpenAI trained the latest GPT models, as they’ve kept those details secret, but we can make some broad comments about how systems like these are generally trained.

Before we get into technical details, let’s frame the overall task that LLMs are trying to perform as a guessing game. Imagine that I start a sentence and leave out the last word, asking you to provide a guess as to how it ends.

Some of these would be fairly trivial; everyone knows that “[i]t was the best of times, it was the worst of _____,” ends with the word “times.” Others would be more ambiguous; “I stopped to pick a flower, and then continued walking down the ____,” could plausibly end with words like “road”, “street”, or “trail.”

For still others, there’d be an almost infinite number of possibilities; “He turned to face the ___,” could end with anything from “firehose” to “firing squad.”

But how is it that you’re able to generate these guesses? How do you know what a good ending to a natural-language sentence sounds like?

The answer is that you’ve been “training” for this task your entire life. You’ve been listening to sentences, reading and writing sentences, or thinking in sentences for most of your waking hours, and have therefore developed a sense of how they work.

The process of training an LLM differs in many specifics, but at a high level, it’s learning to do the same thing. A model like GPT-4 is fed gargantuan amounts of textual data from the internet or other sources, and it learns a statistical distribution that allows it to predict which words come next.

At first, it’ll have no idea how to end the sentence “[i]t was the best of times, it was the worst of ____.” But as it sees more and more examples of human-generated textual content, it improves. It discovers that when someone writes “red, orange, yellow, green, blue, indigo, ______”, the next sequence of letters is probably “violet”. It begins to be more sensitive to context, discovering that the words “bat”, “diamond”, and “plate” are probably occurring in a discussion about baseball and not the weirdest Costco you’ve ever been to.

It’s precisely this nuance that makes advanced LLMs suitable for applications such as customer service.

They’re not simply looking up pre-digested answers to questions; they’re learning a function big enough to account for the subtleties of a specific customer’s specific problem. They still don’t do this job perfectly, but they’ve made remarkable progress, which is why so many companies are looking at integrating them.

Getting into the GPT-weeds

The discussion so far is great for building a basic intuition for how LLMs are trained, but this is a deep dive, so let’s talk technical specifics.

Though we don’t know much about GPT-4, earlier models like GPT and GPT-2 have been studied in great detail. By understanding how they work, we can cultivate a better grasp of cutting-edge models.

When an LLM is trained, it’s fed a great deal of text data. It will grab samples from this data, and try to predict the next token in its sample. To make our earlier explanation easier to understand, we implied that a token is a word, but that’s not quite right. A token can be a word, an individual letter, or a “sub-word”, i.e., a small chunk of letters and spaces.
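The sketch below shows the three kinds of tokens side by side. The sub-word vocabulary is hypothetical and hand-written for this example, and the greedy longest-match rule is a simplification; production tokenizers learn their vocabularies from data and use more careful algorithms.

```python
def word_tokens(text):
    """Split on whitespace so that each word is one token."""
    return text.split()

def char_tokens(text):
    """Treat every individual character as a token."""
    return list(text)

# Hypothetical sub-word vocabulary, invented for this example.
SUBWORD_VOCAB = {"un", "believ", "able", "token", "ization", "s"}

def subword_tokens(word, vocab=SUBWORD_VOCAB):
    """Greedy longest-match split; falls back to single characters."""
    tokens = []
    start = 0
    while start < len(word):
        # Try the longest remaining piece first, then shrink it.
        for end in range(len(word), start, -1):
            piece = word[start:end]
            if piece in vocab or end == start + 1:
                tokens.append(piece)
                start = end
                break
    return tokens

print(word_tokens("unbelievable tokens"))
print(char_tokens("cat"))
print(subword_tokens("unbelievable"))
print(subword_tokens("tokens"))
```

Sub-word tokens are a middle ground: the model can handle a word it has never seen by assembling it from familiar pieces, without having to spell everything out letter by letter.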

This process is known as “self-supervised learning” because the model can assess its own accuracy by checking its predicted next token against the actual next token in the dataset it’s training on.
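Here is a minimal sketch of why no human labeling is needed: the “labels” come straight out of the text itself. For simplicity we treat each word as a token, and the `context_size` of 3 is an arbitrary choice for the example.

```python
def next_token_examples(tokens, context_size=3):
    """Build (context, target) training pairs from a token sequence."""
    examples = []
    for i in range(1, len(tokens)):
        context = tokens[max(0, i - context_size):i]  # the tokens seen so far
        target = tokens[i]                            # the token to predict
        examples.append((context, target))
    return examples

tokens = "it was the best of times".split()
for context, target in next_token_examples(tokens):
    print(context, "->", target)
```

Every position in the text yields one training example, and the correct answer is simply whatever token actually came next.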

At first, its accuracy is likely to be very bad. But as it trains, its internal parameters (remember those?) are tuned with an optimizer such as stochastic gradient descent, and it gets better.
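To illustrate what “tuning parameters” looks like, here is a sketch of stochastic gradient descent on the smallest model we could think of: a single parameter `w` in the formula y = w * x. The data, learning rate, and number of passes are all made up for the example; an LLM runs the same kind of loop over billions of parameters.

```python
import random

def sgd_fit_slope(data, learning_rate=0.01, epochs=200, seed=0):
    """Fit y = w * x with stochastic gradient descent on squared error."""
    rng = random.Random(seed)
    examples = list(data)
    w = 0.0  # start from a deliberately bad guess
    for _ in range(epochs):
        rng.shuffle(examples)  # visit the examples in a random order each pass
        for x, y in examples:
            error = w * x - y
            gradient = 2 * error * x  # derivative of error**2 with respect to w
            w -= learning_rate * gradient  # nudge w to shrink the error
    return w

# Made-up data that follows y = 3x exactly.
data = [(1, 3), (2, 6), (3, 9), (4, 12)]
w = sgd_fit_slope(data)
print(f"learned w = {w:.4f}")
```

Each update looks at a single example, measures how wrong the current guess is, and nudges the parameter in the direction that reduces that error.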

One of the crucial architectural building blocks of LLMs is the transformer.

A full discussion of transformers is well beyond the scope of this piece, but the most important thing to know is that transformers can use “attention” to model more complex relationships in language data.

For example: in a sentence like “the dog didn’t chase the cat because it was too tired”, every human knows that “it” refers to the dog and not the cat. Earlier approaches to building language models struggled with such connections in sentences that were longer than a few words, but using attention, transformers can handle them with ease.
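The core calculation behind attention can be sketched in a few lines. In the example below, the two-dimensional vectors for “dog”, “cat”, and “tired” are hand-picked so that the query for “it” lines up with “dog”; in a real transformer these vectors are learned during training and have hundreds or thousands of dimensions.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    # The output is a weighted average of the value vectors.
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output

words = ["dog", "cat", "tired"]
keys = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # hand-picked, not learned
values = keys                                 # reuse the keys as values for simplicity
query_for_it = [2.0, 0.0]                     # hand-picked to resemble "dog"

weights, output = attention(query_for_it, keys, values)
for word, weight in zip(words, weights):
    print(f"{word}: {weight:.2f}")
```

The weights say how much “it” should draw on each other word, and the largest weight falls on “dog”. Because each word’s weights can be computed independently of the others, the work is easy to spread across many processors.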

In addition to this obvious advantage, transformers have found widespread use in deep learning applications such as language models because they’re easy to parallelize, meaning that training times can be reduced.

Building On Top Of Large Language Models

Out-of-the-box LLMs are pretty powerful, but it’s often necessary to tweak them for specific applications such as enterprise bots. There are a few ways of doing this, and we’re going to confine ourselves to two major approaches: fine-tuning and prompt engineering.

First up, it’s possible to fine-tune some of these models. Fine-tuning an LLM involves providing a training set and letting the model update its internal weights to perform better on a specific task. 

Next, the emerging discipline of prompt engineering refers to the practice of systematically crafting the text fed to the model to get it to better approximate the behavior you want.

LLMs can be surprisingly sensitive to small changes in words, phrases, and context; the job of a prompt engineer, therefore, is to develop a feel for these sensitivities and construct prompts in a way that maximizes the performance of the LLM.
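One common prompt engineering pattern is the “few-shot” prompt, in which a handful of worked examples are placed ahead of the real input. The sketch below only assembles the text; it does not call any model. The function name, the “Input:”/“Output:” layout, and the sample messages are our own assumptions rather than a format any particular LLM requires.

```python
def build_few_shot_prompt(instruction, examples, new_input):
    """Assemble an instruction, worked examples, and the new input into one prompt."""
    lines = [instruction.strip(), ""]
    for example_input, example_output in examples:
        lines.append(f"Input: {example_input}")
        lines.append(f"Output: {example_output}")
        lines.append("")
    lines.append(f"Input: {new_input}")
    lines.append("Output:")  # left open for the model to complete
    return "\n".join(lines)

instruction = "Classify the customer message as 'billing', 'shipping', or 'other'."
examples = [
    ("I was charged twice this month.", "billing"),
    ("My package still hasn't arrived.", "shipping"),
]
prompt = build_few_shot_prompt(instruction, examples, "Can I change my delivery address?")
print(prompt)
```

Keeping the prompt in a function like this makes it easier to change one thing at a time, such as the wording of the instruction or the choice of examples, and compare the results.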

How Can Large Language Models Be Used In Business?

There is a new gold rush in applying AI to business use cases.

For starters, given how good LLMs are at generating text, they’re being deployed to write email copy, blog posts, and social media content, to text or survey customers, and to summarize text.

LLMs are also being used in software development. Tools like Replit’s Ghostwriter are already dramatically improving developer productivity in a variety of domains, from web development to machine learning.

What Are The “LLiMitations” Of LLMs?

For all their power, LLMs have turned out to have certain well-known limitations. To begin with, LLMs are capable of being toxic, harmful, aggressive, and biased.

Though heroic efforts have been made to train this behavior out with techniques such as reinforcement learning from human feedback, it’s possible that it can reemerge under the right conditions.

This is something you should take into account before giving customers access to generative AI offerings.

Another oft-discussed limitation is the tendency of LLMs to “invent” facts. Remember, an LLM is just trying to predict sequences of tokens, and there’s no reason it couldn’t output a sequence of text like “Dr. Micha Sartorius, professor of applied computronics at Santa Grega University”, even though this person, field, and university are fictitious.

This, too, is something you should be cognizant of before letting customers interact with generative AI.

At Quiq, we harness the power of LLMs’ language-generating capabilities, while putting strict guardrails in place to prevent these risks that are inherent to public-facing generative AI.

Should You Be Using Large Language Models?

LLMs are a remarkable engineering achievement, having been trained on vast amounts of human text and able to generate whole conversations, working code, and more.

No doubt, some of the fervor around LLMs will end up being hype. Nevertheless, the technology has been shown to be incredibly powerful, and it is unlikely to go anywhere. If you’re interested in learning about how to integrate generative AI applications like Quiq’s into your business, schedule a demo with us today!

Prompt Engineering: What Is It—And How Can You Use It To Get The Most Out Of AI?

Think back to your school days. You come into class only to discover a timed writing assignment on the agenda. You have to respond to the provided prompt quickly and accurately, and you will be graded against criteria like grammar, vocabulary, factual accuracy, and more.

Well, that’s what natural language processing (NLP) software like ChatGPT does daily. Except, when a computer steps into the classroom, it can’t raise its hand to ask questions.

That’s why it’s so important to provide AI with a prompt that’s clear and thorough enough to produce the best possible response.

What is prompt engineering?

A prompt can be a question, a phrase, or several paragraphs. The more specific the prompt is, the better the response.

Writing the perfect prompt — prompt engineering — is critical to ensure the NLP response is not only factually correct but crafted exactly as you intended to best deliver information to a specific target audience.

You can’t use low-quality ingredients in the kitchen to produce gourmet cuisine — and you can’t expect AI to, either.

Let’s revisit your old classroom again: did you ever have a teacher provide a prompt where you just weren’t really sure what the question was asking? So, you guessed a response based on the information provided, only to receive a low score.

In the post-exam review, the teacher explained what she was actually looking for and how the question was graded. You sat there thinking, “If I’d only had that information when I was given the prompt!”

Well, AI feels your pain.

The responses that NLP software provides are only as good as the input data. Learning how to communicate with AI to get it to generate desired responses is a science, and you can learn what works best through trial and error to continuously optimize your prompts.

Prompts that fail to deliver, and why.

What’s the root of the issue of prompt engineering gone wrong? It all comes down to incomplete, inconsistent, or incorrect data.

Even the most advanced AI using neural networks and deep learning techniques still needs to be fed the right information in the right way. When there is too little context provided, not enough examples, conflicting information from different sources, or major typos in the prompt, the AI can generate responses that are undesirable or just plain wrong.

How to craft the perfect prompt.

Here are some important factors to take into consideration for successful prompt engineering.

Clear instructions

Provide specific instructions and multiple examples to illustrate precisely what you want the AI to do. Words like “something,” “things,” “kind of,” and “it” (especially when there are multiple subjects within one sentence) can be indicators that your prompt is too vague.

Try to use descriptive nouns that refer to the subject of your sentence and avoid ambiguity.

  • Example (ambiguity): “She put the book on the desk; it was blue.”
  • What does “it” refer to in this sentence? Is the book blue, or is the desk blue?

Simple language

Use plain language, but avoid shorthand and slang. When in doubt, err on the side of overcommunicating and you can use trial and error to determine what shorthand approaches work for future, similar prompts. Avoid internal company or industry-specific jargon when possible, and be sure to clearly define any terms you may want to integrate.

Quality data

Give examples. Providing a single source of truth (for example, an article you want the AI to answer questions about) makes it more likely that the responses will be factually correct and grounded in that article.

On that note, teach the AI how you want it to return responses when it doesn’t know the answer, such as “I don’t know,” “not enough information,” or simply “?”.

Otherwise, the AI may get creative and try to come up with an answer that sounds good but has no basis in reality.
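As a rough illustration of the “Quality data” advice, the sketch below builds a prompt that supplies a single source article and spells out what to say when the answer isn’t in it. Nothing here calls a model. The function name, the exact wording of the instructions, and the sample article are assumptions made for the example.

```python
FALLBACK = "I don't know"

def build_grounded_prompt(article, question, fallback=FALLBACK):
    """Combine a source article, a question, and an explicit fallback answer."""
    return (
        "Answer the question using only the article below.\n"
        f'If the article does not contain the answer, reply exactly: "{fallback}"\n\n'
        f"Article:\n{article.strip()}\n\n"
        f"Question: {question.strip()}\n"
        "Answer:"
    )

# Made-up article text for illustration.
article = "Orders ship within two business days. Returns are accepted for 30 days."
prompt = build_grounded_prompt(article, "Do you offer gift wrapping?")
print(prompt)
```

The article says nothing about gift wrapping, so a model that follows these instructions should fall back to “I don’t know” rather than inventing a policy.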

Persona

Develop a persona for your responses. Should the response sound as though it’s being delivered by a subject matter expert or would it be better (legally or otherwise) if the response was written by someone who was only referring to subject matter experts (SMEs)?

  • Example (direct from SMEs): “Our team of specialists…”
  • Example (referring to SMEs): “Based on recent research by experts in the field…”

Voice, style, and tone

Decide how you want to represent your brand’s voice, which will largely be determined by your target audience. Would your customer be more likely to trust information that sounds like it was provided by an academic, or would a colloquial voice be more relatable?

Do you want a matter-of-fact, encyclopedia-type response, a friendly or supportive empathetic approach, or is your brand’s style more quick-witted and edgy?

With the right prompt, AI can capture all that and more.

Quiq takes prompt engineering out of the equation.

Prompt engineering is no easy task. There are many nuances to language that can trick even the most advanced NLP software.

Not only are incorrect AI responses a pain to identify and troubleshoot, but they can also hurt your business’s reputation if they aren’t caught before your content goes public.

On the other hand, manual tasks that could be automated with NLP waste time and money that could be allocated to higher-priority initiatives.

Quiq uses large language models (LLMs) to continuously optimize AI responses to your company’s unique data. With Quiq’s world-class Conversational AI platform, you can reduce the burden on your support team, lower costs, and boost customer satisfaction.

Contact Quiq today to see how our innovative LLM-built features improve business outcomes.

How To Be The Leader Of Personalized CX In Your Industry

Customer expectations are evolving alongside AI technology, at an unprecedented pace. People are more informed, connected, and demanding than ever before, and they expect nothing less than exceptional customer experiences (CX) from the brands they interact with.

This is where personalized customer experience comes in.

By tailoring CX to individual customers’ needs, preferences, and behaviors, businesses can create more meaningful connections, build loyalty, and drive revenue growth.

In this article, we will explore the power of personalized CX across industries and how it can help businesses stay ahead of the curve.

What is Personalized CX?

Personalized CX refers to the process of tailoring customer experiences to individual customers based on their unique needs, preferences, and behaviors. This involves using customer data and insights to create targeted and relevant interactions across multiple touchpoints, such as websites, mobile apps, social media, and customer service channels.

Personalization can take many forms, from simple tactics like using a customer’s name in a greeting to more complex strategies like recognizing that they are likely to be asking a question about the order that was delivered today. The goal is to create a seamless and consistent experience that makes customers feel valued and understood.

Why is Personalized CX Important?

Personalized CX has become increasingly important across industries for several reasons:

1. Rising Customer Expectations

Today’s customers expect personalized experiences across all industries, from retail and hospitality to finance and healthcare. In fact, according to a survey by Epsilon, 80% of consumers are more likely to do business with a company if it offers personalized experiences.

2. Increased Competition

As industries become more crowded and competitive, businesses need to find new ways to differentiate themselves. Personalized CX can help brands stand out by creating a unique and memorable experience that sets them apart from their competitors.

3. Improved Customer Loyalty and Retention

Personalized CX can help businesses build stronger relationships with their customers by creating a sense of loyalty and emotional connection. According to a survey by Accenture, 75% of consumers are more likely to buy from a company that recognizes them by name, recommends products based on past purchases, or knows their purchase history.

4. Increased Revenue

By providing personalized CX, businesses can also increase revenue by creating more opportunities for cross-selling and upselling. According to a study by McKinsey, personalized recommendations can drive 10-30% of revenue for e-commerce businesses.

Industries That Can Benefit From Personalized CX

Personalized CX can benefit almost any industry, but some industries are riper for personalization than others.

Here are some industries that can benefit the most from personalized CX:

1. Retail

Retail is one of the most obvious industries that can benefit from personalized CX. By using customer data and insights, retailers can create tailored product recommendations and personalized support based on products purchased and current order status.

2. Hospitality

In the hospitality industry, personalized CX can create a more memorable and enjoyable experience for guests. From personalized greetings to customized room amenities, hospitality businesses can use personalization to create a sense of luxury and exclusivity.

3. Healthcare

Personalized CX is also becoming increasingly important in healthcare. By tailoring healthcare experiences to individual patients’ needs and preferences, healthcare providers can create a more patient-centered approach that improves outcomes and satisfaction.

4. Finance

In the finance industry, personalized CX can help businesses create more targeted and relevant offers and services. By using customer data and insights, financial institutions can offer personalized recommendations for investments, loans, and insurance products.

Best Practices for Implementing Personalized CX in Industries

Implementing personalized CX requires a strategic approach and a deep understanding of customers’ preferences and behaviors.

Here are some best practices for implementing personalized CX in industries:

1. Collect and Use Customer Data Wisely

Collecting customer data is essential for personalized CX, but it’s important to do so in a way that respects customers’ privacy and preferences. Businesses should be transparent about the data they collect and how they use it, and give customers the ability to opt out of data collection.

2. Use Technology to Scale Personalization

Personalizing CX for every individual customer can be a daunting task, especially for large businesses. Using technology, such as machine learning algorithms and artificial intelligence (AI), can help businesses scale personalization efforts and make them more efficient.

3. Be Relevant and Timely

Personalized CX is only effective if it’s relevant and timely. Businesses should use customer data to create targeted and relevant offers, messages, and interactions that resonate with customers at the right time.

4. Focus on the Entire Customer Journey

Personalization shouldn’t be limited to a single touchpoint or interaction. To create a truly personalized CX, businesses should focus on the entire customer journey, from awareness to purchase and beyond.

5. Continuously Test and Optimize

Personalized CX is a continuous process that requires constant testing and optimization. Businesses should use data and analytics to track the effectiveness of their personalization efforts and make adjustments as needed.

Challenges of Implementing Personalized CX in Industries

While the benefits of personalized CX are clear, implementing it in industries can be challenging. Here are some of the challenges businesses may face:

1. Data Privacy and Security Concerns

Collecting and using customer data for personalization raises concerns about data privacy and security. Businesses must ensure they are following best practices for data collection, storage, and usage to protect their customers’ information.

2. Integration with Legacy Systems

Personalization requires a lot of data and advanced technology, which may not be compatible with legacy systems. Businesses may need to invest in new infrastructure and systems to support personalized CX.

3. Lack of Skilled Talent

Personalized CX requires a skilled team with expertise in data analytics, machine learning, and AI. Finding and retaining this talent can be a challenge for businesses, especially smaller ones.

4. Resistance to Change

Implementing personalized CX requires significant organizational change, which can be met with resistance from employees and stakeholders. Businesses must communicate the benefits of personalization and provide training and support to help employees adapt.

Personalized CX is no longer a nice-to-have; it’s a must-have for businesses that want to stay competitive in today’s digital age. By tailoring CX to individual customers’ needs, preferences, and behaviors, businesses can create more meaningful connections, build loyalty, and drive revenue growth. While implementing personalized CX in industries can be challenging, the benefits far outweigh the costs.

The Rise of Conversational AI: Why Businesses Are Embracing It

Movies may have twisted our expectations of artificial intelligence—either giving us extremely high expectations or making us think it’s ready to wipe out humanity.

But the reality is far less dramatic. In fact, you’re already using AI in your daily life, but it’s so ingrained in your technology that you probably don’t even notice. Netflix and Spotify both use AI to personalize your content recommendations. Siri, Alexa, and Google Assistant use it as well.

Conversational AI, like what Quiq uses to power our chatbots, takes artificial intelligence to the next level. See what it is and how you can use it in your business.

What is conversational AI?

Conversational artificial intelligence (AI) is a collection of technologies that create a human-like conversational experience. It combines natural language processing (NLP), machine learning, and other technologies to make conversations with software feel smoother and more natural. It can be used in many applications, like chatbots and voice assistants (such as Siri and Alexa). The most common use case for conversational AI in the business-to-customer world is an AI chatbot messaging experience.

Unlike rule-based chatbots, those powered by conversational AI generate responses and adapt to user behavior over time. Rule-based chatbots are also limited to what you put in them, meaning that if someone phrases a question differently than you wrote it (or uses slang or colloquialisms), the chatbot won’t understand the question. Conversational AI can also help chatbots understand more complex questions.

Putting technical terms in context.

Companies throw around a lot of technical terms when it comes to artificial intelligence, so here is what they mean and how they’re used to improve your business.

Rules-based chatbots: Earlier chatbot iterations (and some current low-cost versions) work mainly through pre-defined rules. Your business (or service provider) writes specific guidelines for the chatbot to follow. For example, when a customer says “Hi,” the chatbot responds, “Hello, how may I help you?”

Another example is when a customer asks about a return. The chatbot is programmed to give a specific response, like, “Here’s a link to the return policy.”

However, the problem with rule-based chatbots is that they can be limiting. They only know how to handle situations based on the information programmed into them. So if someone says, “I don’t like this product, what can I do?” and you haven’t planned for that question, the chatbot won’t have a response.
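A rules-based chatbot can be sketched in a few lines of Python, which also makes its limits easy to see. The two rules below mirror the examples above; the keyword-matching approach and the function name are simplifying assumptions, not a description of any specific product.

```python
import re

# Each rule maps a keyword to a canned reply, mirroring the examples above.
RULES = {
    "hi": "Hello, how may I help you?",
    "return": "Here's a link to the return policy.",
}

def rule_based_reply(message, rules=RULES):
    """Return the reply for the first rule whose keyword appears as a whole word."""
    words = set(re.findall(r"[a-z']+", message.lower()))
    for keyword, reply in rules.items():
        if keyword in words:
            return reply
    return None  # no rule matched, so the bot has nothing to say

print(rule_based_reply("Hi"))
print(rule_based_reply("How do I return this?"))
print(rule_based_reply("I don't like this product, what can I do?"))
```

The last message is clearly about a return, but because it never uses the word “return”, no rule fires and the bot comes back empty-handed.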

Machine learning: Machine learning is a way to combat the problem posed above. Instead of giving the chatbot specific parameters complete with pre-written questions and answers, machine learning helps chatbots make decisions based on the information provided.

Machine learning helps chatbots adapt over time based on customer conversations. Instead of giving the bot specific ways to answer specific questions, you show it the basic rules, and it crafts its own response. Plus, since it means your chatbot is always learning, it gets better the longer you use it.

Natural language processing: As humans and speakers of the English language, we know that there are different ways to ask every question. For example, a customer who wants to know when an item is back in stock may ask, “When is X back in stock?” or they might say, “When will you get X back in?” or even, “When are you restocking X?” Those three questions all mean the same thing, and as humans, we naturally understand that. But a rules-based bot must be told that they mean the same thing, or it might not understand them.

Natural language processing (NLP) uses AI technology to help chatbots understand that those questions are all asking the same thing. It also can determine what information it needs to answer your question, like color, size, etc.

NLP also helps chatbots answer questions in a more human-like way. If you want your chatbot to sound more human (and you should), then find one that uses NLP.

Web-based SDK: A web-based SDK (for the non-developers, SDK stands for “software development kit”) is a set of tools and resources developers use to integrate programs (in this case, chatbots) into websites and web-based applications.

What does this mean for your chatbot? Context. When a user says, “I need help with my order,” the chatbot can use NLP to identify “help” and “order.” Then it can look back at previous conversations, pull the customer’s order history, and more, if the data is there.

Contextual conversations are everything in customer service—so this is a big factor in building a successful chatbot using conversational AI. In fact, 70% of customers expect anyone they’re speaking with to have the full context. With a web-based SDK, your chatbot can do that too.

The benefits of conversational AI.

Using chatbots with conversational AI provides benefits across your business, but the clearest wins are in your contact center. Here are three ways chatbots improve your customer service.

24/7 customer support.

Your customer service agents need to sleep, but your conversational AI chatbot doesn’t. A chatbot can answer questions and contain customer issues while your contact center is closed. Any issues it can’t solve, it can pass along to your agents the next day. Not only does that give your customers 24/7 service, but your agents will have less of a backlog when they return to work.

Faster response times.

When your agents are inundated with customers, an AI chatbot can pick up the slack. Send your chatbot in to greet customers immediately, let them know the wait time, or even start collecting information so your agents can get to the root of the problem faster. Chatbots powered with AI can also answer questions and solve easy customer issues, skipping human agents altogether.

More present customer service agents.

Chatbots can handle low-level customer queries and give agents the time and space to handle more complex issues. Not only will this result in better customer service, but agents will be happier and less stressed overall.

Plus, chatbots can scale during your busy seasons. You’ll save on costs since you won’t have to hire more agents, and the agents you have won’t be overworked.

How to make the most of AI technology.

Unfortunately, you can’t just plug and play with conversational AI and expect to become an AI company. Just like any other technology, it takes prep work and thoughtful implementation to get it right—plus lots of iterations.

Use these tips to make the most of AI technology:

Decide on your AI goals.

How are you planning on using conversational AI? Will it be for marketing? Customer service? All of the above? Think about what your main goals are and use that information to select the right AI partner.

Choose the right conversational AI platform.

Once you’ve decided on how you want to use conversational AI, select the right partner to help you get there. Think about aspects like ease of use, customization, scalability, and budget.

Design your chatbot interactions.

Even with artificial intelligence, you still have to put the work in upfront. What you do and how you do it will vary greatly depending on which platform you go with. Design your chatbot conversations with these things in mind:

  • Your brand voice
  • Personalization
  • Customer service best practices
  • Logical conversation flows
  • Concise messages

Build a partnership between agents and chatbots.

Don’t launch the chatbot independently of your customer service agents. Include them in the training and launch, and start to build a working relationship between the two. Agents and chatbots can work together on customer issues, both popping in and out of the conversation seamlessly. For example, a chatbot can collect information from the customer upfront and pass it to the agent to solve the issue. Then, when the agent is done, they can bring the chatbot back in to deliver a customer survey.

Test and refine.

Sometimes, you don’t know what you don’t know until it happens. Test your chatbot before it launches, but don’t stop there. Keep refining your conversations even after you’ve launched.

What does the future hold for conversational AI?

There are many exciting things happening in AI right now, and we’re only on the cusp of delving into what it can really do.

The big prediction? For now, conversational AI will keep getting better at what it’s already doing. More human-like interactions, better problem-solving, and more in-depth analysis.

In fact, 75% of customers believe AI will become more natural and human-like over time. Gartner is also predicting big things for conversational AI, saying by 2026, conversational AI deployments within contact centers will reduce agent labor costs by $80 billion.

Why should you jump in now when bigger things are coming? It’s simple. You’ll learn to master conversational AI tools ahead of your competitors and earn an early competitive advantage.

How Quiq does conversational AI.

To ensure you give your customers the best experience, Quiq powers our entire platform with conversational AI. Here are a few stand-out ways Quiq uniquely improves your customer service with conversational AI.

Design customized chatbot conversations.

Create chatbot conversations so smooth and intuitive that it feels like you’re talking to a real person. Using the best conversational AI techniques, Quiq’s chatbot gives customers quick and intelligent responses for an up-leveled customer experience.

Help your agents respond to customers faster.

Make your agents more efficient with Quiq Compose. Quiq Compose uses conversational AI to suggest responses to customer questions. How? It uses information from similar conversations in the past to craft the best response.

Empower agent performance.

Tools like our Adaptive Response Timer (ART) prioritize conversations based on how fast or slow customers respond. The platform also uses AI to analyze customer sentiment so that customers who need extra attention get it.

This is just the beginning.

This is just a taste of what conversational AI can do. See how Quiq can apply the latest technology to your contact center to help you deliver exceptional customer service.

Contact Us

Quiq Compose: Learning the Language of your Contact Center

Hi! I’m Kyle, Head of AI Engineering at Quiq, and I’m excited to share our latest product with you: Quiq Compose!

What is Quiq Compose?

Quiq Compose is generative AI technology that provides your agents with adaptive, contextually relevant response suggestions. The result? Agents spend less time typing and more time helping customers!

How does it work?

Compose learns by studying past conversations. Every message sent by live agents represents a teachable moment for AI. The AI’s job is to learn a mapping from the context in which the agent authored the message to the message content that was ultimately sent.

The context refers not only to prior messages in the conversation (including outbound notifications and prior bot interactions), but also to important non-conversational data like whether or not we know the customer’s email address, the time of year, and more. By providing Compose with a complete view of the context, we enable it to generate more accurate responses.
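As a rough sketch of what “learning a mapping from context to message” can look like in practice, the snippet below turns one past conversation into training examples. This is an assumption-laden illustration, not Compose’s actual data format: the `Message` class, the `training_pairs` function, and the metadata keys are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class Message:
    sender: str  # "customer", "agent", or "bot"
    text: str


def training_pairs(messages, metadata):
    """Build one (context, target) pair per agent message.

    The context holds everything said before the agent replied, plus
    non-conversational data; the target is what the agent actually sent.
    """
    pairs = []
    for i, msg in enumerate(messages):
        if msg.sender == "agent":
            context = {
                "history": [(m.sender, m.text) for m in messages[:i]],
                **metadata,
            }
            pairs.append((context, msg.text))
    return pairs


conversation = [
    Message("bot", "Hi! An agent will be with you shortly."),
    Message("customer", "Where is my order?"),
    Message("agent", "Happy to help. Can you share your order number?"),
    Message("customer", "It's 12345."),
    Message("agent", "Thanks! It ships tomorrow."),
]

# Hypothetical non-conversational signals, as described above.
pairs = training_pairs(conversation, {"email_known": True, "month": "December"})
print(len(pairs))  # one pair per agent message
```

Each agent message becomes one teachable moment: the model sees the context and is asked to produce the message the agent ultimately sent.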

How does it compare to other LLM technology?

Since the release of ChatGPT, the world has been abuzz about LLMs and their capabilities. Compose uses the same underlying technology as ChatGPT (transformers), but there are some important differences to consider:

  • Compose has a much smaller scope of language that it needs to learn compared to a general-purpose model like ChatGPT. This enables us to train AI that is more lightweight and cost-effective.
  • Compose is trained specifically on your data with the option to skew training to act more like some agents and less like others. It will learn important phone numbers and URLs that a general-purpose LLM won’t know about.
  • Compose has a more explicit understanding of rich messaging concepts (e.g. payment messages) and non-conversational (CRM) data.
  • General LLMs may exhibit overconfidence as a result of their eagerness to complete your prompt, whereas Compose simply remains silent in situations where it’s unconfident.
  • Compose doesn’t require an integration. It can learn from any conversations that occur within the Quiq platform.
  • Compose has important enterprise features such as full control over the AI’s vocabulary, support for the isolation of different brands within a single business, SOC2 compliance, and more.

In short, Compose is laser-focused on learning the language of your contact center and streamlining live agent workflows in digital CX.
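The “remains silent when unconfident” behavior from the list above can be pictured as a simple confidence cutoff. The sketch below is a generic illustration, not a description of how Compose scores its suggestions: the `suggest` function, the candidate scores, and the 0.7 threshold are all assumed for the example.

```python
def suggest(scored_candidates, min_confidence=0.7):
    """Return the best suggestion, or None when confidence is too low.

    scored_candidates is a list of (text, confidence) tuples, with
    confidence between 0 and 1. Returning None means "stay silent".
    """
    if not scored_candidates:
        return None
    text, confidence = max(scored_candidates, key=lambda c: c[1])
    return text if confidence >= min_confidence else None


confident = [("Your order ships tomorrow.", 0.91), ("Let me check on that.", 0.40)]
unsure = [("Your order ships tomorrow.", 0.35), ("Let me check on that.", 0.30)]

print(suggest(confident))  # the top-scoring suggestion
print(suggest(unsure))     # None: no suggestion is shown to the agent
```

The design choice here is that showing nothing is treated as better than showing a poor suggestion, since the agent can always type a reply themselves.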

The Journey

At Quiq, we’re always working hard to bridge the gap between the latest AI breakthroughs and cost-effective solutions that are ready for enterprise deployment, and Compose is no exception. We’ve spent more than a year adapting cutting-edge AI algorithms to digital CX and rich messaging use cases and are proud of the impact it’s had on our customers to date.

Learn more about Compose here.

7 Ways AI Chatbots Improve Customer Service

If you’ve been using business messaging for a while, you know how easy and convenient it is for your customers, and you’ve seen its impact on your customer service team’s output.

With Quiq’s robust messaging platform, it’s easy for contact centers to manage customer conversations while boosting conversion rates, increasing engagement, and reducing costs. But our little slice of digital nirvana only gets better when you add chatbots into the mix.

Enter the business messaging bot. Bots can help increase your agent productivity while delivering an even better customer experience.

We’re diving into seven times business messaging bots made a customer conversation faster and better.

1. Collect customer information upfront.

Let’s say, for example, you own an airline with a great reward program. With Quiq, you can create a bot that greets all your customers right away and asks them to enter their rewards number if they have one.

This “reward bot” will use the information gathered to help recognize platinum-status members, your most elite tier. The reward bot reroutes platinum members to a special VIP queue where wait times are shorter and they receive a higher level of support. This is done consistently and without hesitation. Your platinum members don’t have to wade through the customer service queue. It makes them feel more valued and more likely to continue flying with you in the future.

The reward bot can also collect other information, such as confirmation numbers for reservations, updated email addresses, or contact numbers. All of this data gathering can be done before a human agent steps into the conversation. The support chatbot has done the work to arm the agent with the information they need to deliver better service.
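For readers who like to see the logic spelled out, here is a minimal sketch of the reward bot’s routing decision. It is not Quiq’s bot-building interface; the member lookup table, the queue names, and the `route_customer` function are hypothetical stand-ins for whatever loyalty system and queues an airline actually uses.

```python
# Hypothetical rewards data; a real bot would query the airline's loyalty system.
REWARDS_MEMBERS = {
    "R1001": "platinum",
    "R2002": "gold",
}


def route_customer(rewards_number=None):
    """Pick a queue and bundle what the bot learned for the human agent."""
    tier = REWARDS_MEMBERS.get(rewards_number)
    queue = "vip" if tier == "platinum" else "general"
    # The returned details travel with the conversation, so the agent
    # does not have to ask for them again.
    return {"queue": queue, "tier": tier, "rewards_number": rewards_number}


print(route_customer("R1001"))  # platinum member goes to the VIP queue
print(route_customer("R2002"))  # other tiers go to the general queue
print(route_customer())         # no rewards number: general queue
```

Because the rule lives in code, it is applied the same way to every conversation, which is what makes the routing consistent.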

2. Decrease customer abandonment.

Acknowledging customers with a fast, friendly greeting lets them know they’ve started on a path to resolution. Agents may be busy with other conversations (we’ve seen agents handle upwards of eight at a time), but that doesn’t mean the customer can’t start engaging with your business. A support chatbot can greet customers immediately while agents are busy.

Instead of leaving customers to wait in a stagnant phone queue or to chase a live chat (also known as web chat) agent who has disappeared, a bot can send a welcome message and let them know when they’ll receive a response from a human agent.

3. Get faster, more accurate customer responses.

Remember the last time you had to spell your name out over the phone or repeat your birthday again and again because the call bot couldn’t pick it up? Conversational chatbots eliminate that frustration and collect fast, accurate information from the customer every time.

Over messaging, the customer can see the data they’re providing and confirm right away if there’s an error. They can look back at what they typed and catch a typo in their email address, or notice that they’ve provided an old phone number. It happens.

4. Prioritize customer conversations.

In our above example, the reward bot was able to recognize platinum rewards members so they could get the perks that came with their membership. Chatbots can help you prioritize conversations in other ways too.

For example, you can set rules within Quiq to recognize keywords such as “buy” or “purchase” to prioritize customers who may need help with a transaction. Depending on the situation, the platform can prioritize that conversation (likely with high purchase intent) over a password reset or return.

A chatbot platform like Quiq can also use natural language processing (NLP) to predict customer sentiment and prioritize based on that. That way, you can identify a frustrated customer and bump them up in the queue to handle the problem before it escalates.
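Here is a small sketch of how keyword and sentiment rules like these could reorder a queue. It is an illustration under stated assumptions rather than how rules are configured in Quiq: the scoring weights, the -0.5 sentiment cutoff, and the function names are invented for the example, and the sentiment score is assumed to come from a separate NLP model.

```python
import re

PURCHASE_KEYWORDS = {"buy", "purchase"}


def priority(text, sentiment=0.0):
    """Score a conversation: higher scores are handled first.

    sentiment is assumed to range from -1 (very negative) to 1 (very
    positive) and to be supplied by some other sentiment model.
    """
    words = set(re.findall(r"[a-z]+", text.lower()))
    score = 0
    if words & PURCHASE_KEYWORDS:
        score += 2  # likely purchase intent
    if sentiment < -0.5:
        score += 1  # frustrated customer
    return score


def order_queue(conversations):
    """Sort by priority, keeping arrival order among equal scores."""
    return sorted(
        conversations,
        key=lambda c: priority(c["text"], c.get("sentiment", 0.0)),
        reverse=True,
    )


queue = [
    {"id": 1, "text": "I need to reset my password"},
    {"id": 2, "text": "I want to buy two tickets"},
    {"id": 3, "text": "This is the third time I'm asking", "sentiment": -0.8},
]

print([c["id"] for c in order_queue(queue)])
```

In this toy queue, the purchase conversation moves to the front, the frustrated customer comes next, and the password reset waits its turn.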

5. Get customers to the right place.

Chatbots can help route customers to the appropriate department, agent, or even another support bot for help. Much like a call routing system (but more sophisticated), a chatbot can identify a customer’s problem and save them from bouncing around between support agents.

The simplest example is when a bot greets customers and asks, “What can I help you with today?” The bot can either present the user with several options or let them state their problem. A customer can then be routed directly to the support agent best fit for solving their problem.

This also eliminates the need for customers to repeat themselves at each step of the way. Instead of having to explain their situation to the call router and then again to the service agent, the chatbot hands off the messages to the human agent. The agent already knows the problem and can start searching for a solution right away.

6. Reschedule appointments.

Appointment scheduling and rescheduling is a time-consuming and frustrating process. Chatbots can help you reduce delays, ensuring customers avoid back-and-forth emails and long hold times just to move an appointment.

With Quiq business messaging and the right integrations, a support chatbot can present customers with available dates and times. Customers choose and confirm a slot from the available calendar options, and the bot schedules the selected appointment.

7. Collect feedback for even more improvement.

Businesses shouldn’t underestimate the power of feedback. Believing you know what customers want and actually asking them can lead to completely different results. Yet, the biggest roadblock to collecting feedback is distributing the survey at the moment when it counts.

A support chatbot can ensure every customer service interaction is followed up with a survey. You can program the bot to send unique surveys based on the conversation and get specific feedback on the spot. Collecting that survey information and putting it into place will help your team improve.

Take the Leap with Quiq.

Implementing customer service chatbots within your organization may seem intimidating now, but Quiq can help you navigate it. We can help you orchestrate bots throughout your organization, whether you need one or many.

With Quiq, you can design conversational experiences your customers will love. Once you create a bot, you can run it across all of our supported channels to deliver a consistent experience no matter where your customers are.