Semi-Supervised Learning Explained (With Examples)

From movie recommendations to chatbots as customer service reps, it seems like machine learning (ML) is absolutely everywhere. But one thing you may not realize is just how much data is required to train these advanced systems, and how much time and energy goes into formatting that data appropriately.

Machine learning engineers have developed many ways of trying to cut down on this bottleneck, and one of the techniques that have emerged from these efforts is semi-supervised learning.

Today, we’re going to discuss semi-supervised learning, how it works, and where it’s being applied.

What is Semi-Supervised Learning?

Semi-supervised learning (SSL) is an approach to machine learning (ML) that is appropriate for tasks where you have a large amount of data that you want to learn from, only a fraction of which is labeled.

Semi-supervised learning sits somewhere between supervised and unsupervised learning, and we’ll start by understanding these techniques because that will make it easier to grasp how semi-supervised learning works.

Supervised learning refers to any ML setup in which a model learns from labeled data. It’s called “supervised” because the model is effectively being trained by showing it many examples of the right answer.

Suppose you’re trying to build a neural network that can take a picture of different plant species and classify them. If you give it a picture of a rose it’ll output the “rose” label, if you give it a fern it’ll output the “fern” label, and so on.

The way to start training such a network is to assemble many labeled images of each kind of plant you’re interested in. You’ll need dozens or hundreds of such images, and they’ll each need to be labeled by a human.

Then, you’ll assemble these into a dataset and train your model on it. The neural network will learn a function that maps features in the image (the concentrations of different colors, say, or the shape of the stems and leaves) to a label (“rose”, “fern”).

One drawback to this approach is that it can be slow and extremely expensive, both in funds and in time. You could probably put together a labeled dataset of a few hundred plant images in a weekend, but what if you’re training something more complex, where the stakes are higher? A model trained to spot breast cancer from a scan will need thousands of images, perhaps tens of thousands. And not just anyone can identify a cancerous lump; you’ll need a skilled human to look at each scan and label it “cancerous” or “non-cancerous.”

Unsupervised learning, by contrast, requires no such labeled data. Instead, an unsupervised machine learning algorithm is able to ingest data, analyze its underlying structure, and categorize data points according to this learned structure.

Okay, so what does this mean? A fairly common unsupervised learning task is clustering a corpus of documents thematically, and let’s say you want to do this with a bunch of different national anthems (hey, we’re not going to judge you for how you like to spend your afternoons!).

A good, basic algorithm for a task like this is the k-means algorithm, so-called because it will sort documents into k categories. K-means begins by randomly initializing k “centroids” (which you can think of as essentially being the center value for a given category), then moving these centroids around in an attempt to reduce the distance between the centroids and the values in the clusters.

This process will often involve a lot of fiddling. Since you don’t actually know the optimal number of clusters (remember that this is an unsupervised task), you might have to try several different values of k before you get results that are sensible.
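The k-means loop described above can be sketched in a few lines of plain Python. This is a minimal illustration rather than a production implementation: the function names, the toy data, and the fixed iteration count are all assumptions made for the example, and each "document" is reduced to a single made-up numeric feature.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iterations=20, seed=0):
    """Cluster points into k groups with a bare-bones k-means loop."""
    rng = random.Random(seed)
    # Start from k randomly chosen data points as the initial centroids.
    centroids = rng.sample(points, k)
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return centroids, assignments

# Hypothetical toy data: one numeric feature per document, two obvious groups.
points = [(1.0,), (1.5,), (2.0,), (10.0,), (10.5,), (11.0,)]
centroids, assignments = kmeans(points, k=2)
print(assignments)
```

Because you choose k yourself, trying several values simply means calling the function again with a different k and comparing the resulting clusters.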

To sort our national anthems into clusters you’ll have to first pre-process the text in various ways, then you’ll run it through the k-means clustering algorithm. Once that is done, you can start examining the clusters for themes. You might find that one cluster features words like “beauty”, “heart” and “mother”, another features words like “free” and “fight”, another features words like “guard” and “honor”, etc.

As with supervised learning, unsupervised learning has drawbacks. With a clustering task like the one just described, it might take a lot of work and multiple false starts to find a value of k that gives good results. And it’s not always obvious what the clusters actually mean. Sometimes there will be clear features that distinguish one cluster from another, but other times they won’t correspond to anything that’s easily interpretable from a human perspective.

Semi-supervised learning, by contrast, combines elements of both of these approaches. You start by training a model on the subset of your data that is labeled, then apply it to the larger unlabeled part of your data. In theory, this should simultaneously give you a powerful predictive model that is able to generalize to data it hasn’t seen before while saving you from the toil of creating thousands of your own labels.

How Does Semi-Supervised Learning Work?

We’ve covered a lot of ground, so let’s review. Two of the most common forms of machine learning are supervised learning and unsupervised learning. The former tends to require a lot of labeled data to produce a useful model, while the latter can soak up a lot of hours in tinkering and yield clusters that are hard to understand. Semi-supervised learning offers a middle path: by training a model on a labeled subset of data and then applying it to the unlabeled data, you can save yourself tremendous amounts of effort.

But what’s actually happening under the hood?

Three main variants of semi-supervised learning are self-training, co-training, and graph-based label propagation, and we’ll discuss each of these in turn.

Self-training

Self-training is the simplest kind of semi-supervised learning, and it works like this.

A small subset of your data will have labels while the rest won’t have any, so you’ll begin by using supervised learning to train a model on the labeled data. With this model, you’ll go over the unlabeled data to generate pseudo-labels, so-called because they are machine-generated and not human-generated.

Now, you have a new dataset; a fraction of it has human-generated labels while the rest contains machine-generated pseudo-labels, but all the data points now have some kind of label and a model can be trained on them.
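Here is a minimal sketch of those self-training steps. The "model" is a simple nearest-centroid classifier chosen only to keep the example short, and the plant measurements are hypothetical; any supervised model could stand in for it.

```python
def train_centroids(examples):
    """'Train' a nearest-centroid classifier: one mean feature vector per label."""
    sums, counts = {}, {}
    for features, label in examples:
        total = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            total[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {
        label: [v / counts[label] for v in total]
        for label, total in sums.items()
    }

def predict(model, features):
    """Return the label whose centroid is closest to the given features."""
    def distance(label):
        return sum((a - b) ** 2 for a, b in zip(model[label], features))
    return min(model, key=distance)

# Hypothetical data: (petal_count, leaf_width_cm) measurements.
labeled = [((30.0, 2.0), "rose"), ((28.0, 2.5), "rose"),
           ((0.0, 9.0), "fern"), ((0.0, 11.0), "fern")]
unlabeled = [(32.0, 1.8), (1.0, 10.0), (27.0, 2.2), (0.0, 8.5)]

# Step 1: supervised training on the small human-labeled subset.
model = train_centroids(labeled)
# Step 2: generate pseudo-labels for the unlabeled data.
pseudo_labeled = [(x, predict(model, x)) for x in unlabeled]
# Step 3: retrain on human labels plus machine-generated pseudo-labels.
final_model = train_centroids(labeled + pseudo_labeled)
```

In practice, implementations often keep only the pseudo-labels the model is most confident about and repeat the loop several times; this sketch does a single pass for clarity.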

Co-training

Co-training has the same basic flavor as self-training, but it has more moving parts. With co-training you’re going to train two models on the labeled data, each on a different set of features (in the literature these are called “views”).

If we’re still working on that plant classifier from before, one model might be trained on the number of leaves or petals, while another might be trained on their color.

At any rate, now you have a pair of models trained on different views of the labeled data. These models will then generate pseudo-labels for all the unlabeled data. When one of the models is very confident in its pseudo-label (i.e., when the probability it assigns to its prediction is very high), that pseudo-label will be used to update the prediction of the other model, and vice versa.

Let’s say both models come to an image of a rose. The first model thinks it’s a rose with 95% probability, while the other thinks it’s a tulip with a 68% probability. Since the first model seems really sure of itself, its label is used to change the label on the other model.

Think of it like studying a complex subject with a friend. Sometimes a given topic will make more sense to you, and you’ll have to explain it to your friend. Other times they’ll have a better handle on it, and you’ll have to learn from them.

In the end, you’ll both have made each other stronger, and you’ll get more done together than you would’ve done alone. Co-training attempts to utilize the same basic dynamic with ML models.
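The exchange step at the heart of co-training can be sketched as follows. This shows only the moment where the two models trade confident pseudo-labels; the threshold value, the function name, and the prediction format are assumptions made for illustration, and a full implementation would retrain each model on what it receives and repeat.

```python
CONFIDENCE_THRESHOLD = 0.9  # hypothetical cutoff for "very confident"

def exchange_pseudo_labels(preds_a, preds_b, threshold=CONFIDENCE_THRESHOLD):
    """Given each model's (label, probability) per item, return the new
    training examples that each model receives from the other."""
    new_for_a, new_for_b = [], []
    for item in preds_a:
        label_a, prob_a = preds_a[item]
        label_b, prob_b = preds_b[item]
        if prob_a >= threshold and prob_a > prob_b:
            new_for_b.append((item, label_a))  # model A teaches model B
        elif prob_b >= threshold and prob_b > prob_a:
            new_for_a.append((item, label_b))  # model B teaches model A
    return new_for_a, new_for_b

# Hypothetical predictions from the two views of the same unlabeled images.
preds_a = {"img_1": ("rose", 0.95), "img_2": ("fern", 0.60)}
preds_b = {"img_1": ("tulip", 0.68), "img_2": ("fern", 0.97)}

new_for_a, new_for_b = exchange_pseudo_labels(preds_a, preds_b)
print(new_for_a)  # examples model A will learn from model B
print(new_for_b)  # examples model B will learn from model A
```

Here the first image mirrors the rose example above: model A is 95% sure, so its “rose” label is handed to model B.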

Graph-based semi-supervised learning

Another way to apply labels to unlabeled data is by utilizing a graph data structure. A graph is a set of nodes (in graph theory we call them “vertices”) which are linked together through “edges.” The cities on a map would be vertices, and the highways linking them would be edges.

If you put your labeled and unlabeled data on a graph, you can propagate the labels throughout by counting the number of pathways from a given unlabeled node to the labeled nodes.

Imagine that we’ve got our fern and rose images in a graph, together with a bunch of other unlabeled plant images. We can choose one of those unlabeled nodes and count up how many ways we can reach all the “rose” nodes and all the “fern” nodes. If there are more paths to a rose node than a fern node, we classify the unlabeled node as a “rose”, and vice versa. This gives us a powerful alternative means by which to algorithmically generate labels for unlabeled data.
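The path-counting idea can be sketched on a tiny graph. The graph, the node names, and the choice to count only paths that never revisit a node are all assumptions made for this example; label propagation methods used in practice typically spread labels iteratively over weighted edges instead of enumerating paths.

```python
def count_simple_paths(graph, start, goal, visited=frozenset()):
    """Count paths from start to goal that never revisit a node."""
    if start == goal:
        return 1
    visited = visited | {start}
    return sum(
        count_simple_paths(graph, neighbor, goal, visited)
        for neighbor in graph[start]
        if neighbor not in visited
    )

def classify_by_paths(graph, labels, node):
    """Label `node` with whichever class it can reach by the most paths."""
    totals = {}
    for labeled_node, label in labels.items():
        paths = count_simple_paths(graph, node, labeled_node)
        totals[label] = totals.get(label, 0) + paths
    return max(totals, key=totals.get)

# Hypothetical undirected graph stored as an adjacency list.
graph = {
    "u": ["r1", "r2", "x"],   # the unlabeled image we want to classify
    "r1": ["u", "r2"],
    "r2": ["u", "r1"],
    "x": ["u", "f1"],         # another unlabeled image
    "f1": ["x"],
}
labels = {"r1": "rose", "r2": "rose", "f1": "fern"}

print(classify_by_paths(graph, labels, "u"))
```

Enumerating paths like this is only feasible on small graphs, which is one reason real systems rely on more scalable propagation schemes.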

Semi-Supervised Learning Examples

The amount of data in the world is increasing at a staggering rate, while the number of human-hours available for labeling it all is increasing at a much less impressive clip. This presents a problem because there’s no end to the places where we want to apply machine learning.

Semi-supervised learning presents a possible solution to this dilemma, and in the next few sections, we’ll describe semi-supervised learning examples in real life.

  • Identifying cases of fraud: In finance, semi-supervised learning can be used to train systems for identifying cases of fraud or extortion. Rather than hand-labeling thousands of individual instances, engineers can start with a few labeled examples and proceed with one of the semi-supervised learning approaches described above.
  • Classifying content on the web: The internet is a big place, and new websites are put up all the time. In order to serve useful search results it’s necessary to classify huge amounts of this web content, which can be done with semi-supervised learning.
  • Analyzing audio and images: This is perhaps the most popular use of semi-supervised learning. When audio files or image files are generated they’re often not labeled, which makes it difficult to use them for machine learning. Beginning with a small subset of human-labeled data, however, this problem can be overcome.

How Is Semi-Supervised Learning Different From…?

With all the different approaches to machine learning, it can be easy to confuse them. To make sure you fully understand semi-supervised learning, let’s take a moment to distinguish it from similar techniques.

Semi-Supervised Learning vs Self-Supervised Learning

With semi-supervised learning, you’re training a model on a subset of labeled data and then using this model to process the unlabeled data. Self-supervised learning is different: the algorithm is shown some fraction of the data (say, the first 80 words in a paragraph) and is then asked to predict the remainder (the other 20 words), so the data itself supplies the labels.

Self-supervised learning is how LLMs like GPT-4 are trained.
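To make the self-supervised setup concrete, here is a small sketch of how a training example might be carved out of raw text with no human labeling. The function name and the 80/20 split parameter are assumptions for illustration; real LLM pipelines work on tokens and predict one token at a time.

```python
def make_self_supervised_example(text, context_fraction=0.8):
    """Split text into (context, target): the model sees the context
    and is trained to predict the target."""
    words = text.split()
    cut = int(len(words) * context_fraction)
    return words[:cut], words[cut:]

paragraph = "the quick brown fox jumps over the lazy sleeping dog"
context, target = make_self_supervised_example(paragraph)
print(context)  # what the model is shown
print(target)   # what the model must predict
```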

Semi-Supervised Learning vs Reinforcement Learning

One interesting subcategory of ML we haven’t discussed yet is reinforcement learning (RL). RL involves leveraging the mathematics of sequential decision theory (usually a Markov Decision Process) to train an agent to interact with its environment in a dynamic, open-ended way.

It bears little resemblance to semi-supervised learning, and the two should not be confused.

Semi-Supervised Learning vs Active Learning

Active learning is closely related to semi-supervised learning, and the two are often combined. The big difference is that, with active learning, the algorithm sends the examples it is least confident about to a human for labeling or correction.
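The selection step in active learning might look something like the sketch below. The dictionary format, the function name, and the review budget are hypothetical; the point is simply that the least confident predictions are the ones routed to a person.

```python
def select_for_review(predictions, budget):
    """Return the `budget` predictions with the lowest confidence,
    which are the ones a human annotator should check first."""
    ranked = sorted(predictions, key=lambda p: p["confidence"])
    return ranked[:budget]

# Hypothetical pseudo-labels with the model's confidence in each.
predictions = [
    {"item": "img_1", "label": "rose", "confidence": 0.98},
    {"item": "img_2", "label": "fern", "confidence": 0.55},
    {"item": "img_3", "label": "rose", "confidence": 0.91},
    {"item": "img_4", "label": "fern", "confidence": 0.62},
]

to_review = select_for_review(predictions, budget=2)
print([p["item"] for p in to_review])
```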

When Should You Use Semi-Supervised Learning?

Semi-supervised learning is a way of training ML models when you only have a small amount of labeled data. By training the model on just the labeled subset of data and using it in a clever way to label the rest, you can avoid the difficulty of having a human being label everything.

There are many situations in which semi-supervised learning can help you make use of more of your data. That’s why it has found widespread use in domains as diverse as document classification, fraud detection, and image identification.

If you’re considering ways of using advanced AI systems to take your business to the next level, check out our generative AI resource hub to go even deeper. This technology is changing everything, and if you don’t want to be left behind, set up a time to talk with us.

How Large Language Models Have Evolved

It seems as though large language models (LLMs) exploded into public awareness almost overnight. Relatively few people had heard of GPT-2, but I would venture to guess relatively few people haven’t heard of ChatGPT.

But like most things, language models have a history. And, in addition to being outrageously interesting, that history can help us reason about the progress in LLMs, as well as their likely future impacts.

Let’s get started!

A Brief History of Artificial Intelligence Development

The human fascination with building artificial beings capable of thought and action goes back a long way. Writing in roughly the 8th century BCE, Homer recounts tales of the god Hephaestus outsourcing repetitive manual tasks to automated bellows and working alongside robot-like “attendants” that were “…golden, and in appearance like living young women.”

No mere adornments, these handmaidens were described as having “intelligence in their hearts” and stirring “nimbly in support of their master” because “from the immortal gods they have learned how to do things.”

Some 500 years later, mathematicians in Alexandria would produce treatises on creating mechanical servants and various kinds of automata. Heron wrote a technical manual for producing a mechanical shrine and an automated theatre whose figurines could be activated to stage a full tragic play through an intricate system of cords and axles.

Nor is it only ancient Greece that tells similar tales. Jewish legends speak of the Golem, a being made of clay and imbued with life and agency through the use of language. The word “abracadabra”, in fact, is often said to come from the Aramaic phrase avra k’davra, which translates to “I create as I speak.”

Through the ages, these old ideas have found new expression in stories such as “The Sorcerer’s Apprentice”, Mary Shelley’s “Frankenstein”, and Karel Čapek’s “R.U.R.”, a science fiction play that features the first recorded use of the word “robot”.

From Science Fiction to Science Fact

But such beings remained purely fictional until the 20th century, when advances in the theory of computation, as well as the development of primitive computers, began to offer a path toward actually building intelligent systems.

Arguably, the field of artificial intelligence really began in earnest with the 1950 publication of Alan Turing’s “Computing Machinery and Intelligence” – in which he proposed the famous “Turing test” – and with the 1956 Dartmouth conference on AI, organized by luminaries John McCarthy and Marvin Minsky.

People began taking AI seriously. Over the next ~50 years, there were numerous periods of hype and exuberance in which major advances were made, as well as long stretches, known as “AI winters”, in which funding dried up and little was accomplished.

Neural networks and the deep learning revolution are two advances that are particularly important for understanding how large language models have evolved over time, so it’s to these that we now turn.

Neural Networks And The Deep Learning Revolution

The groundwork for future LLM systems was laid by Walter Pitts and Warren McCulloch in the early 1940s. Inspired by the burgeoning study of the human brain, they wondered if it would be possible to build an artificial neuron that had the same basic properties as a biological one, i.e. it would activate and fire once a certain critical threshold had been crossed.

They were successful, though several other breakthroughs would be required before artificial neurons could be arranged into systems that were capable of doing useful work. One such breakthrough was backpropagation, the basic algorithm that is still used to train deep learning systems. Backpropagation was developed in 1960, and it uses the errors in a model’s outputs to iteratively adjust its internal parameters.

It wasn’t until 1985, however, that David Rumelhart, Ronald Williams, and Geoff Hinton used backpropagation in neural networks, and in 1989, this allowed Yann LeCun to train a convolutional neural network to recognize handwritten digits.

This was not the only architectural improvement that came out of this period. Especially noteworthy were the long short-term memory (LSTM) networks that were introduced in 1997 by Sepp Hochreiter and Jürgen Schmidhuber, which made it possible to learn more complex functions.

With these advances, it was clear that neural networks could be trained to do useful work, and that they were poised to do so. All that was left was to gather the missing piece: data.

The Big Data Era

Neural networks and deep-learning applications tend to be extremely data-hungry, and access to quality training data has always been a major bottleneck. In 2009 Stanford’s Fei-Fei Li sought to change this by releasing ImageNet, a database of over 14 million labeled images that could be used for free by researchers. The increase in available data, together with substantial improvements in computer hardware like graphics processing units (GPUs), meant that at long last the promise of deep learning could begin to be fulfilled.

And it was. In 2011 IBM’s Watson system beat several Jeopardy! all-stars in a real game, and Apple launched Siri. In 2012 a convolutional neural network called “AlexNet” won the ImageNet image recognition competition by a wide margin. Amazon’s Alexa followed in 2014, and from 2015 to 2017 DeepMind’s AlphaGo shocked the world by utterly dominating the best human Go players.

Substantial strides were made in language models. In 2018 Google introduced its Bidirectional Encoder Representations from Transformers (BERT), a pre-trained model capable of a wide array of tasks, like text summarization, translation, and sentiment analysis.

One Model To Rule Them All

It would be easy to miss the significance of AlexNet’s performance on the ImageNet competition or BERT’s usefulness across multiple tasks. For a long time, it was anyone’s guess as to whether it would be possible to train a single large model on a dataset and use it for a range of purposes, or whether it would be necessary to train a multitude of models for each application.

Since the early 2010s, it has become clear that large, general-purpose models are often the best way to go. This point has only been reinforced by the success of GPT-4 in everything from brainstorming scientific hypotheses to handling customer service tasks.

How Has Large Language Model Performance Improved?

Now that we’ve discussed this history, we’re well-placed to understand why LLMs and generative AI have ignited so much controversy. People have been mulling over the promise (and peril) of thinking machines for literally thousands of years. After all that time it looks like they might be here, at long last.

But what, exactly, has people so excited? What is it that advanced AI tools are doing that has captured the popular imagination? In the following sections, we’ll talk about the astonishing (and astonishingly rapid) improvements that have been seen in language models in just a few short years.

Getting To Human-Level

One of the more surprising things about LLMs such as ChatGPT is just how good they are at so many different things. LLMs are trained with a technique known as “self-supervised learning”. They take random samples of the text data they’re given, and they try to predict what words come next given the words that came before.

Suppose the model sees the famous opening line of Leo Tolstoy’s Anna Karenina: “Happy families are all alike; every unhappy family is unhappy in its own way.” What the model is trying to do is learn a function that will allow it to predict “in its own way” from “Happy families are all alike; every unhappy family is unhappy ___”.

The modern crop of LLMs can do this incredibly well, but what is remarkable is just how far this gets you. People are using generative AI to help them write poems, business plans, and code, create recipes based on the ingredients in their fridges, and answer customer questions.

Emergence in Language Models

Perhaps even more interesting, however, is the phenomenon of “emergence” in language models. When researchers tested LLMs on a wide variety of tasks meant to be especially challenging to these models – things like identifying a movie given a string of emojis or finding legal chess moves – they found that in about 5% of tasks, there is a sudden, sharp increase in ability on a given task once a model reaches a certain size.

At present, it’s not really clear how we should think about emergence. One hypothesis for emergence is that a big enough model is able to learn some general piece of knowledge not attainable by a smaller cousin, while another, more prosaic one is that it’s a relatively straightforward consequence of the model’s internal statistical machinery.

What’s more, it’s difficult to pin down the conditions required for emergence in language models. Though it generally appears to be a function of model size, there are cases in which the same abilities can be achieved with smaller models, or with models trained on very high-quality data, and emergence shows up at different scales for different models and tasks.

Whatever ends up being the case, it’s clear that this is a promising direction for future research. Much more work needs to be done to understand precisely how LLMs accomplish what they accomplish. This work will not only bear on the question of emergence; it will also inform the ongoing efforts to make language models safer and less biased.

The GPT Series

The big recent news in AI has, of course, been ChatGPT. ChatGPT has proven useful in an astonishingly wide variety of use cases and is among the first powerful systems to have been made widely available to the public.

ChatGPT is part of a broader series of GPT models built by OpenAI. “GPT” stands for “generative pre-trained transformer”, and the first of its kind was developed back in 2018. New models and major updates have been released at a rapid clip ever since, culminating with GPT-4 coming out in March of 2023.

OpenAI’s CEO Sam Altman has said that there are no current plans to train a successor GPT-5 model, but there are other companies, like DeepMind, that could plausibly build a competitor.

What’s Next For Large Language Models?

Given their flexibility and power, LLMs are finding use across a wide variety of industries, from software engineering to medicine to customer service.

If your interest has been piqued and you’d like to talk to an expert at Quiq about incorporating LLMs into your business, reach out to us to schedule a demo!

Are Generative AI And Large Language Models The Same Thing?

The release of ChatGPT was one of the first times an extremely powerful AI system was broadly available, and it has ignited a firestorm of controversy and conversation.

Proponents believe current and future AI tools will revolutionize productivity in almost every domain.

Skeptics wonder whether advanced systems like GPT-4 will even end up being all that useful.

And a third group believes they’re the first sparks of artificial general intelligence and could be as transformative for life on Earth as the emergence of Homo sapiens.

Frankly, it’s enough to make a person’s head spin. One of the difficulties in making sense of this rapidly evolving space is the fact that many terms, like “generative AI” and “large language models” (LLMs), are thrown around very casually.

In this piece, our goal is to disambiguate these two terms by discussing the differences between generative AI and large language models. Whether you’re pondering deep questions about the nature of machine intelligence, or just trying to decide whether the time is right to use conversational AI in customer-facing applications, this context will help.

Let’s get going!

What Is Generative AI?

Of the two terms, “generative AI” is broader, referring to any machine learning model capable of dynamically creating output after it has been trained.

This ability to generate complex forms of output, like sonnets or code, is what distinguishes generative AI from linear regression, k-means clustering, or other types of machine learning.

Besides being much simpler, these models can only “generate” output in the sense that they can make a prediction on a new data point.

Once a linear regression model has been trained to predict test scores based on number of hours studied, for example, it can generate a new prediction when you feed it the hours a new student spent studying.

But you couldn’t use prompt engineering to have it help you brainstorm the way these two values are connected, which you can do with ChatGPT.
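The linear regression example above can be sketched with ordinary least squares in plain Python. The hours and scores below are made-up numbers chosen to fall on a straight line, and the function name is just for illustration.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical training data: hours studied and the resulting test scores.
hours = [1.0, 2.0, 3.0, 4.0]
scores = [52.0, 61.0, 70.0, 79.0]

slope, intercept = fit_line(hours, scores)

# "Generating" output here just means predicting for a new student.
predicted = slope * 5.0 + intercept
print(predicted)
```

The model’s entire output is one number per input, which is the sense in which its ability to “generate” is limited.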

There are many types of generative AI, so let’s spend a few minutes discussing the major categories: image generation, music generation, code generation, and a few others.

How Is Generative AI Used To Make Images?

One of the first “wow” moments in generative AI came fairly recently, when it was discovered that tools like Midjourney, DALL-E, and Stable Diffusion could create absolutely stunning images based on simple prompts like:

“Old man in a book store, ambient dappled sunlight, sedate, calm, close-up portrait.”

Depending on the wording you use, these images might be whimsical and futuristic, they might look like paintings from world-class artists, or they might look so photo-realistic you’d be convinced they’re about to start talking.

Created using DALL-E

Each of these tools is suited to specific applications. Midjourney seems to be best at capturing different artistic approaches and generating images that accurately capture an aesthetic. DALL-E tends to do better at depicting human figures, including faces and eyes. Stable Diffusion seems to do well at generating highly-detailed outputs, capturing subtleties like the way light reflects on a rain-soaked street.

(Note: these are all general impressions; it’s difficult to know how the tools will compare on any specific prompt.)

Broadly, this is known as “image synthesis”. And since we’re talking specifically about making images from text, this sub-domain is known as “text-to-image.”

A variant of this technique is text-to-video (alternatively: “text-to-4d”), which produces short clips or scenes based on text prompts. While text-to-video is still much more primitive than text-to-image, it will get better very quickly if recent progress in AI is any guide.

One interesting wrinkle in this story is that generative algorithms have generated something else along with images and animations: legal battles.

Earlier this year, Getty Images filed a lawsuit against the creators of Stable Diffusion, alleging that they trained their algorithm on millions of images from the Getty collection without getting permission first or compensating Getty in any way.

This has raised many profound questions about data rights, privacy, and how (or whether) people should be paid when their work is used to train a model that might eventually automate them out of a job.

We’re still in the early days of grappling with these issues, but they’re sure to make for fascinating case law in the years ahead.

How Is Generative AI Used To Make Music?

Given how successful advanced models have been in generating text (more on that shortly), it’s only natural to wonder whether similar models could also prove useful in generating music.

This is especially true because, on the surface, text and music share many obvious similarities (both are sequential, for example). It would make sense, therefore, that the technical advances that have allowed coherent text production might also allow for coherent music production.

And they have! There are now a number of different tools, such as MusicLM, which are able to generate fairly high-quality audio tracks from prompts like:

“The main soundtrack of an arcade game. It is fast-paced and upbeat, with a catchy electric guitar riff. The music is repetitive and easy to remember, but with unexpected sounds, like cymbal crashes or drum rolls.”

As with using generative AI in images, creating artificial musical tracks in the style of popular artists has already sparked legal controversies. A particularly memorable example occurred just recently when a TikTok user supposedly created an AI-generated collaboration between Drake and The Weeknd, which then promptly went viral.

The track was removed from all major streaming services in response to backlash from artists and record labels, but it’s clear that this technology is going to change the way art is created in a major way.

How Is Generative AI Used For Coding?

It’s long been the dream of both programmers and non-programmers to simply be able to provide a computer with natural-language instructions (“build me a cool website”) and have the machine handle the rest. It would be hard to overstate the explosion in creativity and productivity this would initiate.

With the advent of code-generation models such as Replit’s Ghostwriter and GitHub Copilot, we’ve taken one more step towards that halcyon world.

As is the case with other generative models, code-generation tools are usually trained on massive amounts of data, after which point they’re able to take simple prompts and produce code from them.

You might ask one of these tools to write a function that converts between several different coordinate systems, create a web app that measures BMI, or translate code from Python to JavaScript.

As things stand now, the code is often incomplete in small ways. A tool might produce a function that takes an argument that is never used, for example, or one that lacks a return statement. Still, it is remarkable what has already been accomplished.

There are now software developers who are using models like ChatGPT all day long to automate substantial portions of their work, to understand new codebases with which they’re unfamiliar, or to write comments and unit tests.

What Are Large Language Models?

Now that we’ve covered generative AI, let’s turn our attention to large language models (LLMs).

LLMs are a particular type of generative AI.

Unlike MusicLM or DALL-E, LLMs are trained on textual data and then used to output new text, whether that be a sales email or an ongoing dialogue with a customer.

(A technical note: though people are mostly using GPT-4 for text generation, it is an example of a “multimodal” LLM because it has also been trained on images. According to OpenAI’s documentation, image input functionality is currently being tested, and is expected to roll out to the broader public soon.)

What Are Examples of Large Language Models?

By far the most well-known example of an LLM is OpenAI’s “GPT” series, the latest of which is GPT-4. The acronym “GPT” stands for “Generative Pre-Trained Transformer”, and it hints at many underlying details about the model.

GPT models are based on the transformer architecture, for example, and they are pre-trained on a huge corpus of textual data taken predominately from the internet.

GPT, however, is not the only example of an LLM.

The BigScience Large Open-science Open-access Multilingual Language Model – known more commonly by its mercifully short nickname, “BLOOM” – was built by more than 1,000 AI researchers as an open-source alternative to GPT.

BLOOM is capable of generating text in almost 50 natural languages, and more than a dozen programming languages. Being open-sourced means that its code is freely available, and no doubt there will be many who experiment with it in the future.

In March, Google announced Bard, a generative language model built atop its Language Model for Dialogue Applications (LaMDA) transformer technology.

As with ChatGPT, Bard is able to work across a wide variety of different domains, offering help with planning baby showers, explaining scientific concepts to children, or helping you make lunch based on what you already have in your fridge.

How Are Large Language Models Trained?

A full discussion of how large language models are trained is beyond the scope of this piece, but it’s easy enough to get a high-level view of the process. In essence, an LLM like GPT-4 is fed a huge amount of textual data from the internet. It then samples this dataset and learns to predict what words will follow given what words it has already seen.

At first, its performance will be terrible, but over time it will learn that a sentence like “I sat down on the _____” probably ends with a word like “floor” or “chair”, and probably not a word like “cactus” (at least, we hope you’re not sitting down on a cactus!)

When a model has been trained for long enough on a large enough dataset, you get the remarkable performance seen with tools like ChatGPT.
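
To make the idea of "predicting the next word" more concrete, here is a deliberately tiny sketch. It is not how GPT-4 works internally (real LLMs use transformer neural networks and far more data); it simply counts which word follows which in a made-up corpus and predicts the most common follower. The corpus and the function names are assumptions for illustration.

```python
from collections import Counter, defaultdict

def train_bigram_model(sentences):
    """Count, for every word, which words follow it in the training text."""
    following = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current_word, next_word in zip(words, words[1:]):
            following[current_word][next_word] += 1
    return following

def predict_next(model, word):
    """Return the most frequently seen follower of `word`, or None if unseen."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Tiny made-up corpus, for illustration only.
corpus = [
    "i sat down on the chair",
    "she sat down on the chair",
    "he sat down on the floor",
]
model = train_bigram_model(corpus)
print(predict_next(model, "the"))  # "chair" follows "the" most often here
```

Even at this toy scale the pattern is the same: the more text the model sees, the better its guesses about what comes next.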

Is ChatGPT A Large Language Model?

Speaking of ChatGPT, you might be wondering whether it’s a large language model. ChatGPT is a special-purpose application that was originally built on top of a model in OpenAI’s GPT-3.5 series, which is a large language model. That model was fine-tuned to be especially good at conversational dialogue, and the result is ChatGPT.

Are All Large Language Models Generative AI?

Yes. To the best of our knowledge, all existing large language models are generative AI. “Generative AI” is an umbrella term for algorithms that generate novel output, and the current set of models is built for that purpose.

Utilizing Generative AI In Your Business

Though truly powerful generative AI models are less than a year old, they’re already being integrated into numerous business applications. Quiq Compose, for example, is able to study past interactions with customers to better tailor its future conversations to their particular needs.

From generating fake viral rap songs to generating photos that are hard to distinguish from real life, these powerful tools have already proven that they can dramatically speed up marketing, software development, and many other crucial business functions.

If you’re an enterprise wondering how you can use advanced AI technologies for applications like customer service, schedule a demo to see what the Quiq platform can offer you!

Prompt Engineering: What Is It—And How Can You Use It To Get The Most Out Of AI?

Think back to your school days. You come into class only to discover a timed writing assignment on the agenda. You have to respond to the provided prompt quickly and accurately, and you will be graded against criteria like grammar, vocabulary, factual accuracy, and more.

Well, that’s what natural language processing (NLP) software like ChatGPT does daily. Except, when a computer steps into the classroom, it can’t raise its hand to ask questions.

That’s why it’s so important to provide AI with a prompt that’s clear and thorough enough to produce the best possible response.

What is prompt engineering?

A prompt can be a question, a phrase, or several paragraphs. The more specific the prompt is, the better the response.

Writing the perfect prompt — prompt engineering — is critical to ensure the NLP response is not only factually correct but crafted exactly as you intended to best deliver information to a specific target audience.

You can’t use low-quality ingredients in the kitchen to produce gourmet cuisine — and you can’t expect AI to, either.

Let’s revisit your old classroom again: did you ever have a teacher provide a prompt where you just weren’t really sure what the question was asking? So, you guessed a response based on the information provided, only to receive a low score.

In the post-exam review, the teacher explained what she was actually looking for and how the question was graded. You sat there thinking, “If I’d only had that information when I was given the prompt!”

Well, AI feels your pain.

The responses that NLP software provides are only as good as the input data. Learning how to communicate with AI to get it to generate desired responses is a science, and you can learn what works best through trial and error to continuously optimize your prompts.

Prompts that fail to deliver, and why.

What’s the root of the issue of prompt engineering gone wrong? It all comes down to incomplete, inconsistent, or incorrect data.

Even the most advanced AI using neural networks and deep learning techniques still needs to be fed the right information in the right way. When there is too little context provided, not enough examples, conflicting information from different sources, or major typos in the prompt, the AI can generate responses that are undesirable or just plain wrong.

How to craft the perfect prompt.

Here are some important factors to take into consideration for successful prompt engineering.

Clear instructions

Provide specific instructions and multiple examples to illustrate precisely what you want the AI to do. Words like “something,” “things,” “kind of,” and “it” (especially when there are multiple subjects within one sentence) can be indicators that your prompt is too vague.

Try to use descriptive nouns that refer to the subject of your sentence and avoid ambiguity.

  • Example (ambiguity): “She put the book on the desk; it was blue.”
  • What does “it” refer to in this sentence? Is the book blue, or is the desk blue?
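
As a rough sketch of how you might check a prompt for the warning signs listed above, the snippet below flags the vague words from that list when they appear as whole words. The word list comes from this section; the function name is hypothetical, and a match only means the prompt deserves a second look, not that it is wrong.

```python
import re

# Words and phrases called out above as possible signs of vagueness.
VAGUE_TERMS = ["something", "things", "kind of", "it"]

def find_vague_terms(prompt):
    """Return the vague terms that appear in `prompt` as whole words."""
    lowered = prompt.lower()
    found = []
    for term in VAGUE_TERMS:
        # \b keeps "it" from matching inside words such as "item".
        if re.search(r"\b" + re.escape(term) + r"\b", lowered):
            found.append(term)
    return found

print(find_vague_terms("She put the book on the desk; it was blue."))
```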

Simple language

Use plain language, but avoid shorthand and slang. When in doubt, err on the side of overcommunicating; you can use trial and error later to determine which shorthand approaches work for future, similar prompts. Avoid internal company or industry-specific jargon when possible, and be sure to clearly define any terms you may want to integrate.

Quality data

Give examples. Providing a single source of truth — for example, an article you want the AI to respond to questions about — will have a higher probability of returning factually correct responses based on the provided article.

On that note, teach the AI how you want it to return responses when it doesn’t know the answer, such as “I don’t know,” “not enough information,” or simply “?”.

Otherwise, the AI may get creative and try to come up with an answer that sounds good but has no basis in reality.

Persona

Develop a persona for your responses. Should the response sound as though it’s being delivered by a subject matter expert or would it be better (legally or otherwise) if the response was written by someone who was only referring to subject matter experts (SMEs)?

  • Example (direct from SMEs): “Our team of specialists…”
  • Example (referring to SMEs): “Based on recent research by experts in the field…”

Voice, style, and tone

Decide how you want to represent your brand’s voice, which will largely be determined by your target audience. Would your customer be more likely to trust information that sounds like it was provided by an academic, or would a colloquial voice be more relatable?

Do you want a matter-of-fact, encyclopedia-type response, a friendly or supportive empathetic approach, or is your brand’s style more quick-witted and edgy?

With the right prompt, AI can capture all that and more.
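
The factors above can be pulled together in a simple prompt template. The sketch below is one possible layout, not a required format: the function name, the parameter names, and the example text are all assumptions. It only assembles the prompt string; sending it to a model is left out.

```python
def build_prompt(instructions, article, question, persona, tone,
                 fallback="I don't know"):
    """Assemble one prompt string from the factors discussed above."""
    sections = [
        # Persona plus voice, style, and tone.
        f"You are {persona}. Respond in a {tone} tone.",
        # Clear, specific instructions.
        f"Instructions: {instructions}",
        # Single source of truth the answer should be based on.
        f"Answer using only the article below.\nArticle:\n{article}",
        # What to say when the article does not contain the answer.
        f'If the article does not contain the answer, reply exactly: "{fallback}"',
        f"Question: {question}",
    ]
    return "\n\n".join(sections)

prompt = build_prompt(
    instructions="Answer in two sentences or fewer.",
    article="Returns are accepted within 30 days of delivery.",
    question="How long do I have to return an item?",
    persona="a customer support specialist",
    tone="friendly",
)
print(prompt)
```

Keeping each factor in its own section makes it easier to change one thing at a time when you are refining a prompt through trial and error.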

Quiq takes prompt engineering out of the equation.

Prompt engineering is no easy task. There are many nuances to language that can trick even the most advanced NLP software.

Not only are incorrect AI responses a pain to identify and troubleshoot, but they can also hurt your business’s reputation if they aren’t caught before your content goes public.

On the other hand, manual tasks that could be automated with NLP waste time and money that could be allocated to higher-priority initiatives.

Quiq uses large language models (LLMs) to continuously optimize AI responses to your company’s unique data. With Quiq’s world-class Conversational AI platform, you can reduce the burden on your support team, lower costs, and boost customer satisfaction.

Contact Quiq today to see how our innovative LLM-built features improve business outcomes.

The Rise of Conversational AI: Why Businesses Are Embracing It

Movies may have twisted our expectations of artificial intelligence—either giving us extremely high expectations or making us think it’s ready to wipe out humanity.

But the reality isn’t on those levels. In fact, you’re already using AI in your daily life—but it’s so ingrained in your technology you probably don’t even notice. Netflix and Spotify both use AI to personalize your content recommendations. Siri, Alexa, and Google Assistant use it as well.

Conversational AI, like what Quiq uses to power our chatbots, takes artificial intelligence to the next level. See what it is and how you can use it in your business.

What is conversational AI?

Conversational artificial intelligence (AI) is a collection of technologies that create a human-like conversational experience. It combines natural language processing (NLP), machine learning, and other technologies to make conversations more natural and efficient. It can be used in many applications, like chatbots and voice assistants (such as Siri and Alexa). The most common use case for conversational AI in the business-to-customer world is an AI chatbot messaging experience.

Unlike rule-based chatbots, those powered by conversational AI generate responses and adapt to user behavior over time. Rule-based chatbots are also limited to what you put in them, meaning that if someone phrases a question differently than you wrote it (or uses slang or colloquialisms), the chatbot won’t understand the question. Conversational AI can also help chatbots understand more complex questions.

Putting technical terms in context.

Companies throw around a lot of technical terms when it comes to artificial intelligence, so here is what they mean and how they’re used to improve your business.

Rules-based chatbots: Earlier chatbot iterations (and some current low-cost versions) work mainly through pre-defined rules. Your business (or service provider) writes specific guidelines for the chatbot to follow. For example, when a customer says “Hi,” the chatbot responds, “Hello, how may I help you?”

Another example is when a customer asks about a return. The chatbot is programmed to give a specific response, like, “Here’s a link to the return policy.”

However, the problem with rule-based chatbots is that they can be limiting. They only know how to handle situations based on the information programmed into them. So if someone says, “I don’t like this product, what can I do?” and you haven’t planned for that question, the chatbot won’t have a response.
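
Here is a minimal sketch of what a rule-based chatbot looks like in code, using the two rules from the examples above. The rule table and function name are hypothetical. Notice that the unplanned question simply gets no reply.

```python
import re

# Hypothetical rules: each maps a trigger word to a canned reply.
RULES = {
    "hi": "Hello, how may I help you?",
    "return": "Here's a link to the return policy.",
}

def rule_based_reply(message):
    """Return the reply for the first rule whose trigger is a word in the message."""
    words = re.findall(r"[a-z']+", message.lower())
    for trigger, reply in RULES.items():
        if trigger in words:
            return reply
    return None  # No rule matched, so the bot has nothing to say.

print(rule_based_reply("Hi"))
print(rule_based_reply("I don't like this product, what can I do?"))
```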

Machine learning: Machine learning is a way to combat the problem posed above. Instead of giving the chatbot specific parameters complete with pre-written questions and answers, machine learning helps chatbots make decisions based on the information provided.

Machine learning helps chatbots adapt over time based on customer conversations. Instead of giving the bot specific ways to answer specific questions, you show it the basic rules, and it crafts its own response. Plus, since it means your chatbot is always learning, it gets better the longer you use it.

Natural language processing: As humans and speakers of the English language, we know that there are different ways to ask every question. For example, a customer who wants to know when an item is back in stock may ask, “When is X back in stock?” or they might say, “When will you get X back in?” or even, “When are you restocking X?” Those three questions all mean the same thing, and as humans, we naturally understand that. But a rules-based bot must be told that those mean the same thing, or it might not understand them.

Natural language processing (NLP) uses AI technology to help chatbots understand that those questions are all asking the same thing. It also can determine what information it needs to answer your question, like color, size, etc.

NLP also helps chatbots answer questions in a more human-like way. If you want your chatbot to sound more human (and you should), then find one that uses NLP.
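
To give a feel for how software can treat differently worded questions as the same request, here is a small sketch that matches a message to the closest labeled example by word overlap (Jaccard similarity). This is a simple stand-in and an assumption on our part: production NLP relies on trained language models, and the example phrases, intent names, and threshold below are made up.

```python
import re

# Hypothetical labeled examples; a real system would learn from many more.
EXAMPLES = {
    "restock_date": ["when is x back in stock", "when are you restocking x"],
    "return_policy": ["how do i return an item", "what is your return policy"],
}

def tokenize(text):
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))

def jaccard(a, b):
    """Share of words two sets have in common (0.0 to 1.0)."""
    return len(a & b) / len(a | b) if a | b else 0.0

def detect_intent(message, min_score=0.2):
    """Return the intent of the most similar example, or None if nothing is close."""
    tokens = tokenize(message)
    best_intent, best_score = None, 0.0
    for intent, phrases in EXAMPLES.items():
        for phrase in phrases:
            score = jaccard(tokens, tokenize(phrase))
            if score > best_score:
                best_intent, best_score = intent, score
    return best_intent if best_score >= min_score else None

questions = [
    "When is X back in stock?",
    "When will you get X back in?",
    "When are you restocking X?",
]
print([detect_intent(q) for q in questions])
```

The second question is not in the example list, yet it still lands on the same intent because it shares most of its words with a known phrasing.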

Web-based SDK: A web-based SDK (for non-developers, that’s a software development kit) is a set of tools and resources developers use to integrate programs (in this case, chatbots) into websites and web-based applications.

What does this mean for your chatbot? Context. When a user says, “I need help with my order,” the chatbot can use NLP to identify “help” and “order.” Then it can look back at previous conversations, pull the customer’s order history, and more, if the data is there.

Contextual conversations are everything in customer service—so this is a big factor in building a successful chatbot using conversational AI. In fact, 70% of customers expect anyone they’re speaking with to have the full context. With a web-based SDK, your chatbot can do that too.

The benefits of conversational AI.

Using chatbots with conversational AI provides benefits across your business, but the clearest wins are in your contact center. Here are three ways chatbots improve your customer service.

24/7 customer support.

Your customer service agents need to sleep, but your conversational AI chatbot doesn’t. A chatbot can answer questions and contain customer issues while your contact center is closed. Any issues it can’t solve, it can pass along to your agents the next day. Not only does that give your customers 24/7 service, but your agents will have less of a backlog when they return to work.

Faster response times.

When your agents are inundated with customers, an AI chatbot can pick up the slack. Send your chatbot in to greet customers immediately, let them know the wait time, or even start collecting information so your agents can get to the root of the problem faster. Chatbots powered with AI can also answer questions and solve easy customer issues, skipping human agents altogether.

More present customer service agents.

Chatbots can handle low-level customer queries and give agents the time and space to handle more complex issues. Not only will this result in better customer service, but agents will be happier and less stressed overall.

Plus, chatbots can scale during your busy seasons. You’ll save on costs since you won’t have to hire more agents, and the agents you have won’t be overworked.

How to make the most of AI technology.

Unfortunately, you can’t just plug and play with conversational AI and expect to become an AI company. Just like any other technology, it takes prep work and thoughtful implementation to get it right—plus lots of iterations.

Use these tips to make the most of AI technology:

Decide on your AI goals.

How are you planning on using conversational AI? Will it be for marketing? Customer service? All of the above? Think about what your main goals are and use that information to select the right AI partner.

Choose the right conversational AI platform.

Once you’ve decided on how you want to use conversational AI, select the right partner to help you get there. Think about aspects like ease of use, customization, scalability, and budget.

Design your chatbot interactions.

Even with artificial intelligence, you still have to put the work in upfront. What you do and how you do it will vary greatly depending on which platform you go with. Design your chatbot conversations with these things in mind:

  • Your brand voice
  • Personalization
  • Customer service best practices
  • Logical conversation flows
  • Concise messages

Build a partnership between agents and chatbots.

Don’t launch the chatbot independently of your customer service agents. Include them in the training and launch, and start to build a working relationship between the two. Agents and chatbots can work together on customer issues, both popping in and out of the conversation seamlessly. For example, a chatbot can collect information from the customer upfront and pass it to the agent to solve the issue. Then, when the agent is done, they can bring the chatbot back in to deliver a customer survey.

Test and refine.

Sometimes, you don’t know what you don’t know until it happens. Test your chatbot before it launches, but don’t stop there. Keep refining your conversations even after you’ve launched.

What does the future hold for conversational AI?

There are many exciting things happening in AI right now, and we’re only on the cusp of delving into what it can really do.

The big prediction? For now, conversational AI will keep getting better at what it’s already doing. More human-like interactions, better problem-solving, and more in-depth analysis.

In fact, 75% of customers believe AI will become more natural and human-like over time. Gartner is also predicting big things for conversational AI, saying by 2026, conversational AI deployments within contact centers will reduce agent labor costs by $80 billion.

Why should you jump in now when bigger things are coming? It’s simple. You’ll learn to master conversational AI tools ahead of your competitors and earn an early competitive advantage.

How Quiq does conversational AI.

To ensure you give your customers the best experience, Quiq powers our entire platform with conversational AI. Here are a few stand-out ways Quiq uniquely improves your customer service with conversational AI.

Design customized chatbot conversations.

Create chatbot conversations so smooth and intuitive that it feels like you’re talking to a real person. Using the best conversational AI techniques, Quiq’s chatbot gives customers quick and intelligent responses for an up-leveled customer experience.

Help your agents respond to customers faster.

Make your agents more efficient with Quiq Compose. Quiq Compose uses conversational AI to suggest responses to customer questions. How? It uses information from similar conversations in the past to craft the best response.

Empower agent performance.

Tools like our Adaptive Response Timer (ART) prioritize conversations based on how fast or slow customers respond. The conversational AI platform also uses AI to analyze customer sentiment to give extra attention to customers who need it.

This is just the beginning.

This is just a taste of what conversational AI can do. See how Quiq can apply the latest technology to your contact center to help you deliver exceptional customer service.

Customer Service in the Travel Industry: How to Do More with Less

Doing more with less is nothing new for the travel industry. It’s been tough out there for the last few years—and while the future is bright, travel and tourism businesses are still facing a labor shortage that’s causing customer satisfaction to plummet.

While HR leaders are facing the labor shortage head-on with recruiting tactics and budget increases, customer service teams need to search for ways to provide the service the industry is known for without the extra headcount.

In other words… You need to do more with less.

The best way to do that is with a conversational AI platform. Whether you run a hotel, airline, car rental company, or experience provider, you can provide superior service to your customers without overworking your support team.

Keep reading to take a look at the state of the travel industry’s labor shortage and how you can still provide exceptional customer service.

Travel is back, but labor is not.

In 2019, the travel and tourism industry accounted for 1 in 10 jobs around the world. Then the pandemic happened, and the industry lost 62 million jobs overnight, according to the World Travel & Tourism Council (WTTC).

Now that most travel restrictions, capacity limits, and safety restrictions are lifted, much of the world is ready to travel again. The pent-up demand has caused the tourism and travel industry to outpace overall economic growth. In 2021, travel and tourism’s contribution to GDP grew by 21.7%, while the overall economy grew by only 5.8%, according to the WTTC.

In 2021, travel added 18.2 million jobs globally, making it difficult to keep up with labor demands. In the U.S., 1 in 9 jobs went unfilled in 2021.

What’s causing the shortage? A combination of factors:

  • Flexibility: Over the last few years, there has been a mindset shift when it comes to work-life balance. Many people aren’t willing to give up weekends and holidays with their families to work in hospitality.
  • Safety: Many hospitality jobs are on the frontline and involve interacting with the public on a regular basis. Even though the pandemic has cooled in most parts of the world, some workers are still hesitant to work face-to-face. This goes double for older workers and those with health concerns, who may have either switched industries or dropped out of the workforce altogether.
  • Remote work: The pandemic made remote work more feasible for many industries, but travel still requires a lot of in-person work and interactions.

How is the labor shortage impacting customer service?

As much as we try to keep those shortages from affecting service, customers feel it. According to the American Customer Satisfaction Index, hotel guests were 2.7% less satisfied overall between 2021 and 2022. Airlines and car rental companies also dropped 1.3% each.

While there are likely multiple reasons factoring into lower customer satisfaction rates, there’s no denying that the labor shortage has an impact.

As travel ramps back up, there’s an opportunity to reshape the industry at a fundamental level. The world is ready to travel again, but demand is outpacing your ability to grow. While HR is hard at work recruiting new team members, it’s time to look at your operations and see what you can do to deliver great customer service without adding to your staff.

What a conversational AI platform can do in the travel industry.

First, what is conversational AI? Conversational AI combines multiple technologies (like machine learning and natural language processing) to enable human-like interactions between people and computers. For your customer service team, this means there’s a coworker that never sleeps, never argues, and seems to have all the answers.

A conversational AI platform like Quiq can help support your travel business’s customer service team with tools designed to speed conversations and improve your brand experience.

In short, a conversational AI platform can help businesses in the travel industry provide excellent customer service despite the current labor shortage. Here’s how.

Resolve issues faster with conversational AI support.

When you’re short-staffed, you can’t afford inefficient customer conversations. Switching from voice-based customer service to messaging comes with its own set of benefits.

Using natural language processing (NLP), a conversational AI platform can identify customer intent based on their actions or conversational cues. For example, if a customer is stuck on the booking page, maybe they have a question about the cancellation policy. By starting with some basic customer knowledge, chatbots or human agents can go into the conversation with context and get to the root of the problem faster.

Conversational AI platforms can also route conversations to the right agent, so agents spend less time gathering information and more time solving the problem. Plus, messaging’s asynchronous nature means customer service representatives can handle 6–8 conversations at once instead of working one-on-one. But conversational AI for customer service provides even more opportunities for speed.

Anytime access to your customer service team.

Many times, workers leaving the travel industry cite a lack of schedule flexibility as one of their reasons for leaving. Customer service doesn’t stop at 5 o’clock, and support agents end up working odd hours like weekends and holidays. Plus, when you’re short-staffed, it’s harder to cover shifts outside of normal business hours.

Chatbots can help provide customer service 24/7. If you don’t already provide anytime customer service support, you can use chatbots to answer simple questions and route the more complex questions to a live agent to handle the next day. Or, if you already have staff working evening shifts, you can use chatbots to support them. You’ll require fewer human agents during off times while your chatbot can pick up the slack.

Connect with customers in any language.

Five-star experiences start with understanding. You’re in the travel business, so it’s not unlikely that you’ll encounter people who speak different languages. When you’re short-staffed, it’s hard to ensure you have enough multilingual support agents to accommodate your customers.

Conversational AI platforms like Quiq offer translation capabilities. Customers can get the help they need in their native language—even if you don’t have a translator on staff.

Work-from-anywhere capabilities.

One of the labor shortage’s root causes is the move to remote work. Many customer-facing jobs require working in person. That limits your labor pool to people within the immediate area. The high cost of living in cities with increased tourism can push locals out.

Moving to a remote-capable conversational tool will expand your applicant pool outside your immediate area. You can attract a wider range of talented customer service agents to help you fill open positions.

Build automation to anticipate customer needs.

A great way to reduce the strain on a short-staffed customer service team? Prevent problems before they happen.

A lot of customer service inquiries are simple, routine questions that agents have to answer every day. Questions about cancellation policies, cleaning and safety measures, or special requests happen often—and can all be handled using automation.

Use conversational AI to set up personalized messages based on behavioral or timed triggers. Here are a few examples:

  • When customers book a vacation: Automatically send a confirmation text message with their booking information, cancellation policy, and check-in procedures.
  • The day before check-in: Send a reminder with check-in procedures, along with an option for any special requests.
  • During their vacation: Offer up excursion ideas, local restaurant reservations, and more. You can even book the reservation or complete the transaction right within the messaging platform.
  • After an excursion: Send a survey to collect feedback, giving customers an outlet for both positive and negative comments.

By anticipating these customer needs, your agents won’t have to spend as much time fielding simple questions. And the easy ones that do come in can be handled by your chatbot, leaving only more complex issues for your smaller team.
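
The trigger examples above could be sketched roughly like this. The booking fields, message names, and dates are hypothetical, and the sketch only decides which messages are due on a given day; actually sending them would be handled by your messaging platform.

```python
from datetime import date, timedelta

def messages_due(booking, today):
    """Return the names of the automated messages to send on `today`."""
    due = []
    # Behavioral trigger: the customer just booked.
    if today == booking["booked_on"]:
        due.append("booking_confirmation")
    # Timed trigger: the day before check-in.
    if today == booking["check_in"] - timedelta(days=1):
        due.append("check_in_reminder")
    # Timed trigger: any day during the stay.
    if booking["check_in"] <= today <= booking["check_out"]:
        due.append("excursion_ideas")
    # Timed trigger: the day after each excursion.
    if today - timedelta(days=1) in booking["excursions"]:
        due.append("feedback_survey")
    return due

booking = {
    "booked_on": date(2024, 5, 1),
    "check_in": date(2024, 6, 10),
    "check_out": date(2024, 6, 14),
    "excursions": [date(2024, 6, 12)],
}
print(messages_due(booking, date(2024, 6, 9)))
```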

Don’t let a short staff take away from your customer service.

There are few opportunities to make something both cheaper and better. Quiq is one of them. Quiq’s conversational AI platform isn’t just a stop-gap solution while the labor market catches up with the travel industry’s needs. It will actually improve your customer service experience while helping you do more with less.

Using AI to Streamline Messaging

Conversational AI typically refers to leveraging bots to satisfy your customers while scaling your contact center.

At Quiq, we love bots, but we also take a broader view of Conversational AI. After all, bots are only part of digital CX.

In our view, Conversational AI also means helping your live agents work more efficiently and streamlining your operations.

In this article, we’re going to focus on how AI can (and should) be used to manage the nuances of messaging as part of a broader suite of Conversational AI.

The Need for Conversational AI

In The Nature Of Messaging, we described how messaging is a unique channel. It fluctuates between synchronous and asynchronous communication styles. It’s informal. Live agents can work on multiple messaging conversations concurrently.

All of this implies a system that can:

1. Track and prioritize the conversations assigned to a live agent.

In order to prioritize, we must understand who is expected to respond next.

2. Manage the agent’s workload.

  • Keep them busy, but not too busy.
  • Prioritize customers who are actively engaged.
  • Move inactive customers out of the way, without losing their session.

3. Map free-flowing streams of messages into tickets in traditional CRM systems.

In order to achieve the above, you need a purpose-built system (like Quiq) that handles the fluctuating synchronicity of conversations amidst the backdrop of agent concurrency. The system is also going to need a hefty dose of AI to do the best possible job.

Let’s consider some examples to explore why.

Here’s a pretty typical inbound service conversation that was routed directly to a live agent.

The agent sent the last message, but is it the customer’s turn to respond?

No.

The agent essentially promised a follow-up. The system should still prioritize this conversation.

The traditional algorithm employed in email management systems is that the two parties should take turns, but that doesn’t work in conversational settings because messages are shorter and less formal.

We need NLP/AI here.
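
To see why the take-turns rule falls short, compare it with a version that also looks at what the agent actually said. This is only a sketch under loud assumptions: the phrase patterns are made up, and simple keyword matching stands in for the NLP model a real system would need.

```python
import re

# Hypothetical phrases suggesting the agent still owes the customer a reply.
FOLLOW_UP_PATTERNS = [
    r"\blet me (check|look)",
    r"\bi('ll| will) (check|look|find out|get back)",
    r"\bone moment\b",
]

def next_responder_take_turns(last_author):
    """Traditional rule: whoever did not send the last message responds next."""
    return "customer" if last_author == "agent" else "agent"

def next_responder(last_author, last_text):
    """Take-turns rule, overridden when the agent promised a follow-up."""
    if last_author == "agent":
        text = last_text.lower().replace("\u2019", "'")
        if any(re.search(pattern, text) for pattern in FOLLOW_UP_PATTERNS):
            return "agent"
    return next_responder_take_turns(last_author)

print(next_responder_take_turns("agent"))
print(next_responder("agent", "Let me check on that for you."))
```

The first call says the customer is up next; the second recognizes the promised follow-up and keeps the conversation on the agent's plate.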

Here we have the opposite situation. The system shouldn’t prioritize this conversation or set any sort of SLA timer because the burden of response is on the customer.

If the customer fails to follow up within a reasonable timeframe (10 minutes?), the system should move this conversation to an inactive state to make room for customers who are more engaged.
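
The inactivity rule described here can be sketched in a few lines. The 10-minute default mirrors the tentative figure above, and the function and parameter names are hypothetical.

```python
from datetime import datetime, timedelta

def should_inactivate(waiting_on, last_message_at, now,
                      timeout=timedelta(minutes=10)):
    """True when the customer owes the next reply and has been quiet too long."""
    return waiting_on == "customer" and (now - last_message_at) >= timeout

last = datetime(2024, 1, 1, 9, 0)
print(should_inactivate("customer", last, datetime(2024, 1, 1, 9, 15)))
```

A conversation that is waiting on the agent never goes inactive under this rule, no matter how much time has passed.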

Do you remember Choose Your Own Adventure books? In this example, you get to pick what happens next:

  1. Nothing. The conversation is over.
  2. An hour passes. The customer responds with “You too!”
  3. An hour passes. The customer responds with “Actually, I don’t want green after all.”
  4. An hour passes. The customer responds with “I have a different question for you.”

Compared to phone and email, it’s less clear when a messaging conversation is actually over.

Obviously, we don’t want to just leave the conversation open; that delays helping other customers.

So the system should automatically inactivate and/or close it.

If scenario 2 happens, what should we do?

We definitely don’t want to open another tracking ticket, and we may not even want to reopen the conversation and route it to the agent (especially if that agent isn’t online anymore). We call this the ‘long goodbye’ problem, or more generally, an unimportant response.

If scenario 3 happens, we need to reopen the conversation and it should be associated with the same ticket in the CRM and ideally routed to the same agent.

If scenario 4 happens, we should start a new conversation associated with a new ticket and route the conversation to our entry point (e.g. a bot) rather than directly routing to the agent.

In messaging apps, there isn’t a clear start and end point to a conversation—and there isn’t an equivalent of an email ‘Subject’.

It’s just a stream of messages with potentially long delays between them. So in order to solve the above problems, we need a deep understanding of the message content.
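
Once a late message has been classified, the handling for scenarios 2 through 4 can be written down as a simple decision table. The sketch below assumes the hard part (understanding the message) has already been done by an NLP model; the category names, action names, and return format are hypothetical.

```python
from enum import Enum

class LateMessage(Enum):
    UNIMPORTANT = "unimportant"    # Scenario 2, e.g. "You too!"
    CONTINUATION = "continuation"  # Scenario 3, e.g. a change to the same request
    NEW_TOPIC = "new_topic"        # Scenario 4, e.g. a different question

def route_late_message(kind, ticket_id, agent_id):
    """Decide what to do with a message that arrives after a conversation went quiet."""
    if kind is LateMessage.UNIMPORTANT:
        # Keep it on the existing ticket without reopening or bothering the agent.
        return {"action": "record_only", "ticket": ticket_id, "route_to": None}
    if kind is LateMessage.CONTINUATION:
        # Reopen under the same ticket, ideally with the same agent.
        return {"action": "reopen", "ticket": ticket_id, "route_to": agent_id}
    # New topic: new conversation and ticket, starting at the entry point (e.g. a bot).
    return {"action": "start_new", "ticket": None, "route_to": "entry_point"}

print(route_late_message(LateMessage.CONTINUATION, "T-100", "agent-7"))
```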

The Impact

The examples we explored above aren’t just academic. They’re impactful to your operations.

Consider the following stats taken from across our user base (your org’s exact numbers might differ):

10% of conversations will have a late, ‘unimportant’ message arrive.

  • Failure to recognize these as continuations of the previous conversation causes superfluous records that impact analytics.
  • Agents are unnecessarily distracted.

The traditional ‘take turns’ response algorithm is wrong 30% of the time in messaging.

  • If we fail to prioritize a conversation where the customer is expecting a response, we risk missing SLAs and angering customers, while forcing agents to attempt their own prioritization.
  • If we prioritize a conversation that is actually waiting on the customer, we decrease efficiency by distracting the agent and delaying service to other customers.

20% of your messaging conversations will reopen in a 72-hour period.

It’s important to recognize when an important message arrives and determine if it represents a new topic or a continuation.

Our Approach

At Quiq, our goal is to leverage AI to have a positive and immediate impact on our customers and their businesses.

We follow the latest research and pragmatically adapt and apply AI to the context of conversational business messaging.

For the majority of our AI modeling tasks, it’s not sufficient to simply consider the text of a single message in order to make a decision.

Instead, we must consider all of the recent transcripts, including the sequence of individual messages and their authors. This deep understanding of the conversation transcript enables us to achieve high accuracy on problems like the ones presented in this article.

Stay tuned as we build out more!