Both chatbots, however, have one thing in common: they answer questions through text-based interfaces. In simple terms, you’d chat with a bot the same way you’d send a text message.
Chatbots are incredibly beneficial for businesses looking to automate customer service, marketing, and even sales tasks. But will chatbots of the future be accessed with our fingertips, or will they require no typing at all?
Thanks to voice assistants, the future of chatbots may be speech-based.
What is a voice assistant?
Do the phrases “Alexa,” “Ok Google,” or “Hey Siri” sound familiar? These are the wake phrases for some of the best-known voice assistants, and tens of millions of users interact with them daily.
Voice assistants are bots powered by artificial intelligence, voice recognition, and natural language processing (NLP) to answer questions and hold conversations audibly.
While text-based interfaces require machines to process text, analyze it, and map out a response, voice assistants do the same with spoken input and spoken output. In simple terms, you can speak to a voice assistant out loud instead of clicking call-to-action buttons or typing out your question.
The technology behind voice assistants, however, is quite complex and relatively new compared to text-based interfaces. To get a better understanding of voice assistants and their place in chatbot marketing, let’s look at how exactly they work.
How do voice assistants work?
From a high level, we know voice assistants answer questions and hold conversations with users out loud instead of through text-based interfaces, but this is an oversimplification of how they work. Below are the various steps required for voice assistants to give us our desired answers.
1. Some bots use passive listening
Voice assistants like Alexa, Cortana, and other consumer-facing bots are considered passive listening devices. This essentially means the assistant is constantly monitoring its surroundings for trigger words. Once the trigger word is said loud enough for the bot to hear, it will begin listening to the user’s query.
Other voice assistants like Siri or Google Assistant can be set up either as passive listeners or to activate with a tap or touch. Given recent concerns surrounding data privacy, some users prefer this extra control over their devices.
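To make the idea of passive listening concrete, here is a minimal sketch in Python. It assumes the audio has already been turned into a stream of words, which real devices do not have at this stage, and the trigger word, the `<silence>` marker, and the `capture_queries` function are all hypothetical names chosen for illustration.

```python
from typing import Iterable, List

WAKE_WORD = "alexa"  # hypothetical trigger word for this sketch


def capture_queries(words: Iterable[str], wake_word: str = WAKE_WORD,
                    end_marker: str = "<silence>") -> List[str]:
    """Ignore everything until the wake word, then record words until silence."""
    queries: List[str] = []
    current: List[str] = []
    listening = False
    for word in words:
        token = word.lower().strip(",.?!")
        if not listening:
            # Passive mode: only check for the trigger, discard everything else.
            if token == wake_word:
                listening = True
                current = []
        elif token == end_marker:
            # Silence ends the query and returns the bot to passive mode.
            if current:
                queries.append(" ".join(current))
            listening = False
        else:
            current.append(token)
    if listening and current:
        # The stream ended while a query was still being captured.
        queries.append(" ".join(current))
    return queries


stream = "nice day today <silence> Alexa, what's the weather? <silence> thanks".split()
print(capture_queries(stream))
```

Only the words spoken between the trigger and the next pause are kept. Everything said before "Alexa" or after the pause is discarded, which is the behavior the passive-listening model is meant to provide.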
2. Voice recognition kicks in
The bot has been activated and now it’s ready to listen, but how exactly does it know what it’s listening to? This is made possible with voice recognition software, an application of artificial intelligence and deep learning.
Sound waves are converted into structured, more understandable data for the machine to process. Everything from tone and pitch to volume and the precision of speech is factored in during voice recognition.
Of course, this description understates the complexity of voice recognition, which is one of the most challenging problems in computer science today.
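As a rough illustration of turning sound waves into structured data, the Python sketch below reduces a list of audio samples to two numbers: loudness (root mean square) and a crude pitch estimate based on how often the wave crosses zero. It is a toy under stated assumptions: the input is a synthetic tone rather than speech, the `sine_wave` and `describe` helpers are hypothetical, and real voice recognition relies on far richer features and trained models.

```python
import math
from typing import Dict, List


def sine_wave(freq_hz: float, amplitude: float, sample_rate: int, seconds: float) -> List[float]:
    """Generate a synthetic tone to stand in for recorded speech."""
    count = int(sample_rate * seconds)
    return [amplitude * math.sin(2 * math.pi * freq_hz * n / sample_rate) for n in range(count)]


def describe(samples: List[float], sample_rate: int) -> Dict[str, float]:
    """Reduce raw samples to two crude features: loudness and pitch."""
    # Root mean square of the samples is a common measure of volume.
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    # A pure tone crosses zero twice per cycle, so crossings / 2 per second
    # approximates its frequency.
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))
    duration = len(samples) / sample_rate
    return {"volume_rms": rms, "pitch_hz": crossings / (2 * duration)}


tone = sine_wave(freq_hz=220.0, amplitude=0.5, sample_rate=16000, seconds=1.0)
features = describe(tone, sample_rate=16000)
print(features)
```

For the 220 Hz test tone, the pitch estimate should land close to 220 and the volume close to the amplitude divided by the square root of two. Human speech is much messier than a single tone, which is part of why the real problem is so hard.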
3. Followed by natural language processing
More complex nuances of human language also need to be broken down before information retrieval. This includes things like context, user intent, slang, accents, and other informal aspects of how people speak.
Humans and machines are on totally different wavelengths when it comes to language. While we have no rigid guidelines, machines require structure, detail, and process.
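To show what "structure" means here, the sketch below maps a free-form transcript to a structured intent using a few hand-written patterns. This is a deliberately simple, rule-based stand-in for NLP, not how production assistants work. The intent names, the patterns, and the `parse_intent` function are assumptions made up for this example.

```python
import re
from typing import Dict, Optional

# Hypothetical intents and trigger patterns, chosen only for illustration.
INTENT_PATTERNS = {
    "get_weather": re.compile(r"\b(weather|forecast|rain|temperature)\b"),
    "get_capital": re.compile(r"\bcapital of (?P<place>[a-z ]+)"),
    "order_status": re.compile(r"\b(order|package|delivery)\b"),
}


def parse_intent(transcript: str) -> Dict[str, Optional[str]]:
    """Turn a loosely phrased question into a rigid intent record."""
    text = transcript.lower().strip(" ?!.")
    for intent, pattern in INTENT_PATTERNS.items():
        match = pattern.search(text)
        if match:
            # Only some patterns capture a place; the others leave it empty.
            place = match.groupdict().get("place")
            return {"intent": intent, "place": place.strip() if place else None}
    return {"intent": "unknown", "place": None}


print(parse_intent("What's the capital of Vermont?"))
print(parse_intent("Will it rain tomorrow?"))
print(parse_intent("Play some jazz"))
```

Keyword rules like these break down quickly with slang, accents, or ambiguous phrasing, which is exactly the gap that statistical NLP models are meant to close.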
4. Information is retrieved
After processing the user’s query using voice recognition and NLP, it’s now time for the voice assistant to retrieve information related to the question. Voice assistants do this by calling on various APIs and accessing something called a knowledge base, which acts as a central repository to draw information from.
The depth of the knowledge base varies from one device to another, but many mainstream voice assistants today are quite fleshed out. Below is an example of what a knowledge base may look like:
More information can be added to the knowledge base over time. This information is tagged so the machine learning models know exactly where to look for it. The larger and more organized the knowledge base, the fewer errors will occur and the faster the chatbot is able to learn.
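One simple way to picture a tagged knowledge base is as a list of facts plus an index from each tag to the facts that carry it. The Python sketch below is a minimal illustration under that assumption. The `KnowledgeBase` class and its methods are hypothetical and do not reflect any vendor's actual storage format.

```python
from collections import defaultdict
from typing import Dict, List, Set


class KnowledgeBase:
    """Stores facts and indexes them by tag so lookups avoid a full scan."""

    def __init__(self) -> None:
        self._facts: List[str] = []
        self._by_tag: Dict[str, Set[int]] = defaultdict(set)

    def add(self, fact: str, tags: List[str]) -> None:
        """Append a fact and record its position under each of its tags."""
        index = len(self._facts)
        self._facts.append(fact)
        for tag in tags:
            self._by_tag[tag.lower()].add(index)

    def find(self, tags: List[str]) -> List[str]:
        """Return facts carrying every requested tag, in insertion order."""
        if not tags:
            return []
        sets = [self._by_tag.get(tag.lower(), set()) for tag in tags]
        matches = set.intersection(*sets)
        return [self._facts[i] for i in sorted(matches)]


kb = KnowledgeBase()
kb.add("Montpelier is the capital of Vermont.", ["capital", "vermont"])
kb.add("Burlington is the largest city in Vermont.", ["city", "vermont"])
print(kb.find(["capital", "vermont"]))
print(kb.find(["vermont"]))
```

New facts can be added at any time with `add`, and a more specific set of tags narrows the results, which is the intuition behind keeping the knowledge base well organized.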
5. Information is then output
Now onto the final step, outputting relevant information for the user.
A lot has led up to this point. Different tones, vibrations, and volumes are standardized for the machine with voice recognition. Natural language processing then assists the machine with understanding exactly what it just heard. Then, information is retrieved from a variety of sources. The end product is an answer that hopefully satisfies the user’s request.
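The hand-off between these stages can be sketched as a chain of functions, each passing its result to the next. In the Python example below, every stage is a stub standing in for a far more complex component, and all of the function names and the single hard-coded fact are assumptions made for illustration.

```python
from typing import Callable, List

Stage = Callable[[str], str]


def run_pipeline(heard: str, stages: List[Stage]) -> str:
    """Pass the result of each stage to the next one, in order."""
    result = heard
    for stage in stages:
        result = stage(result)
    return result


def recognize(audio_label: str) -> str:
    # Stub: pretend the audio has already been transcribed to text.
    return audio_label.lower().strip()


def understand(transcript: str) -> str:
    # Stub: a single hard-coded intent instead of real NLP.
    return "capital:vermont" if "capital of vermont" in transcript else "unknown"


def retrieve(intent: str) -> str:
    # Stub: a one-entry dictionary instead of APIs and a knowledge base.
    facts = {"capital:vermont": "Montpelier"}
    return facts.get(intent, "")


def respond(answer: str) -> str:
    # Stub: text output; a real assistant would synthesize speech here.
    return f"The answer is {answer}." if answer else "Sorry, I don't know that one."


stages = [recognize, understand, retrieve, respond]
print(run_pipeline("  What's the capital of Vermont ", stages))
print(run_pipeline("Play some jazz", stages))
```

Keeping each stage behind the same simple interface means any one of them could be swapped for a better implementation without touching the others.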
It’d be an understatement to say there are a lot of moving parts in the few seconds between asking a question and receiving an answer.
So, now that we’re familiar with how voice assistants work, let’s look at the use cases for these complex bots.
When to use voice assistants
Voice assistants have become quite popular amongst consumers. Amazon alone has sold more than 50 million of its Echo units, the smart speakers that run the Alexa bot.
Most consumers are simply using their devices to check the weather, find out who won last night’s game, ask for the capital of Vermont, and issue other simple voice commands.
Only two percent of users are actually making purchases through their voice assistants, and about 20 percent ask their assistants to check the status of online orders.
The lack of a graphical user interface (GUI) is the main reason for consumers’ lack of confidence in purchasing through their voice assistants. A GUI allows users to compare different products, look at reviews, and dive deeper into research.
Consumers and voice assistants go hand-in-hand, but we’ll soon see more businesses leverage voice assistants to automate day-to-day tasks.
In a recent survey of more than 600 senior decision-makers, 31 percent said they see voice technology as beneficial for daily work.
One example of this is in business intelligence where decision-makers rely on graphs, charts, and dashboards to break down KPIs and reports. Using a voice assistant, the decision-maker can receive these reports audibly without having to shift priorities.
Another example is in human resources (HR) and recruiting, both of which can benefit from automation. Imagine having a voice assistant break down different candidate profiles using current employee baselines and models, along with market data. This would remove lengthy processes, and the recruiter can simply assess the cultural fit of the candidate – streamlining the hiring process.
Are voice assistants the future?
For now, it’s evident that voice assistants are better at resolving simple, non-business-related questions for human users. But when it comes to customer support, marketing, and sales tasks, text-based chatbots take the cake.
This isn’t to say voice assistants aren’t the future, only that more time is needed to map out use cases in business. Advancements in AI, NLP, and machine learning will open up new opportunities.
One looming question is when users will be comfortable enough to make purchases through voice assistants. Without a GUI giving users more control, the answer may be “never.” This is why companies like Google and Facebook have developed “portal” bots, which provide the benefits of both a GUI and voice assistance.
Devin is a growth marketer at Nextiva and a former content specialist at G2. Prior to G2, he helped scale early-stage startups out of Chicago's booming tech scene. Outside of work, he enjoys watching his beloved Cubs, playing baseball, and gaming. (he/him/his)