Voice Activated Technology — How Does it Work?

Limor
3 min read · Dec 31, 2020

If you would prefer to listen to this post on Limor, click here: https://preview.limor.ie/podcast/15641?_branch_match_id=873175213111127064

How does voice activated technology actually work?

So we all use voice activation, but how does it actually work?

Voice activation refers to the ability of a machine to receive and interpret dictation, and to understand and carry out spoken commands.

You already know what it is: it’s your Amazon Alexa, the Siri on your phone, basically every time you ask a robot a question and it answers.

If you’ve ever called up a help line and it was answered by a machine asking a set of questions that you have to answer yes or no to… that is voice activation technology.

It is common for us to assume that speech recognition is a fairly straightforward technology, but it is extremely advanced and has improved exponentially over the last number of years. Think of Alexa, for example, and you’ll realise how much better the quality of voice activation has become.

To get to the level it is at now, voice activation technology required the integration of decades of advancements in AI-driven natural language processing, speech recognition, computing horsepower and wireless networking, to name just a few building blocks. With an estimated 6,500 languages spoken globally today, coupled with differing accents, intonation, inflection and pronunciation, the technology needed for a computer to understand language as precisely as a human is extremely complex.

It really is an incredible technology: speech recognition software has to analyse the sounds received by a transmitter and perform specific tasks based on the information given to it. Personal assistants (like Amazon’s Alexa) decipher the information being input via voice and then attempt to perform what has been asked of them.
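
To make that pipeline a little more concrete, here is a deliberately simplified Python sketch of the “decipher, then act” step. Everything in it is hypothetical: the transcript is assumed to have already been produced by the speech recogniser, and the hard-coded phrases stand in for the far more sophisticated natural language understanding a real assistant uses.

```python
# Toy sketch: map an already-transcribed voice command to an action.
# The intents and responses below are invented for illustration only.

def handle_command(transcript: str) -> str:
    """Return the (pretend) action for a transcribed voice command."""
    text = transcript.lower()
    if "weather" in text:
        return "Fetching today's forecast..."
    if "timer" in text:
        return "Starting a timer."
    if "play" in text and "music" in text:
        return "Playing your playlist."
    return "Sorry, I didn't understand that."

print(handle_command("Alexa, what's the weather like today?"))
# -> Fetching today's forecast...
```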

It’s crazy that we take this for granted.

The software used for this requires analogue audio to be converted into digital signals. For a computer to decipher the signal, it must first have a digital database (or vocabulary) of words and syllables that it has learnt, or been taught, as well as the ability to compare that data against the incoming signal.
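
As a toy illustration of that idea, here is a minimal Python sketch that “digitises” an analogue signal by sampling it and then compares the samples against a tiny learnt vocabulary. The waveforms and the two-word vocabulary are invented for the example; real recognisers work on acoustic features and statistical models rather than raw samples.

```python
import math

def digitise(analogue, sample_rate=8, duration=1.0):
    """Sample a continuous (analogue) signal into discrete digital values."""
    n = int(sample_rate * duration)
    return [round(analogue(i / sample_rate), 3) for i in range(n)]

# A made-up "vocabulary": each word is stored as a digital template
# the recogniser has previously learnt or been taught.
vocabulary = {
    "yes": digitise(lambda t: math.sin(2 * math.pi * 2 * t)),
    "no":  digitise(lambda t: math.sin(2 * math.pi * 5 * t)),
}

def recognise(samples):
    """Pick the vocabulary entry whose template is closest to the input."""
    def distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(vocabulary, key=lambda word: distance(vocabulary[word], samples))

incoming = digitise(lambda t: math.sin(2 * math.pi * 2.1 * t))  # a slightly "off" yes
print(recognise(incoming))  # -> yes
```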

Maybe you’ve seen this with the app Duolingo? You have to speak your new language into the phone when prompted, and it’ll tell you whether you’ve pronounced it well enough or not.
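
Just as a loose illustration of that pronunciation-check idea (not how Duolingo actually does it), one crude approach is to compare what the recogniser heard against the target phrase and score the similarity:

```python
from difflib import SequenceMatcher

def pronunciation_score(target: str, heard: str) -> float:
    """Rough 0-1 similarity between the target phrase and what was heard."""
    return SequenceMatcher(None, target.lower(), heard.lower()).ratio()

score = pronunciation_score("buenos días", "buenos dias")
print("good enough!" if score > 0.8 else "try again")  # -> good enough!
```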

So it’s pretty impressive stuff, even though it may seem so ordinary now, and as time goes on it is expected only to get better and better.

We’re only just starting to grasp the potential of these technologies. Voice is the ultimate user interface because it’s not really a UI, but part of what we are as humans and how we communicate.

There’s almost no learning curve required like there is when people take typing classes. Voice-enabled machines learn to adapt to our natural behaviors rather than the other way around.

The question we need to ask is what could be the future capabilities of voice activation and recognition?

Have we reached the peak with Alexa, or is voice activated technology only getting started? What do you think?

Voice technology is also set to make a huge impact in learning and development. Will it substitute humans delivering verbal information altogether? Take news briefings, for example: will news anchors become obsolete?!

Voice activation technology continues to deliver novel ways to electronically control the devices in your home and to serve up topical learning content, business updates, recommended reading and more.

What are your thoughts?

Join the conversation on Limor and let us know. Experience the power of voice for yourself.

Limor

Limor is a new social audio platform that makes the process of podcast production and distribution easy, instant and interactive, sparking real conversation.