When was speech recognition invented?

The history of voice assistants

The history of voice assistants goes back further than many people think. It began as early as 1877, when Thomas Edison developed his phonograph. Even if that simple dictation machine has little in common with modern voice assistants at first glance, it provided the essential basis: the recording of speech. Digital and virtual assistance systems have developed considerably since then. Here is an overview of the history of voice assistants.

Voice assistants: the milestones in history

• 1877: Ediphone and the revolution in executive offices

• 1930: The Voder was able to synthesize speech

• 1952: Audrey understood numbers from zero to nine

• 1962: A “Shoebox” solved arithmetic problems

• 1970: Hidden Markov Model predicts word sequences

• 1971: US Department of Defense funds research

• 1980s: IBM works on Tangora speech recognition system

• 1990s: Big strides in the history of voice assistants

• 2000s: Speech recognition becomes the standard on the computer

• 2010s: Voice assistants conquer smartphones

1877: Ediphone and the revolution in executive offices

Thomas Alva Edison didn't just invent the light bulb. He also laid the foundation for the development of today's voice assistants. As early as 1877, the American inventor developed a dictation machine that could record sounds and play them back over and over again. The phonograph worked purely mechanically and was sold as the "Parlograph" or "Ediphone" until the 1920s. A function that even the simplest smartphones handle today caused a real revolution back then - especially in the offices of large companies, because never before had it been possible to dictate letters, instructions or anything else regardless of which staff were present.

1930: The Voder was able to synthesize speech

Is it possible to generate human speech with a machine without recording it beforehand? Bell Laboratories was working on this question as early as the 1930s. The American scientists invented a keyboard-controlled electronic speech synthesizer. The so-called Voice Operating Demonstrator (Voder for short) was able to generate speech artificially for the first time. It was laborious to operate by hand and was presented to the public at the 1939 World's Fair in New York.

1952: Audrey understood numbers from zero to nine

The Voder could output human speech. However, this function alone is not enough to build a voice assistant, because an assistant must also be able to understand and evaluate speech signals. This became possible for the first time with Audrey. The Automatic Digit Recognizer was invented by Bell Laboratories in 1952 and could understand the digits from zero to nine. Its accuracy was around 90 percent, but depended strongly on the speaker's pace, voice and dialect. In addition, Audrey only understood speakers who left long pauses between digits.

1962: A “Shoebox” solved arithmetic problems

Another milestone in the history of voice assistants came in 1962, when IBM presented a machine at the World's Fair in Seattle that understood 16 words. The so-called "Shoebox" was the size of a shoebox and could do arithmetic: it recognized the digits zero to nine as well as the commands "minus", "plus", "subtotal", "total", "false" and "off". The first task at the World's Fair, "Five plus three plus eight plus six plus four minus nine, total", was solved correctly with the result 17, and the audience was enthusiastic. Interesting to know: IBM brought its first personal computer to market almost 20 years later.
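
To make the Shoebox's task concrete, here is a minimal Python sketch of a calculator driven by a small spoken-word vocabulary. It is only an illustration, not IBM's design: the function name `shoebox` is hypothetical, the input is already-recognized text rather than audio, and the two remaining command words are left out because the text does not describe what they did.

```python
# Toy calculator over a Shoebox-like vocabulary (illustrative assumption,
# not a reconstruction of IBM's hardware).
DIGITS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4,
    "five": 5, "six": 6, "seven": 7, "eight": 8, "nine": 9,
}

def shoebox(command):
    """Evaluate a command word by word and return every printed result."""
    total = 0      # running sum
    sign = 1       # +1 after "plus", -1 after "minus"
    printed = []   # values emitted by "subtotal" and "total"
    for word in command.lower().replace(",", " ").split():
        if word in DIGITS:
            total += sign * DIGITS[word]
            sign = 1
        elif word == "plus":
            sign = 1
        elif word == "minus":
            sign = -1
        elif word == "subtotal":
            printed.append(total)   # report, keep accumulating
        elif word == "total":
            printed.append(total)   # report, then start over
            total = 0
        else:
            raise ValueError(f"unrecognized word: {word!r}")
    return printed

print(shoebox("two plus three minus one subtotal plus four total"))  # [4, 8]
```

Any word outside the vocabulary raises an error, which mirrors how narrow such an early recognizer's world was.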

1970: Hidden Markov Model predicts word sequences

The so-called hidden Markov model was decisive for the more recent history of voice assistants. It is a stochastic model for systems with unobserved (hidden) states. It is named after the Russian mathematician Andrey Andreyevich Markov and was developed in the second half of the 1960s. One of the first concrete use cases was speech recognition. Around 1970, hidden Markov models (HMMs) were used here to calculate the probability of a certain word following another. This made it easier to distinguish similar-sounding word sequences from one another.
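
The idea of scoring how likely one word is to follow another can be sketched in a few lines of Python. Note that this is a plain bigram (first-order Markov) model with add-one smoothing, not a full hidden Markov model with hidden states; the tiny corpus and the function names are made up for illustration.

```python
from collections import Counter

# Hypothetical toy training corpus.
CORPUS = [
    "please recognize speech now",
    "we recognize speech well",
    "they wreck a nice car",
    "a nice beach is near",
]

def train_bigrams(sentences):
    """Count how often each word follows each preceding word."""
    contexts, bigrams, vocab = Counter(), Counter(), set()
    for sentence in sentences:
        words = ["<s>"] + sentence.split()   # <s> marks the sentence start
        vocab.update(words[1:])
        contexts.update(words[:-1])          # words that have a successor
        bigrams.update(zip(words, words[1:]))
    return contexts, bigrams, vocab

def sequence_probability(words, contexts, bigrams, vocab):
    """Multiply smoothed probabilities P(word | previous word)."""
    prob, prev = 1.0, "<s>"
    for word in words:
        # Add-one smoothing keeps unseen word pairs from scoring zero.
        prob *= (bigrams[(prev, word)] + 1) / (contexts[prev] + len(vocab))
        prev = word
    return prob

contexts, bigrams, vocab = train_bigrams(CORPUS)
likely = sequence_probability(["recognize", "speech"], contexts, bigrams, vocab)
unlikely = sequence_probability(["recognize", "beach"], contexts, bigrams, vocab)
print(likely > unlikely)  # True
```

Because "speech" follows "recognize" in the toy corpus and "beach" never does, the first sequence gets the higher score even though the two words sound alike.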

1971: US Department of Defense funds research

From 1971 to 1976, scientists at Carnegie Mellon University (CMU) in Pittsburgh achieved further successes in the history of voice assistants. Funded by the Defense Advanced Research Projects Agency (DARPA), an agency of the United States Department of Defense, they created three speech recognition and understanding systems: Dragon, Harpy and Hearsay-II.

Dragon was not tested by DARPA at the time, but formed the basis for a commercial product called "Dragon NaturallySpeaking". The latter has been continuously developed to this day. After several takeovers, it is now owned by the American company Nuance Communications (formerly ScanSoft).

Harpy used heuristic search methods to identify spoken sentences. The system processed a vocabulary of around 1,000 words and correctly understood 95 percent of the test sentences spoken in the DARPA project. Interesting to know: Harpy had to execute around 30 million computer instructions to process one second of speech. With the computers of the time, that was still a long way from real-time recognition, but it was an important milestone in the history of voice assistants.

Just like Dragon, Hearsay-II could not meet the DARPA requirements at the time. However, its developers took what was probably the most ambitious approach: building up sentences step by step. To do this, they used the so-called blackboard architecture. Figuratively speaking, the system wrote the recognized phones at the bottom of a board. Knowledge routines formed syllables from these, which other routines then processed into words, word sequences and sentences.
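
The blackboard idea can be sketched as a shared data structure with independent routines that each lift one level of hypotheses to the next. The following Python sketch is a heavily simplified toy under stated assumptions: the two lexicons, the level names and all function names are hypothetical, and the real Hearsay-II worked with many more levels and with competing, rated hypotheses.

```python
# Hypothetical toy lexicons, for illustration only.
SYLLABLE_LEXICON = {("HH", "EH"): "he", ("L", "OW"): "lo"}
WORD_LEXICON = {("he", "lo"): "hello"}

def make_source(src_level, dst_level, lexicon):
    """Build a knowledge routine that lifts one board level to the next."""
    def source(board):
        if board[dst_level] or not board[src_level]:
            return False                      # nothing to contribute yet
        items, out, i = board[src_level], [], 0
        while i < len(items):
            for length in range(len(items) - i, 0, -1):  # longest match first
                key = tuple(items[i:i + length])
                if key in lexicon:
                    out.append(lexicon[key])
                    i += length
                    break
            else:
                return False                  # input cannot be explained
        board[dst_level] = out
        return True
    return source

def run(board, sources):
    """Keep invoking routines until none of them can add anything."""
    while any(source(board) for source in sources):
        pass
    return board

board = {"phones": ["HH", "EH", "L", "OW"], "syllables": [], "words": []}
sources = [
    make_source("syllables", "words", WORD_LEXICON),
    make_source("phones", "syllables", SYLLABLE_LEXICON),
]
print(run(board, sources)["words"])  # ['hello']
```

The routines are deliberately listed out of order: each one acts only when the board holds something it can work on, which is the core of the blackboard approach.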

1980s: IBM works on Tangora speech recognition system

In the 1980s, the computer company IBM was actively working on a speech recognition system called Tangora. It recognized sentences made up of isolated words and processed a vocabulary of around 20,000 terms in real time. Its speech recognition was based on a purely statistical method that worked without any linguistic knowledge. When the solution was presented under the name Tangora 4 at CeBIT in 1991, the presentation room had to be completely shielded. Otherwise the trade fair noise would have disrupted the system considerably.

1990s: Big strides in the history of voice assistants

The history of voice assistants continued to pick up speed in the 1990s. For example, a doll named Julie brought speech recognition technology into children's rooms. Julie could understand simple words and respond individually to the person talking to her, which was a real sensation at the time.

In 1990, Dragon (now part of Nuance Communications) released Dragon Dictate. It was the first speech recognition program for consumers, and the product line is still developed and marketed today. In the following years, other end-user programs appeared, such as Speakable Items (Apple), Sphinx-II (Xuedong Huang) and MedSpeak (IBM).

2000s: Speech recognition becomes the standard on the computer

In the 2000s, more and more manufacturers used speech recognition to operate programs or entire operating systems. Microsoft integrated speech recognition into its Office products in 2002. In 2007, speech recognition was even built directly into the then-new Windows Vista operating system.

Others were now also making use of the new technology. In 2006, the National Security Agency (NSA) began automatically filtering out individual keywords from intercepted conversations. Just a year later, Google launched GOOG-411, the first business directory search based on voice recognition. By calling 1-800-GOOG-411, callers received information about local businesses and could be connected to them directly.

2010s: Voice assistants conquer smartphones

With the beginning of the 2010s, the history of voice assistants arrives in the present day. More and more providers are bringing out digital assistants that let anyone conveniently control smartphones, tablets or PCs using voice commands. The digital assistants handle everyday tasks and answer many of their users' questions.

Siri Inc., founded in 2007, reached a milestone. The company developed the intelligent personal assistant Siri and was bought by Apple in April 2010. It then took more than a year before the tech company from California brought out the voice assistant with the iPhone 4S. Today Siri runs on all Apple devices and processes well over 2 billion requests a week.

With the Voice Search app, which was initially only available for desktop PCs, Google sent a Siri competitor into the race in 2011. The voice-controlled search came to smartphones in October 2012 and was developed further via Google Now into the Google Assistant (2016/2017). Like Siri, the latter is an intelligent virtual assistant that carries out tasks via voice commands.

In the meantime, Microsoft's Cortana saw the light of day in 2014. The voice assistant, which pays homage to the artificial intelligence of the same name from the Halo game series, first appeared on Windows Phone 8.1 and is now also available for Windows 10 and iOS.

Amazon's virtual assistant Alexa has been supporting its users in everyday life since 2015. The voice assistant appeared with the intelligent Amazon Echo loudspeaker and offers a lot of functions that can also be expanded with so-called skills.

In addition to creating to-do lists or reading out messages and information, voice assistants can now operate a wide variety of devices. For example, the lights or heating in your own home can be switched on and off on demand. At the end of 2019, 60 percent of all Germans had already tried the technology. 11 percent even use it every day - and the trend is rising.

Interesting to know: the results of the DARPA program of the 1970s laid the foundation for modern assistance systems, above all the Dragon system developed as part of it, which still serves as the basis for many solutions today.
