speech recognition from FOLDOC

speech recognition

(Or voice recognition) The identification of spoken words by a machine. The spoken words are digitised (turned into a sequence of numbers) and matched against coded dictionaries in order to identify the words.
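As a rough illustration of what "turned into a sequence of numbers" can mean, the sketch below samples a signal at regular intervals and rounds each sample to an integer. The function name `digitise`, the 8000 Hz sampling rate, the 256 quantisation levels and the 440 Hz test tone are all assumptions chosen for the example; the entry does not specify how any particular system digitises speech.

```python
import math


def digitise(signal, sample_rate_hz, duration_s, levels=256):
    """Sample `signal` (a function of time in seconds returning a value in
    [-1.0, 1.0]) and quantise each sample to an integer in [0, levels - 1].

    Hypothetical helper for illustration only.
    """
    n_samples = round(sample_rate_hz * duration_s)
    samples = []
    for i in range(n_samples):
        t = i / sample_rate_hz
        # Clamp to the expected range before quantising.
        value = max(-1.0, min(1.0, signal(t)))
        # Map [-1, 1] onto the integer range [0, levels - 1].
        samples.append(round((value + 1.0) / 2.0 * (levels - 1)))
    return samples


def tone(t):
    """A 440 Hz sine wave standing in for a microphone signal (assumption)."""
    return math.sin(2 * math.pi * 440 * t)


numbers = digitise(tone, sample_rate_hz=8000, duration_s=0.01)
```

Here `numbers` is a list of 80 integers, one per sample, which is the kind of raw material a recogniser would go on to compare against its stored dictionary.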

Most systems must be "trained", requiring samples of all the actual words that will be spoken by the user of the system. The sample words are digitised, stored in the computer and matched against words spoken later. More sophisticated systems require voice samples, but not of every word. The system uses the voice samples in conjunction with dictionaries of larger vocabularies to match the incoming words. Yet other systems aim to be "speaker-independent", i.e. they will recognise words in their vocabulary from any speaker without training.
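The simplest "trained" scheme described above can be sketched as template matching: store one digitised sample per word, then pick the stored word closest to the incoming one. The entry does not name a matching method; the sketch below assumes dynamic time warping (DTW), a technique that tolerates the same word being spoken faster or slower. The names `dtw_distance`, `recognise` and `templates`, and the tiny number sequences, are hypothetical.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two sequences of numbers.

    Each element of one sequence may align with one or more elements of
    the other, so differences in speaking rate are absorbed.
    """
    inf = float("inf")
    # cost[i][j] = best alignment cost of a[:i] against b[:j]
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[len(a)][len(b)]


def recognise(sample, templates):
    """Return the trained word whose template is nearest to `sample`."""
    return min(templates, key=lambda word: dtw_distance(sample, templates[word]))


# "Training": one stored, digitised sample per word (made-up values).
templates = {
    "yes": [1, 5, 9, 5, 1],
    "no": [9, 9, 2, 2, 2],
}

# The same shape as "yes", but spoken more slowly.
incoming = [1, 1, 5, 9, 9, 5, 1]
result = recognise(incoming, templates)
```

A scheme like this only knows the words it was trained on and the voice it was trained with, which is why the entry distinguishes it from systems that need fewer samples or none at all.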

Another variation is the degree to which systems can cope with connected speech. People tend to run words together, e.g. "next week" becomes "neksweek" (the "t" is dropped). For a voice recognition system to identify words in connected speech it must take into account the way words are modified by the preceding and following words.
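The "next week" example can be sketched as a single context-dependent rule: a word-final "t" is dropped when it sits between two consonants across a word boundary. This is a deliberately crude assumption made for illustration, working on rough phonetic spellings like the one in the entry; the function name `connect` and the rule itself are hypothetical, and real speech involves many more such effects.

```python
CONSONANTS = set("bcdfghjklmnpqrstvwxyz")


def connect(words):
    """Join rough phonetic spellings into connected speech.

    Applies one illustrative rule: drop a word-final 't' when the letter
    before it and the first letter of the next word are both consonants.
    """
    out = []
    for i, word in enumerate(words):
        next_word = words[i + 1] if i + 1 < len(words) else ""
        drops_t = (
            len(word) >= 2
            and word[-1] == "t"
            and word[-2] in CONSONANTS
            and next_word != ""
            and next_word[0] in CONSONANTS
        )
        out.append(word[:-1] if drops_t else word)
    return "".join(out)


spoken = connect(["nekst", "week"])  # "neksweek"
```

A recogniser faces the reverse problem: given "neksweek", it has to consider that "neks" may be "nekst" with its final sound removed by the following word, so its dictionary lookup cannot treat each word in isolation.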

It has been said (in 1994) that computers will need to be something like 1000 times faster before large-vocabulary (a few thousand words), speaker-independent, connected-speech voice recognition will be feasible.

Last updated: 1995-05-05
