Friday, September 23, 2016

Speech Recognition Software

Speech recognition software is incredibly prevalent today, with uses that include computer dictation, automated call centers, assistive technology for people with disabilities, and smartphone assistants like Siri. What these all have in common is that they analyze the "phones," the actual units of sound that our voices produce when we speak. Speech recognition software is highly complex because its task is so difficult. It runs into many obstacles: how to separate slurred words, how to cope with differences between voices and pitch, how to put homophones in context, how to recover from a misheard sentence, and so on.


There are a few options for how speech recognition software can work. The first is simple pattern matching, where each spoken word is recognized in its entirety. The second is pattern and feature analysis, where a word is broken into parts and each part is recognized by its key features, such as the vowels it contains. The third is language modeling and statistical analysis, where knowledge of grammar and of the probability of certain words or sounds appearing in a specific order improves speed and accuracy. The last is artificial neural networks: brain-like computer models that are trained to recognize patterns.
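To make the language modeling idea a little more concrete, here is a minimal Python sketch of a bigram model that uses word-order probabilities to choose between the homophones "their" and "there." The tiny corpus, the function names, and the add-one smoothing are all assumptions made for illustration; real recognizers train on far more text and use more sophisticated models.

```python
from collections import Counter

# Tiny made-up training corpus (an assumption for this sketch).
CORPUS = [
    "they left their keys over there",
    "their dog is over there",
    "put the keys over there",
    "their keys are on the table",
]

def train_bigrams(sentences):
    """Count single words and adjacent word pairs; <s> marks a sentence start."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        words = ["<s>"] + sentence.split()
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))
    return unigrams, bigrams

def bigram_probability(prev, word, unigrams, bigrams):
    """Estimate P(word | prev) with add-one smoothing so unseen pairs are not zero."""
    vocab_size = len(unigrams)
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)

def pick_homophone(prev, candidates, unigrams, bigrams):
    """Choose the candidate spelling that is most likely to follow the previous word."""
    return max(
        candidates,
        key=lambda w: bigram_probability(prev, w, unigrams, bigrams),
    )

unigrams, bigrams = train_bigrams(CORPUS)
print(pick_homophone("over", ["their", "there"], unigrams, bigrams))
print(pick_homophone("<s>", ["their", "there"], unigrams, bigrams))
```

The two words sound identical, so the audio alone cannot separate them. The model picks whichever spelling it has seen more often after the preceding word.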

In general, speech recognition software first uses an analog-to-digital converter (ADC) to translate the vibrations you generate by speaking into digital data by measuring the sound wave at frequent intervals. It filters out unwanted background noise and separates the sound into different bands of frequency, or pitch. It adjusts the volume to a steady level and divides the signal into small pieces (the phones), which it matches to known phonemes, the smallest units of sound in a specific language. There are about 44 phonemes in the English language. It then runs the phonemes through a statistical model to determine what the user was saying and usually outputs the result as text.
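The first few steps above (sampling, leveling the volume, and cutting the signal into pieces) can be sketched roughly in Python. This is only an illustration under assumed values: the 8000 Hz sample rate, the 8-bit quantization, the 25 ms frame length, and every function name are hypothetical choices, and a pure sine tone stands in for a real voice. Noise removal and phoneme matching are left out.

```python
import math

SAMPLE_RATE = 8000  # measurements per second (assumed value)
FRAME_MS = 25       # length of each piece in milliseconds (assumed value)

def sample_wave(freq_hz, duration_s, amplitude, sample_rate=SAMPLE_RATE):
    """Measure a sine 'sound wave' at regular intervals, as an ADC would."""
    count = round(duration_s * sample_rate)
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate)
        for i in range(count)
    ]

def quantize(samples, bits=8):
    """Round each sample in the range -1..1 to a signed integer level."""
    levels = 2 ** (bits - 1) - 1  # 127 for 8 bits
    return [round(max(-1.0, min(1.0, s)) * levels) for s in samples]

def normalize(samples):
    """Scale the signal so its loudest point has magnitude 1.0."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return [0.0 for _ in samples]
    return [s / peak for s in samples]

def split_frames(samples, sample_rate=SAMPLE_RATE, frame_ms=FRAME_MS):
    """Cut the signal into short, equal-length pieces for later analysis."""
    size = sample_rate * frame_ms // 1000
    return [samples[i:i + size] for i in range(0, len(samples), size)]

quiet_tone = sample_wave(freq_hz=440, duration_s=0.1, amplitude=0.2)
digital = quantize(quiet_tone)
leveled = normalize(digital)
frames = split_frames(leveled)
print(len(quiet_tone), len(frames), len(frames[0]))
```

In a real recognizer, each of these short frames would then be analyzed by frequency and compared against models of the language's phonemes.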


References:

Content:
https://en.wikipedia.org/wiki/Speech_recognition
http://electronics.howstuffworks.com/gadgets/high-tech-gadgets/speech-recognition.htm
http://www.explainthatstuff.com/voicerecognition.html
https://msdn.microsoft.com/en-us/library/hh378337(v=office.14).aspx
http://scienceline.org/2014/08/ever-wondered-how-does-speech-to-text-software-work/

Images:
http://efv-solutions.com/wp-content/uploads/2015/09/Home.png
http://rethink-wireless.com/wp-content/uploads/2016/05/Siri_3359038b.jpg
