New speech recognition software outperforms conventional commercial systems
Published: July 23, 2013
INDIANAPOLIS - Companies could improve the voice command technology in their products thanks to a Purdue Research Park-based firm whose speech recognition software is more accurate than currently available systems.
Michael A. Stokes, president and CEO of Waveform Communication LLC, said automatic speech recognition software is used in several industries.
"Voice command technology is used for medical transcriptions in hospitals and clinics, for hands-free dialing and GPS directions in vehicles, and for voice-activated menu prompts at call centers," he said. "Unfortunately, current technology operates at 70 percent to 80 percent accuracy across all speakers, and people can become frustrated with errors and prompts to repeat information. Automatic speech recognition software still is not widely used because of this lack of performance."
Waveform Communication has developed a new speech recognition engine that correctly identifies 91 percent of vowels across 45 speakers. The engine was built by interns Binhao Lin and Ellen Wongso, students in Purdue University's College of Science. They used a model developed by Stokes that achieved 99.8 percent accuracy on the Peterson and Barney dataset, which is the most-cited database of vowel pronunciations.
The first applications developed from the Waveform Communication engine will focus on "Yes/No" responses from the user, and Stokes said testing could be completed quickly.
"Voice-activated menus can be built from these apps to take users where they need to go by using only these responses," he said. "Even if it takes four steps instead of three to reach the goal, the users will appreciate the error-free experience."
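To illustrate the idea of a menu driven only by "Yes/No" responses, the following is a minimal sketch in Python. It is not Waveform Communication's software: the `Question` class, the `navigate` function and the call-center menu are hypothetical, and the answers are assumed to have already been recognized as the words "yes" or "no".

```python
from dataclasses import dataclass
from typing import Iterable, Tuple, Union


@dataclass
class Question:
    """One yes/no prompt; each branch is another Question or a destination name."""
    prompt: str
    yes: Union["Question", str]
    no: Union["Question", str]


def navigate(menu: Question, answers: Iterable[str]) -> Tuple[str, int]:
    """Follow recognized "yes"/"no" answers; return (destination, steps taken)."""
    node: Union[Question, str] = menu
    steps = 0
    for answer in answers:
        if isinstance(node, str):
            break  # destination already reached; ignore extra answers
        if answer not in ("yes", "no"):
            raise ValueError(f"expected 'yes' or 'no', got {answer!r}")
        node = node.yes if answer == "yes" else node.no
        steps += 1
    if not isinstance(node, str):
        raise ValueError("answers ended before a destination was reached")
    return node, steps


# Hypothetical call-center menu, invented for this example.
MENU = Question(
    prompt="Is your call about billing?",
    yes=Question("Do you want to pay a bill?", "payments", "billing questions"),
    no=Question("Do you need technical support?", "technical support", "operator"),
)

destination, steps = navigate(MENU, ["no", "yes"])
print(destination, steps)
```

In this sketch every destination is reached in two answers. A menu with more destinations would need more questions per path, which is the trade-off Stokes describes between extra steps and fewer recognition errors.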
Stokes said the "Yes/No" applications could prepare users for other apps that have larger vocabularies, including numbers.
"Waveform Communication has built this speech recognition engine, which gives us control of the algorithm programming, and we have a working model of vowel recognition," he said. "Because of this, new improvements and developments could happen in hours rather than months or years."
About Waveform Communication LLC
The mission of Waveform Communication LLC is to promote and develop serviceable applications based on the Waveform Model of Vowel Perception and Production. Its objectives include improving speech in noise by 8-10 percent, identifying talkers from 10-15 ms waveform displays at 100 percent accuracy, developing speech recognition algorithms, hearing assistance products and audio products, and contributing to communication theory.
About Purdue Research Park
The Purdue Research Park, with four locations across Indiana, has the largest university-affiliated business incubation complex in the country. The parks, located in West Lafayette, Indianapolis, Merrillville and New Albany, are home to about 200 companies that employ 4,000 people.
Purdue Research Park contact: Steve Martin, 765-588-3342, firstname.lastname@example.org
Source: Michael A. Stokes, 317-902-9834, email@example.com