Google AI researchers working with the ALS Therapy Development Institute today shared details about Project Euphonia, a speech-to-text transcription service for people with speech impairments. They also say their approach can improve automatic speech recognition for people with non-native English accents.
People with amyotrophic lateral sclerosis (ALS) often have slurred speech, but existing AI systems are typically trained on voice data from speakers without any speech impairment or accent.
The new approach succeeds primarily because of the introduction of small amounts of data representing people with accents and ALS.
“We show that 71% of the improvement comes from only 5 minutes of training data,” according to a paper published on arXiv July 31 titled “Personalizing ASR for Dysarthric and Accented Speech with Limited Data.”
Personalized models were able to achieve 62% and 35% relative word error rate (WER) improvement for ALS and accents, respectively.
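For readers unfamiliar with the metric, WER is conventionally computed as the word-level edit distance between a reference transcript and the recognizer's output, divided by the number of reference words, and a relative improvement is the reduction in WER expressed as a fraction of the baseline. The following is a minimal sketch of that standard definition, not the paper's evaluation code; the function names are hypothetical and the numbers in the usage lines are invented for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_improvement(baseline_wer: float, personalized_wer: float) -> float:
    """Fraction by which the personalized model reduces the baseline WER."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - personalized_wer) / baseline_wer


# Illustrative values only, not figures from the paper.
one_substitution = word_error_rate("the cat sat on the mat", "the cat sat on a mat")
example_gain = relative_wer_improvement(0.50, 0.19)
```

Under this definition, a model that cuts a hypothetical baseline WER of 50% down to 19% would show a relative improvement of about 62%, even though the absolute drop is 31 percentage points.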
The ALS speech data set consists of 36 hours of audio from 67 people with ALS, collected in collaboration with the ALS Therapy Development Institute.
The non-native English speaker data set is called L2 Arctic and has 20 recordings of utterances, each lasting one hour.
Project Euphonia also uses techniques from Parrotron, an AI tool for people with speech impediments introduced in July, as well as fine-tuning techniques.
Written by 12 coauthors, the work is being presented at the International Speech Communication Association's conference, Interspeech 2019, which takes place September 15-19 in Graz, Austria.
“This paper’s approach overcomes data scarcity by beginning with a base model trained on thousands of hours of standard speech. It gets around sub-group heterogeneity by training personalized models,” the paper reads.
The research, which a Google AI blog post highlighted today, follows the introduction of Project Euphonia and other initiatives in May, such as Live Relay, a feature to make phone calls easier for deaf people, and Project Diva, an effort to make Google Assistant accessible for nonverbal people.
Google is soliciting data from people with ALS to improve its model’s accuracy and is working on next steps for Project Euphonia, such as using phoneme errors to reduce word error rates.