From thought to speech: Revolutionary technology gives voice to the paralyzed
As Interesting Engineering reports, researchers from UC Berkeley and UC San Francisco have achieved a groundbreaking advancement: a technology that enables individuals with severe paralysis to communicate using “naturalistic speech.”
This breakthrough has the potential to revolutionize communication for people with speech disabilities by allowing them to speak through brain signals in real time.
For years, speech neuroprostheses have struggled with latency, slowing down communication for paralyzed individuals. This new method all but eliminates that delay, offering quicker and more fluid communication.
The key to this advancement lies in an AI-powered streaming technique. By decoding brain signals directly from the motor cortex, the area of the brain that controls speech movements, the AI generates audible speech nearly instantly.
“Our streaming approach brings the same rapid speech decoding capacity of devices like Alexa and Siri to neuroprostheses,” said Gopala Anumanchipalli, co-principal investigator of the study. “For the first time, we were able to enable near-synchronous voice streaming, resulting in more natural, fluent speech synthesis.”
The team used AI to interpret brain activity related to speech and translate it into spoken words. They trained the system by having a participant, Ann, silently try to say displayed phrases. AI then generated simulated audio based on her brain activity, without Ann vocalizing at any point.
Unlike previous brain-computer interface (BCI) systems, which suffered from an 8-second delay, this new system achieves near-instantaneous output. “Within 1 second, we’re getting the first sound out,” said Anumanchipalli.
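The difference between the two approaches can be illustrated with a toy sketch. This is not the researchers' actual model: the function and variable names below are hypothetical, and the "decoder" is a placeholder for what would in reality be a trained neural network operating on brain-signal features. The point is the structure: a streaming system emits audio chunk by chunk as neural data arrives, rather than waiting for the entire utterance before producing any sound.

```python
# Illustrative sketch only, not the study's implementation.
# decode_chunk and stream_decode are hypothetical names; the placeholder
# "decoder" just scales values, standing in for a trained model.

def decode_chunk(neural_chunk):
    """Stand-in for a trained decoder: maps one chunk of brain-signal
    features to one chunk of synthesized-audio samples."""
    return [x * 0.5 for x in neural_chunk]  # placeholder transform

def stream_decode(neural_stream):
    """Yield audio chunks as neural data arrives, instead of decoding
    the whole utterance at once (the source of multi-second delays)."""
    for chunk in neural_stream:
        yield decode_chunk(chunk)

# Simulated incoming brain-signal chunks
incoming = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

audio = []
for out in stream_decode(incoming):
    # In a streaming system, the first chunk of audio is playable here,
    # before the remaining chunks have even been decoded.
    audio.extend(out)

print(audio)  # [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
```

A batch system would instead collect all of `incoming` before calling the decoder once, which is why earlier interfaces left a long gap between intended and audible speech.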
This breakthrough marks a significant step toward making natural speech possible for people with speech paralysis, improving their ability to communicate with the world in real time. Researchers are now working to further refine the technology, with future plans to add emotional expressiveness to the synthesized voice.
By Naila Huseynova