A Neural Speech Decoding Framework Leveraging Deep Learning and Speech Synthesis.

bioRxiv : the preprint server for biology(2023)

引用 0|浏览9
暂无评分
摘要
Decoding human speech from neural signals is essential for brain-computer interface (BCI) technologies restoring speech function in populations with neurological deficits. However, it remains a highly challenging task, compounded by the scarce availability of neural signals with corresponding speech, data complexity, and high dimensionality, and the limited publicly available source code. Here, we present a novel deep learning-based neural speech decoding framework that includes an ECoG Decoder that translates electrocorticographic (ECoG) signals from the cortex into interpretable speech parameters and a novel differentiable Speech Synthesizer that maps speech parameters to spectrograms. We develop a companion audio-to-audio auto-encoder consisting of a Speech Encoder and the same Speech Synthesizer to generate reference speech parameters to facilitate the ECoG Decoder training. This framework generates natural-sounding speech and is highly reproducible across a cohort of 48 participants. Among three neural network architectures for the ECoG Decoder, the 3D ResNet model has the best decoding performance (PCC=0.804) in predicting the original speech spectrogram, closely followed by the SWIN model (PCC=0.796). Our experimental results show that our models can decode speech with high correlation even when limited to only causal operations, which is necessary for adoption by real-time neural prostheses. We successfully decode speech in participants with either left or right hemisphere coverage, which could lead to speech prostheses in patients with speech deficits resulting from left hemisphere damage. Further, we use an occlusion analysis to identify cortical regions contributing to speech decoding across our models. Finally, we provide open-source code for our two-stage training pipeline along with associated preprocessing and visualization tools to enable reproducible research and drive research across the speech science and prostheses communities.
更多
查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要