A CNN-transformer hybrid approach for decoding visual neural activity into text

COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE(2022)

引用 11|浏览42
暂无评分
摘要
Background and Objective: Most studies used neural activities evoked by linguistic stimuli such as phrases or sentences to decode the language structure. However, compared to linguistic stimuli, it is more common for the human brain to perceive the outside world through non-linguistic stimuli such as natural images, so only relying on linguistic stimuli cannot fully understand the information perceived by the human brain. To address this, an end-to-end mapping model between visual neural activities evoked by non-linguistic stimuli and visual contents is demanded. Methods: Inspired by the success of the Transformer network in neural machine translation and the convolutional neural network (CNN) in computer vision, here a CNN-Transformer hybrid language decoding model is constructed in an end-to-end fashion to decode functional magnetic resonance imaging (fMRI) signals evoked by natural images into descriptive texts about the visual stimuli. Specifically, this model first encodes a semantic sequence extracted by a two-layer 1D CNN from the multi-time visual neural activity into a multi-level abstract representation, then decodes this representation, step by step, into an English sentence. Results: Experimental results show that the decoded texts are semantically consistent with the corresponding ground truth annotations. Additionally, by varying the encoding and decoding layers and modifying the original positional encoding of the Transformer, we found that a specific architecture of the Transformer is required in this work. Conclusions: The study results indicate that the proposed model can decode the visual neural activities evoked by natural images into descriptive text about the visual stimuli in the form of sentences. Hence, it may be considered as a potential computer-aided tool for neuroscientists to understand the neural mechanism of visual information processing in the human brain in the future. (C) 2021 Elsevier B.V. All rights reserved.
更多
查看译文
关键词
Deep learning, Brain decoding, Transformer, CNN, fMRI
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要