MGRW-Transformer: Multigranularity Random Walk Transformer Model for Interpretable Learning

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS (2023)

Abstract
Deep-learning models have been widely used in image recognition tasks due to their strong feature-learning ability. However, most current deep-learning models are "black box" systems that lack a semantic explanation of how they reached their conclusions, which makes it difficult to apply these methods to complex medical image recognition tasks. The vision transformer (ViT) is the most commonly used deep-learning model with a self-attention mechanism; unlike traditional convolutional networks, its attention maps reveal the regions that influence a decision, so ViT offers greater interpretability. However, medical images often contain lesions of variable size in different locations, which makes it difficult for a deep-learning model with a self-attention module to reach correct and explainable conclusions. We propose a multigranularity random walk transformer (MGRW-Transformer) model guided by an attention mechanism to find the regions that influence the recognition task. Our method divides the image into multiple subimage blocks and transfers them to the ViT module for classification. Simultaneously, the attention matrix output from the multiattention layer is fused with the multigranularity random walk module. Within the multigranularity random walk module, the segmented image blocks are used as nodes to construct an undirected graph, with the attention node serving as the starting node that guides the coarse-grained random walk. We appropriately divide the coarse blocks into finer ones to manage the computational cost and combine the results based on the importance of the discovered features. As a result, the model offers a semantic interpretation of the input image, a visualization of that interpretation, and insight into how the decision was reached. Experimental results show that our method improves classification performance on medical images while presenting an understandable interpretation for use by medical professionals.
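The abstract's core mechanism, patches as nodes of an undirected grid graph, with a random walk started from the most-attended node and biased by attention scores, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the attention vector `attn` (e.g., one row of a ViT class-token attention matrix), the 4-neighbourhood graph, the step count, and the function name are all assumptions for demonstration; the paper's multigranularity refinement of coarse blocks into finer ones is omitted.

```python
import numpy as np

def attention_guided_walk(attn, grid, steps=20, seed=0):
    """Random walk over a grid of image patches, biased by attention scores.

    attn : (grid*grid,) positive attention weight per patch (hypothetical
           input, e.g. one row of a ViT attention matrix).
    Returns the set of visited patch indices (a candidate salient region).
    """
    rng = np.random.default_rng(seed)
    attn = np.asarray(attn, dtype=float)
    start = int(np.argmax(attn))        # most-attended patch as starting node
    visited = {start}
    node = start
    for _ in range(steps):
        r, c = divmod(node, grid)
        # undirected 4-neighbourhood on the patch grid
        nbrs = [(r + dr, c + dc)
                for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1))
                if 0 <= r + dr < grid and 0 <= c + dc < grid]
        idx = [nr * grid + nc for nr, nc in nbrs]
        p = attn[idx] / attn[idx].sum() # transition prob. proportional to attention
        node = int(rng.choice(idx, p=p))
        visited.add(node)
    return visited
```

In a multigranularity extension, the patches visited most often would be re-split into finer blocks and the walk repeated at that finer scale, which is how the computational cost stays bounded while localizing small lesions.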
Keywords
Graph random walk, interpretable method, multigranularity formal analysis, self-attention mechanism, vision transformer (ViT)