VMEKNet: Visual Memory and External Knowledge Based Network for Medical Report Generation.

Pacific Rim International Conference on Artificial Intelligence (PRICAI) (2022)

Abstract
The goal of medical report generation is to produce, for a given medical image, a report containing the detailed descriptions of body parts and the diagnostic findings that a radiologist would provide. Automating this task can greatly reduce radiologists' workload and help patients receive treatment sooner. However, the task still faces two main limitations. First, the gap between image semantic features and text semantic features limits the accuracy of the generated reports. Second, different medical images share a large number of similar features, and existing methods do not exploit them efficiently or adequately. To address these problems, we propose VMEKNet, a medical report generation model that integrates visual memory and external knowledge. Specifically, we propose two novel modules. The TF-IDF Embedding (TIE) module incorporates external knowledge into the feature extraction stage via the TF-IDF algorithm, and the Visual Memory (VIM) module makes full use of previous image features to help the model extract more accurate medical image features. A standard Transformer then processes the image and text features and generates the full medical report. Experimental results on the IU X-Ray benchmark dataset demonstrate that the proposed model outperforms previous works on both natural language generation metrics and practical clinical diagnosis.
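The abstract names the TF-IDF algorithm but does not spell out how the TIE module uses it. As background only, the following is a minimal sketch of plain TF-IDF term weighting, assuming a whitespace-tokenized corpus and the unsmoothed formula tf × log(N / df). The function name `tf_idf` and the toy sentences are hypothetical illustrations, not taken from the paper or from IU X-Ray, and this is not the authors' implementation.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Weight = (term count / document length) * log(N / document frequency).
    This is the textbook unsmoothed variant, chosen here as an assumption.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy corpus for illustration only.
reports = [
    "the lungs are clear".split(),
    "the heart size is normal".split(),
    "the lungs show mild opacity".split(),
]
weights = tf_idf(reports)
```

In this sketch a term that appears in every document, such as "the", receives weight zero, while a term unique to one report is weighted more heavily than one shared across reports. That is the property that makes TF-IDF useful for highlighting report-specific terms.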
Keywords
Medical report generation, Transformer, TF-IDF algorithm, Visual memory