Bootstrapping Chest CT Image Understanding by Distilling Knowledge from X-ray Expert Models
arXiv (2024)
Abstract
Radiologists highly desire fully automated versatile AI for medical imaging
interpretation. However, the lack of extensively annotated large-scale
multi-disease datasets has hindered the achievement of this goal. In this
paper, we explore the feasibility of leveraging language as a naturally
high-quality supervision for chest CT imaging. In light of the limited
availability of image-report pairs, we bootstrap the understanding of 3D chest
CT images by distilling chest-related diagnostic knowledge from an extensively
pre-trained 2D X-ray expert model. Specifically, we propose a language-guided
retrieval method to match each 3D CT image with its semantically closest 2D
X-ray image, and perform pair-wise and semantic relation knowledge
distillation. Subsequently, we use contrastive learning to align images and
reports within the same patient while distinguishing them from other
patients. However, a challenge arises when patients share semantically
similar diagnoses, such as healthy patients, whose pairs can be misleading
if treated as negatives. We introduce a robust contrastive learning
strategy that identifies and corrects these false negatives. We train our
model on over 12,000 pairs of
chest CT images and radiology reports. Extensive experiments across multiple
scenarios, including zero-shot learning, report generation, and fine-tuning
processes, demonstrate the model's feasibility in interpreting chest CT images.
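The two training ingredients named in the abstract, language-guided retrieval of a semantically matched X-ray and contrastive alignment with false-negative correction, can be sketched as below. This is a minimal illustration, not the paper's implementation: the embedding dimensions, the `fn_threshold` parameter, and the rule of relabeling pairs with near-identical report embeddings as extra positives are assumptions made for the sketch.

```python
import numpy as np

def l2_normalize(x, axis=-1):
    """Project embeddings onto the unit sphere so dot products are cosines."""
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

def retrieve_xray(ct_report_emb, xray_report_embs):
    """Language-guided retrieval (sketch): pick the X-ray whose report
    embedding is closest in cosine similarity to the CT report embedding."""
    sims = l2_normalize(xray_report_embs) @ l2_normalize(ct_report_emb)
    return int(np.argmax(sims))

def robust_contrastive_loss(img_emb, rep_emb, tau=0.07, fn_threshold=0.9):
    """InfoNCE-style image-report alignment with false-negative correction
    (sketch): off-diagonal pairs whose reports are nearly identical
    (cosine > fn_threshold, an assumed rule) are counted as positives
    rather than negatives."""
    img = l2_normalize(img_emb)
    rep = l2_normalize(rep_emb)
    logits = img @ rep.T / tau              # (N, N) image-report logits
    rep_sim = rep @ rep.T                   # report-report similarity
    pos_mask = rep_sim > fn_threshold       # diagonal (true pair) included
    # log-softmax over each image's row of report candidates
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # average negative log-likelihood over all positives per image
    loss = -(pos_mask * log_prob).sum(axis=1) / pos_mask.sum(axis=1)
    return float(loss.mean())
```

With this relabeling, two healthy patients with effectively identical reports no longer push each other apart in the embedding space, which is the failure mode the robust contrastive learning is meant to fix.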