CLIPath: Fine-tune CLIP with Visual Feature Fusion for Pathology Image Analysis Towards Minimizing Data Collection Efforts

2023 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW 2023)

Abstract
Contrastive Language-Image Pre-training (CLIP) has shown its ability to learn distinctive visual representations and generalize to various downstream vision tasks. However, its applicability to the classification of pathology images with limited labeled data remains under study due to the large domain shift between natural images in the source domain and small-scale pathology images in the target domain, as well as overfitting issues. In this work, we first explore the zero-shot transferability of CLIP on pathology classification tasks and benchmark its performance. We then propose Residual Feature Connection (RFC) to fine-tune CLIP with a small number of trainable parameters. RFC fuses the task-specific knowledge learned from the target domain with the original knowledge pre-trained in CLIP. We show that RFC can adapt pre-trained CLIP to downstream pathology tasks and achieve good performance with just a few annotated samples. Specifically, RFC achieves over 19% improvement in accuracy when using only 0.1% of the labeled data in PCam, with only 10 minutes of fine-tuning on a single GPU.
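The abstract does not spell out the implementation of RFC, but the described idea, a small trainable module whose output is fused residually with frozen CLIP features, could be sketched roughly as below. The module name, bottleneck size, and mixing weight `alpha` are illustrative assumptions, not the authors' exact design.

```python
import torch.nn as nn

class ResidualFeatureConnection(nn.Module):
    """Illustrative residual adapter: a small trainable bottleneck whose
    output is blended with the frozen CLIP visual features."""

    def __init__(self, dim: int = 512, bottleneck: int = 64, alpha: float = 0.5):
        super().__init__()
        # Only this bottleneck is trained; the CLIP backbone stays frozen.
        self.adapter = nn.Sequential(
            nn.Linear(dim, bottleneck),
            nn.ReLU(inplace=True),
            nn.Linear(bottleneck, dim),
        )
        self.alpha = alpha  # mixing weight between adapted and original features

    def forward(self, clip_features):
        adapted = self.adapter(clip_features)
        # Residual fusion: retain pre-trained knowledge, add task-specific signal.
        return self.alpha * adapted + (1.0 - self.alpha) * clip_features
```

In such a setup, only the adapter parameters would be updated during fine-tuning, which is consistent with the paper's claim of adapting CLIP with few labeled samples and minutes of training on a single GPU.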