PEMMA: Parameter-Efficient Multi-Modal Adaptation for Medical Image Segmentation
arXiv (2024)
Abstract
Imaging modalities such as Computed Tomography (CT) and Positron Emission
Tomography (PET) are key in cancer detection, inspiring Deep Neural Network
(DNN) models that merge these scans for tumor segmentation. When both CT and
PET scans are available, it is common to combine them as two channels of the
input to the segmentation model. However, this method requires both scan types
during training and inference, which poses a challenge given the limited
availability of PET scans and sometimes restricts the process to CT scans
only. Hence, there is a need for a flexible DNN architecture that can be
trained/updated using only CT scans but can effectively utilize PET scans when
they become available. In this work, we propose a parameter-efficient
multi-modal adaptation (PEMMA) framework for lightweight upgrading of a
transformer-based segmentation model trained only on CT scans to also
incorporate PET scans. The benefits of the proposed approach are two-fold.
Firstly, we leverage the inherent modularity of the transformer architecture
and perform low-rank adaptation (LoRA) of the attention weights to achieve
parameter-efficient adaptation. Secondly, since the PEMMA framework attempts to
minimize cross-modal entanglement, it is possible to subsequently update the
combined model using only one modality without causing catastrophic forgetting
of the other modality. Our proposed method achieves results comparable to
early fusion techniques with just 8% of the trainable parameters, most notably
a remarkable +28% improvement in the Dice score on PET scans when trained on a
single modality.
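To make the LoRA step above concrete, here is a minimal PyTorch sketch of low-rank adaptation applied to a frozen attention projection, in the spirit of the abstract. It is an illustration under assumptions, not the paper's actual implementation: the class name LoRALinear, the rank and alpha values, and the layer dimensions are all hypothetical.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank update:
    y = W x + (alpha / rank) * B A x."""
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():   # freeze the pretrained (CT-only) weights
            p.requires_grad = False
        self.lora_A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Frozen path + low-rank residual; B starts at zero, so the
        # adapted model initially behaves exactly like the base model.
        return self.base(x) + self.scale * (x @ self.lora_A.T @ self.lora_B.T)

# Example: adapt a query projection of a pretrained attention block.
q_proj = nn.Linear(768, 768)            # stands in for a CT-trained weight
adapted_q = LoRALinear(q_proj, rank=8)  # only ~2 * 8 * 768 params are trainable
tokens = torch.randn(2, 196, 768)       # (batch, sequence, embed_dim)
print(adapted_q(tokens).shape)          # -> torch.Size([2, 196, 768])
```

Because only the low-rank factors A and B receive gradients while the pretrained attention weights stay frozen, the trainable-parameter count remains a small fraction of the full model, consistent with the 8% figure quoted in the abstract.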