Can We Trust the Unlabeled Target Data? Towards Backdoor Attack and Defense on Model Adaptation
CoRR (2024)
Abstract
Model adaptation tackles the distribution shift problem with a pre-trained
model instead of raw data, becoming a popular paradigm due to its great privacy
protection. Existing methods always assume adapting to a clean target domain,
overlooking the security risks of unlabeled samples. In this paper, we explore
the potential backdoor attacks on model adaptation launched by well-designed
poisoned target data. Concretely, we provide two backdoor triggers with two
poisoning strategies for different prior knowledge owned by attackers. These
attacks achieve a high success rate and keep the normal performance on clean
samples in the test stage. To defend against backdoor embedding, we propose a
plug-and-play method named MixAdapt that can be combined with existing
adaptation algorithms. Experiments across commonly used benchmarks and adaptation methods
demonstrate the effectiveness of MixAdapt. We hope this work will shed light on
the safety of learning with unlabeled data.