Manipulating Out-Domain Uncertainty Estimation in Deep Neural Networks via Targeted Clean-Label Poisoning

Proceedings of the 32nd ACM International Conference on Information and Knowledge Management (CIKM 2023)

Abstract
Robust out-domain uncertainty estimation has attracted growing attention for its ability to provide adversary-resistant uncertainty estimates on out-domain samples. However, existing work on robust uncertainty estimation focuses mainly on evasion attacks, which occur at test time; the threat of poisoning attacks against uncertainty models remains largely unexplored. Unlike evasion attacks, poisoning attacks do not need to modify test data and are therefore more practical in real-world applications. In this work, we systematically investigate the robustness of state-of-the-art uncertainty estimation algorithms against data poisoning attacks, with the ultimate objective of developing robust uncertainty training methods. In particular, we focus on attacking out-domain uncertainty estimation. The proposed attack corrupts the model's training process so that a fake high-confidence region forms around the targeted out-domain sample, which the model would otherwise have rejected due to low confidence. Worse still, the attack is clean-label and targeted: it leaves the poisoned data with clean labels and attacks a specific targeted test sample without degrading overall model performance. We evaluate the proposed attack on several image benchmark datasets and on a real-world COVID-19 misinformation detection task. Extensive experimental results across these tasks suggest that state-of-the-art uncertainty estimation methods can be extremely vulnerable to, and easily corrupted by, the proposed attack.
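
The abstract does not spell out how the clean-label poisons are crafted. As a rough illustration only, the sketch below shows one common clean-label poisoning strategy, feature collision (in the spirit of Shafahi et al.'s "Poison Frogs"), adapted to an out-domain target. All names here (craft_clean_label_poisons, feature_extractor, target_ood, base_images) are hypothetical, and this is not necessarily the paper's actual algorithm.

```python
# Hypothetical sketch of clean-label poison crafting via feature collision.
# Assumes a PyTorch feature extractor; not the paper's exact method.
import torch
import torch.nn as nn


def craft_clean_label_poisons(feature_extractor: nn.Module,
                              base_images: torch.Tensor,   # in-domain images; their labels stay untouched
                              target_ood: torch.Tensor,    # one out-domain target, shape (1, C, H, W)
                              eps: float = 8 / 255,        # L-infinity perturbation budget
                              steps: int = 200,
                              lr: float = 0.01) -> torch.Tensor:
    """Perturb base images within an eps ball so their features move toward
    the target's features. Labels are never changed, so the poisons look
    clean, yet training on them can pull the out-domain target into a
    high-confidence region."""
    feature_extractor.eval()
    with torch.no_grad():
        target_feat = feature_extractor(target_ood)

    delta = torch.zeros_like(base_images, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)

    for _ in range(steps):
        poisons = (base_images + delta).clamp(0.0, 1.0)
        feats = feature_extractor(poisons)
        # Pull poison features toward the out-domain target's features.
        loss = ((feats - target_feat) ** 2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
        # Project back into the imperceptible perturbation budget.
        delta.data.clamp_(-eps, eps)

    return (base_images + delta).detach().clamp(0.0, 1.0)
```

In such a scheme, the crafted poisons would be injected into the training set with their original labels; a model trained on the poisoned set may then assign spuriously high confidence to the targeted out-domain sample instead of rejecting it.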
Keywords
Out-Domain Detection, Uncertainty Estimation