A Subspace Projective Clustering Approach for Backdoor Attack Detection and Mitigation in Deep Neural Networks

IEEE Transactions on Artificial Intelligence (2024)

Abstract
In a backdoor attack on a Deep Neural Network (DNN), an attacker implants a backdoor by manipulating the training dataset so that inputs containing a specific trigger are misclassified. Detecting and mitigating such attacks is challenging because only the attacker knows the trigger and the target class. Our study demonstrates that the representations (i.e., the neuron activations of a given DNN) of poisoned and genuine data lie in different subspaces, which implies that there exists a subspace in which the projections of the two kinds of data differ observably. To this end, we propose a method based on subspace projective clustering (SPC), which learns a subspace together with a projection-based weight vector by solving a projection maximization program; the optimized weight vector is then used in a clustering framework to infer which group each data point belongs to. Through theoretical analysis and experiments, we demonstrate the effectiveness of our method against backdoor attacks with different settings of poisoned samples on the GTSRB, ImageNet, VGGFace2, and PubFig datasets, in comparison with state-of-the-art methods. Our algorithm detects more than 90% of the infected classes and identifies 95% of the poisoned samples.
Keywords
Deep Neural Networks (DNNs), Backdoor Attacks, Backdoor Defense, Optimization, Machine Learning Security
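
The abstract's central idea, that poisoned and genuine activations separate once projected onto a suitable subspace, can be illustrated with a small sketch. This is not the paper's SPC program, whose exact formulation is not given in the abstract. As a simplified stand-in, the direction is taken to be the top principal component of one class's activations, and the grouping is a plain one-dimensional 2-means on the projections. The function name `split_by_projection`, the synthetic activations, and the assumption that poisoned samples form the minority group are all hypothetical choices made for illustration.

```python
import numpy as np

def split_by_projection(acts, n_iter=50):
    """Split one class's activations into two groups along a single direction.

    Simplified stand-in (NOT the paper's SPC program): the direction is the
    top principal component, and the groups come from 1-D 2-means.
    Returns (labels, proj), where label 1 marks the smaller group.
    """
    centered = acts - acts.mean(axis=0)
    # Top right-singular vector = direction of maximum projected variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    proj = centered @ vt[0]

    lo, hi = proj.min(), proj.max()
    if np.isclose(lo, hi):
        # Degenerate case: all projections coincide, nothing to split.
        return np.zeros(len(proj), dtype=int), proj

    # 1-D 2-means, initialized at the two extremes of the projections.
    for _ in range(n_iter):
        near_hi = np.abs(proj - hi) < np.abs(proj - lo)
        new_lo, new_hi = proj[~near_hi].mean(), proj[near_hi].mean()
        if np.isclose(new_lo, lo) and np.isclose(new_hi, hi):
            break
        lo, hi = new_lo, new_hi

    labels = near_hi.astype(int)
    # Assumption: poisoned samples are the minority, so flag the smaller group.
    if labels.sum() > len(labels) / 2:
        labels = 1 - labels
    return labels, proj

# Synthetic "activations": 180 genuine samples plus 20 poisoned samples that
# are shifted along a few coordinates (a hypothetical trigger signature).
rng = np.random.default_rng(0)
dim = 32
clean = rng.normal(0.0, 1.0, size=(180, dim))
shift = np.zeros(dim)
shift[:4] = 6.0
poisoned = rng.normal(0.0, 1.0, size=(20, dim)) + shift
acts = np.vstack([clean, poisoned])

labels, proj = split_by_projection(acts)
recall = labels[180:].mean()          # fraction of poisoned samples flagged
false_pos_rate = labels[:180].mean()  # fraction of genuine samples flagged
print(f"recall={recall:.2f}, false positive rate={false_pos_rate:.2f}")
```

In this toy setting the shift dominates the variance, so a single principal direction is enough to expose the two groups. Real activations would likely need the learned subspace and weight vector that the abstract describes.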