Comparative evaluation of recent universal adversarial perturbations in image classification

The vulnerability of Convolutional Neural Networks (CNNs) to adversarial samples has recently attracted significant attention in the machine learning community. In particular, recent studies have revealed the existence of universal adversarial perturbations (UAPs), which are image-agnostic and highly transferable across different CNN models. This survey focuses on recent advances in UAPs for the image classification task. We divide UAPs into two categories, noise-based attacks and generator-based attacks, and provide a comprehensive overview of representative methods in each category. After presenting the computational details of these methods, we summarize their strengths and weaknesses, and we review the various loss functions employed for learning UAPs. We then evaluate these loss functions within consistent training frameworks, both noise-based and generator-based, across a wide range of attack settings: black-box and white-box attacks, targeted and untargeted attacks, and attacks against defense mechanisms. Our quantitative results yield several findings on the effectiveness of different loss functions, the choice of surrogate CNN model, the impact of training data and its size, and the training framework used to craft universal attackers. We also provide visualizations of the learned perturbations. Finally, we outline avenues for future research in three key areas: crafting UAPs, understanding UAPs, and defending against UAPs.
Keywords: Adversarial attacks, Universal adversarial perturbations, Black-box attacks, Untargeted attacks, Targeted attacks
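To make the noise-based attack family concrete, the sketch below trains a single shared perturbation that is added to every input, illustrating the defining property of a UAP (one image-agnostic perturbation, constrained to an L-infinity ball). This is a hedged toy example, not any specific method from the survey: a linear softmax classifier on random data stands in for a CNN, and the update rule (per-sample signed-gradient ascent on the cross-entropy loss, followed by projection onto the epsilon ball) is one simple instance of the noise-based recipe.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear softmax classifier standing in for a surrogate CNN
# (assumption: any differentiable model fits the same recipe).
W = rng.normal(size=(10, 3))  # 10 input features -> 3 classes

def predict(X):
    return (X @ W).argmax(axis=1)

def grad_wrt_input(x, y):
    """Gradient of softmax cross-entropy loss w.r.t. the input x."""
    z = x @ W
    p = np.exp(z - z.max())
    p /= p.sum()
    p[y] -= 1.0          # dL/dz for softmax cross-entropy
    return W @ p         # chain rule back through z = x @ W

# Small synthetic dataset, labeled by the clean model itself.
X = rng.normal(size=(200, 10))
y = predict(X)

eps, step = 0.5, 0.05    # L-infinity budget and ascent step size
delta = np.zeros(10)     # the single, image-agnostic perturbation
for _ in range(50):      # epochs over the data
    for xi, yi in zip(X, y):
        # Ascend each sample's loss with the *shared* delta,
        # then project back onto the L-infinity ball of radius eps.
        delta += step * np.sign(grad_wrt_input(xi + delta, yi))
        delta = np.clip(delta, -eps, eps)

# Fraction of samples whose prediction the one perturbation flips.
fooling_rate = (predict(X + delta) != y).mean()
```

Because `delta` is shared across all samples, the loop implicitly trades off the samples against each other, which is exactly what distinguishes a universal perturbation from a per-image adversarial example computed independently for each input.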