Enhancing the Antidote: Improved Pointwise Certifications against Poisoning Attacks
AAAI (2023)
Abstract
Poisoning attacks can disproportionately influence model behaviour by making
small changes to the training corpus. While defences against specific poisoning
attacks do exist, they generally provide no guarantees, leaving them
vulnerable to novel attacks. In contrast, by examining worst-case
behaviours, Certified Defences can guarantee the
robustness of a sample against adversarial attacks that modify a finite number of
training samples, a guarantee known as pointwise certification. We achieve this by
exploiting both Differential Privacy and the Sampled Gaussian Mechanism to
ensure the invariance of prediction for each testing instance against finite
numbers of poisoned examples. In doing so, our model provides guarantees of
adversarial robustness that are more than twice as large as those provided by
prior certifications.
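
The abstract names the Sampled Gaussian Mechanism but does not spell it out. As a rough illustration only, the generic mechanism (Poisson subsampling of records, followed by Gaussian noise on a norm-bounded sum) might be sketched as below. This is not the authors' implementation: the function names, the per-record L2 clipping step, and the parameters q, max_norm and sigma are all assumptions introduced for this example.

```python
import math
import random


def clip(vec, max_norm):
    """Scale vec so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm <= max_norm or norm == 0.0:
        return list(vec)
    return [v * max_norm / norm for v in vec]


def sampled_gaussian_sum(records, q, max_norm, sigma, rng):
    """One hypothetical Sampled Gaussian Mechanism step.

    Each record is kept independently with probability q (Poisson
    subsampling), clipped to L2 norm max_norm, and summed. Gaussian noise
    with standard deviation sigma * max_norm is then added per coordinate.
    """
    dim = len(records[0])
    total = [0.0] * dim
    for rec in records:
        if rng.random() < q:  # keep this record with probability q
            for i, v in enumerate(clip(rec, max_norm)):
                total[i] += v
    # Noise scale is tied to the clipping bound, i.e. the sum's sensitivity.
    return [t + rng.gauss(0.0, sigma * max_norm) for t in total]


if __name__ == "__main__":
    rng = random.Random(0)
    data = [[3.0, 4.0], [0.6, 0.8], [-1.0, 2.0]]
    print(sampled_gaussian_sum(data, q=0.5, max_norm=1.0, sigma=1.0, rng=rng))
```

Because clipping bounds each record's contribution, adding, removing or altering one training record changes the pre-noise sum by a bounded amount. That bounded sensitivity is the property that differential privacy arguments of this kind typically rely on.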
Keywords
improved pointwise certifications, antidote