Explanation-based data-free model extraction attacks

World Wide Web - Internet and Web Information Systems (2023)

Abstract
Deep learning (DL) has dramatically pushed the previous limits of various tasks, ranging from computer vision to natural language processing. Despite its success, the lack of model explanations thwarts the use of these techniques in life-critical domains, e.g., medical diagnosis and self-driving systems. To date, the core technology for addressing this explainability issue is explainable artificial intelligence (XAI). XAI methods produce human-understandable explanations by leveraging intermediate results of DL models, e.g., gradients and model parameters. While the effectiveness of XAI methods has been demonstrated in benign environments, their privacy implications under model extraction attacks (i.e., attacks on model confidentiality) remain to be studied. To this end, this paper proposes DMEAE, a Data-free Model Extraction Attack using Explanation guidance, to explore the privacy threats of XAI. Compared with previous works, DMEAE does not require collecting any data and utilizes a model explanation loss. Specifically, DMEAE creates synthetic data using a generative model trained with model explanation loss terms. Extensive evaluations verify the effectiveness and efficiency of the proposed attack strategy on the SVHN and CIFAR-10 datasets. We hope that our research can provide insights for the development of practical tools to balance the trade-off between privacy and model explanations.
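The pipeline the abstract describes (a generator synthesizing queries, a clone model trained to match both the victim's predictions and its explanations) can be illustrated with a minimal sketch. The code below is an assumption-laden illustration, not the authors' implementation: it assumes a PyTorch setting, vanilla input-gradient saliency as the explanation returned by the victim's XAI-enabled API, and an illustrative explanation-loss weight `lam`.

```python
# Hypothetical sketch of explanation-guided data-free extraction.
# Architectures, the saliency-style explanation, and loss weights are
# illustrative assumptions, not taken from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Generator(nn.Module):
    """Maps noise vectors to 32x32 RGB images (SVHN/CIFAR-10 shape)."""
    def __init__(self, z_dim=100):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(z_dim, 128 * 8 * 8), nn.ReLU(),
            nn.Unflatten(1, (128, 8, 8)),
            nn.Upsample(scale_factor=2), nn.Conv2d(128, 64, 3, padding=1), nn.ReLU(),
            nn.Upsample(scale_factor=2), nn.Conv2d(64, 3, 3, padding=1), nn.Tanh(),
        )

    def forward(self, z):
        return self.net(z)

def query_victim(victim, x):
    """Black-box query: returns predictions plus a gradient-based saliency
    explanation, mimicking an XAI-enabled prediction API. Outputs are
    detached, as the attacker never sees the victim's computation graph."""
    x = x.detach().requires_grad_(True)
    logits = victim(x)
    expl = torch.autograd.grad(logits.max(dim=1).values.sum(), x)[0]
    return logits.detach(), expl.detach()

def extraction_step(generator, student, victim, g_opt, s_opt,
                    z_dim=100, batch=64, lam=0.1):
    # (1) Generator step: synthesize inputs on which the student
    #     disagrees most with the (fixed) victim outputs.
    z = torch.randn(batch, z_dim)
    x = generator(z)
    v_logits, _ = query_victim(victim, x)
    g_loss = -F.l1_loss(F.softmax(student(x), 1), F.softmax(v_logits, 1))
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()

    # (2) Student step: imitate the victim's outputs AND its explanations.
    x = generator(z).detach().requires_grad_(True)
    v_logits, v_expl = query_victim(victim, x)
    s_logits = student(x)
    # create_graph=True lets the explanation-matching loss backpropagate
    # through the student's own input-gradient saliency.
    s_expl = torch.autograd.grad(s_logits.max(dim=1).values.sum(), x,
                                 create_graph=True)[0]
    kd = F.kl_div(F.log_softmax(s_logits, 1), F.softmax(v_logits, 1),
                  reduction="batchmean")
    loss = kd + lam * F.mse_loss(s_expl, v_expl)  # explanation loss term
    s_opt.zero_grad(); loss.backward(); s_opt.step()
```

Note that in a strict black-box setting the generator cannot backpropagate through the victim; the sketch sidesteps this by measuring disagreement against detached victim outputs, which still pushes the generator toward regions where the clone is wrong.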
Keywords
Deep neural network,Model explanation,Black-box,Model extraction attack