MAZE: Data-Free Model Stealing Attack Using Zeroth-Order Gradient Estimation
2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021(2021)
摘要
High quality Machine Learning (ML) models are often considered valuable intellectual property by companies. Model Stealing (MS) attacks allow an adversary with blackbox access to a ML model to replicate its functionality by training a clone model using the predictions of the target model for different inputs. However, best available existing MS attacks fail to produce a high-accuracy clone without access to the target dataset or a representative dataset necessary to query the target model. In this paper, we show that preventing access to the target dataset is not an adequate defense to protect a model. We propose MAZE – a data-free model stealing attack using zeroth-order gradient estimation that produces high-accuracy clones. In contrast to prior works, MAZE uses only synthetic data created using a generative model to perform MS.Our evaluation with four image classification models shows that MAZE provides a normalized clone accuracy in the range of 0.90× to 0.99×, and outperforms even the recent attacks that rely on partial data (JBDA, clone accuracy 0.13× to 0.69×) and on surrogate data (KnockoffNets, clone accuracy 0.52× to 0.97×). We also study an extension of MAZE in the partial-data setting, and develop MAZE-PD, which generates synthetic data closer to the target distribution. MAZE-PD further improves the clone accuracy (0.97× to 1.0×) and reduces the query budget required for the attack by 2×-24×.
更多查看译文
关键词
zeroth-order gradient estimation,intellectual property,blackbox access,target dataset,generative model,partial data,surrogate data,MAZE-PD,data-free model stealing attack,high quality machine learning models,clone model training
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要