The Stronger the Diffusion Model, the Easier the Backdoor: Data Poisoning to Induce Copyright Breaches Without Adjusting Finetuning Pipeline
CoRR (2024)
Abstract
The commercialization of diffusion models (DMs), renowned for their ability to
generate high-quality images that are often indistinguishable from real ones,
brings forth potential copyright concerns. Although attempts have been made to
impede unauthorized access to copyrighted material during training and to
subsequently prevent DMs from generating copyrighted images, the effectiveness
of these solutions remains unverified. This study explores the vulnerabilities
associated with copyright protection in DMs by introducing a backdoor data
poisoning attack (SilentBadDiffusion) against text-to-image diffusion models.
Our attack method operates without requiring access to or control over the
diffusion model's training or fine-tuning processes; it merely involves the
insertion of poisoning data into the clean training dataset. This data,
comprising poisoning images equipped with prompts, is generated by leveraging
the powerful capabilities of multimodal large language models and text-guided
image inpainting techniques. Our experimental results and analysis confirm the
method's effectiveness. By integrating a small portion of
non-copyright-infringing, stealthy poisoning data into the clean
dataset, thereby keeping it free from suspicion, we can prompt the
fine-tuned diffusion models to produce copyrighted content when activated by specific trigger
prompts. These findings underline potential pitfalls in the prevailing
copyright protection strategies and underscore the necessity for increased
scrutiny and preventative measures against the misuse of DMs.
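The abstract describes the poisoning pipeline only at a high level: a copyrighted image is decomposed into salient visual elements, and text-guided inpainting synthesizes individually innocuous training samples around each element. Below is a minimal sketch of how one such poisoning sample might be constructed, assuming the element mask and descriptive phrase have already been extracted (the abstract attributes that step to a multimodal large language model). The model id, function name, and prompts are illustrative choices, not the authors' released code.

```python
# A minimal sketch of poisoning-sample construction, assuming the element
# mask and descriptive phrase are given. Names and prompts are hypothetical.
import torch
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting",  # assumed inpainting backbone
    torch_dtype=torch.float16,
).to("cuda")

def make_poison_sample(copyrighted_img: Image.Image,
                       element_mask: Image.Image,
                       element_phrase: str,
                       context_prompt: str):
    """Keep one visual element of the copyrighted image, inpaint new
    surroundings everywhere else, and pair the result with a caption
    that still contains the element's descriptive phrase."""
    # diffusers convention: white mask pixels are repainted, so invert the
    # element mask to preserve the element and replace its surroundings.
    repaint_mask = element_mask.convert("L").point(lambda p: 255 - p)
    poison_img = pipe(
        prompt=context_prompt,
        image=copyrighted_img.convert("RGB").resize((512, 512)),
        mask_image=repaint_mask.resize((512, 512)),
    ).images[0]
    caption = f"{context_prompt}, featuring {element_phrase}"
    return poison_img, caption
```

Each resulting sample depicts only one element in an unrelated scene, so no single image infringes; per the abstract, it is the trigger prompt combining the elements' phrases that later induces the fine-tuned model to reassemble the copyrighted content.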