The Art of Deception: Black-box Attack Against Text-to-Image Diffusion Model

Yuetong Lu, Jingyao Xu, Yandong Li, Siyang Lu, Wei Xiang, Wei Lu

International Conference on Parallel and Distributed Systems (2023)

Abstract
With the rise of foundation models, Text-to-Image (T2I) models, as one of their important branches, have seen increasingly wide application. While the impressive generation capabilities of these models attract attention, their robustness against attacks is equally important. In this paper, we study the vulnerability of T2I models. To this end, we propose a black-box attack method and demonstrate that T2I models are susceptible to adversarial text attacks. Specifically, the method disrupts a T2I model by making subtle modifications to its input (i.e., the prompt) without accessing the model parameters, causing it to generate incorrect images. Notably, the framework allows different types of tokenizers to be swapped in for text processing, and it can be applied to different versions of T2I models. Experiments show that images generated from adversarial text exhibit noticeable errors. We also use the CLIP score, a metric for evaluating the similarity between an image and its textual description, to assess the results; the findings show a significant decrease in visual-textual similarity after the model is attacked. Additionally, we identify a characteristic error that T2I models tend to make under attack: when confronted with unrecognizable text, the model often interprets it as human-related content. This paper not only highlights the vulnerability of T2I models to adversarial text attacks but also discusses potential methods for making these attack techniques more robust, providing a valuable reference for future research in this field.
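As a rough illustration of the black-box setting the abstract describes (not the authors' algorithm, which the abstract does not specify), a minimal query-only attack sketch in Python might look like the following. The character-level perturbation, the query budget, and the use of CLIP score as the search objective are illustrative assumptions; only the `generate` callable touches the T2I model, and only through its input and output.

```python
# Illustrative sketch only: NOT the paper's exact method. The random
# character-level perturbation and CLIP-score search objective are assumptions.
import random

import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

clip = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
proc = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def clip_score(image: Image.Image, text: str) -> float:
    """Cosine similarity between CLIP image and text embeddings."""
    inputs = proc(text=[text], images=image, return_tensors="pt", padding=True)
    with torch.no_grad():
        out = clip(**inputs)
    img = out.image_embeds / out.image_embeds.norm(dim=-1, keepdim=True)
    txt = out.text_embeds / out.text_embeds.norm(dim=-1, keepdim=True)
    return float(img @ txt.T)

def perturb(prompt: str) -> str:
    """Toy perturbation: replace one random character of the prompt."""
    i = random.randrange(len(prompt))
    return prompt[:i] + random.choice("abcdefghijklmnopqrstuvwxyz") + prompt[i + 1:]

def black_box_attack(prompt: str, generate, budget: int = 50):
    """Query-only search. `generate(text) -> PIL.Image` is the T2I model,
    treated as a black box (no access to parameters or gradients). Keep the
    perturbed prompt whose image is LEAST similar, by CLIP score, to the
    original clean prompt."""
    best_prompt = prompt
    best_score = clip_score(generate(prompt), prompt)
    for _ in range(budget):
        candidate = perturb(prompt)
        s = clip_score(generate(candidate), prompt)
        if s < best_score:  # lower similarity = more disrupted generation
            best_prompt, best_score = candidate, s
    return best_prompt, best_score
```

With the diffusers library, `generate` could be, for instance, `lambda p: pipe(p).images[0]` for a `StableDiffusionPipeline` instance `pipe`; since the attack only needs prompt-in, image-out access, swapping in a different tokenizer or model version changes nothing in the loop itself.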
Keywords
Foundation Model, Text-to-Image, Diffusion Model, Adversarial Attack