Severity Controlled Text-to-Image Generative Model Bias Manipulation
arXiv (2024)
Abstract
Text-to-image (T2I) generative models are gaining wide popularity, especially
in public domains. However, their intrinsic bias and potential malicious
manipulations remain under-explored. Charting the susceptibility of T2I models
to such manipulation, we first expose the new possibility of a dynamic and
computationally efficient exploitation of model bias by targeting the embedded
language models. By leveraging the mathematical foundations of vector algebra, our
technique enables scalable and convenient control over the severity of output
manipulation through model bias. As a by-product, this control also allows a
form of precise prompt engineering to generate images that are generally
implausible with regular text prompts. We also demonstrate a constructive
application of our manipulation for balancing the frequency of generated
classes, as in model debiasing. Our technique does not require training and is
also framed as a backdoor attack with severity control using semantically-null
text triggers in the prompts. With extensive analysis, we present interesting
qualitative and quantitative results to expose potential manipulation
possibilities for T2I models.
Keywords: Text-to-Image Models, Generative Models, Backdoor Attacks, Prompt
Engineering, Bias
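
The abstract attributes severity control to vector algebra on the embedded language model, but it does not spell out the operation. As a minimal sketch, assuming the control amounts to moving a prompt embedding along a bias direction scaled by a severity factor, the idea could look like the following. The function name `shift_embedding`, the `severity` parameter, and the toy vectors are all hypothetical illustrations, not the authors' implementation.

```python
from typing import List


def shift_embedding(
    prompt_emb: List[float],
    source_emb: List[float],
    target_emb: List[float],
    severity: float,
) -> List[float]:
    """Move prompt_emb along the (target - source) direction.

    ASSUMPTION: severity = 0.0 leaves the prompt embedding unchanged and
    larger values push it further toward the target concept. The paper's
    actual formulation may differ.
    """
    return [
        p + severity * (t - s)
        for p, s, t in zip(prompt_emb, source_emb, target_emb)
    ]


# Toy 4-dimensional vectors standing in for text-encoder outputs
# (hypothetical values, chosen only to make the arithmetic easy to follow).
source = [1.0, 0.0, 0.0, 0.0]
target = [0.0, 1.0, 0.0, 0.0]
prompt = [1.0, 0.0, 0.5, 0.5]

for severity in (0.0, 0.5, 1.0):
    print(severity, shift_embedding(prompt, source, target, severity))
```

In this sketch a single scalar sets how far the embedding is displaced, which is one simple way a training-free manipulation could be made adjustable. A real text encoder produces much higher-dimensional, per-token embeddings, so this is only a conceptual illustration.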