Multilingual Text-to-Image Generation Magnifies Gender Stereotypes and Prompt Engineering May Not Help You
CoRR (2024)
Abstract
Text-to-image generation models have recently achieved astonishing results in
image quality, flexibility, and text alignment and are consequently employed in
a fast-growing number of applications. Through improvements in multilingual
abilities, a larger community now has access to this kind of technology. Yet,
as we will show, multilingual models suffer from (gender) biases just as
monolingual models do. Furthermore, one would naturally expect these models to
provide similar results across languages, but this is not the case: there are
important differences between languages. We therefore propose MAGBIG, a novel
benchmark intended to foster research on multilingual models without gender
bias, and use it to investigate whether multilingual T2I models magnify gender
bias. To this end, we use multilingual prompts requesting portrait images of
persons with a certain occupation or trait (expressed through adjectives). Our
results show not only that models deviate from the normative assumption that
each gender should be equally likely to be generated, but also that there are
substantial differences across languages. Furthermore, we investigate prompt
engineering strategies, i.e., the use of indirect, gender-neutral formulations,
as a possible remedy for these biases. Unfortunately, they help only to a
limited extent and lead to worse text-to-image alignment. Consequently, this
work calls for more research into diverse representations across languages in
image generators.
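
The probing setup described in the abstract can be illustrated with a minimal
sketch using Hugging Face diffusers. Everything concrete here is an assumption
for illustration: the model ID is a placeholder (a multilingual checkpoint such
as AltDiffusion would be substituted in practice), and the prompts and seed
count are invented examples, not the actual MAGBIG prompt lists.

```python
# Minimal sketch: generating portrait images for one occupation across
# several languages, so the resulting gender distribution can be compared.
# MODEL_ID is a placeholder -- substitute a diffusers-compatible multilingual
# checkpoint; the prompts below are illustrative, not taken from MAGBIG.
import torch
from diffusers import StableDiffusionPipeline

MODEL_ID = "stabilityai/stable-diffusion-2-1"  # placeholder checkpoint

# Direct prompts for the same occupation in multiple languages (illustrative).
PROMPTS = {
    "en": "a portrait photo of a doctor",
    "de": "ein Porträtfoto eines Arztes",
    "es": "una foto de retrato de un médico",
}

pipe = StableDiffusionPipeline.from_pretrained(
    MODEL_ID, torch_dtype=torch.float16
).to("cuda")

# Fixed seeds per language keep the sampling comparable across languages;
# the generated images would then be classified by perceived gender.
for lang, prompt in PROMPTS.items():
    for seed in range(4):
        generator = torch.Generator("cuda").manual_seed(seed)
        image = pipe(prompt, generator=generator).images[0]
        image.save(f"{lang}_doctor_{seed}.png")
```

In such a setup, deviations from an even gender split per language, and
disagreements between languages for the same occupation, are the two effects
the benchmark is designed to surface.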