CulturalTeaming: AI-Assisted Interactive Red-Teaming for Challenging LLMs' (Lack of) Multicultural Knowledge
arXiv (2024)
Abstract
Frontier large language models (LLMs) are developed by researchers and
practitioners with skewed cultural backgrounds and on datasets with skewed
sources. However, LLMs' (lack of) multicultural knowledge cannot be effectively
assessed with current methods for developing benchmarks. Existing multicultural
evaluations primarily rely on expensive and restricted human annotations or
potentially outdated internet resources. Thus, they struggle to capture the
intricacy, dynamics, and diversity of cultural norms. LLM-generated benchmarks
are promising, yet risk propagating the same biases they are meant to measure.
To synergize the creativity and expert cultural knowledge of human annotators
and the scalability and standardizability of LLM-based automation, we introduce
CulturalTeaming, an interactive red-teaming system that leverages human-AI
collaboration to build a truly challenging evaluation dataset for assessing the
multicultural knowledge of LLMs, while improving annotators' capabilities and
experiences. Our study reveals that CulturalTeaming's various modes of AI
assistance support annotators in creating cultural questions that modern LLMs
fail at, in a gamified manner. Importantly, an increased level of AI
assistance (e.g., LLM-generated revision hints) empowers users to create more
difficult questions and enhances their perceived creativity, shedding
light on the promise of involving heavier AI assistance in modern evaluation
dataset creation procedures. Through a series of 1-hour workshop sessions, we
gather CULTURALBENCH-V0.1, a compact yet high-quality evaluation dataset built from
users' red-teaming attempts, on which different families of modern LLMs perform
with accuracy ranging from 37.7%, revealing gaps in their
multicultural proficiency.