A General Methodology to Quantify Biases in Natural Language Data
CHI '20: CHI Conference on Human Factors in Computing Systems Honolulu HI USA April, 2020(2020)
摘要
Biases in data, such as gender and racial stereotypes, are propagated through intelligent systems and amplified at end-user applications. Existing studies detect and quantify biases based on pre-defined attributes. However, in real practices, it is difficult to gather a comprehensive list of sensitive concepts for various categories of biases. We propose a general methodology to quantify dataset biases by measuring the difference of its data distribution with a reference dataset using Maximum Mean Discrepancy. For the case of natural language data, we show that lexicon-based features quantify explicit stereotypes, while deep learning-based features further capture implicit stereotypes represented by complex semantics. Our method provides a more flexible way to detect potential biases.
更多查看译文
关键词
Quantify bias, natural language data
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要