Image Counterfactual Sensitivity Analysis for Detecting Unintended Bias
arXiv (2019)
Abstract
Facial analysis models are increasingly used in applications that have
serious impacts on people's lives, ranging from authentication to surveillance
tracking. It is therefore critical to develop techniques that can reveal
unintended biases in facial classifiers to help guide the ethical use of facial
analysis technology. This work proposes a framework called image
counterfactual sensitivity analysis, which we explore as a proof-of-concept in
analyzing a smiling attribute classifier trained on faces of celebrities. The
framework utilizes counterfactuals to examine how a classifier's prediction
changes if a face characteristic slightly changes. We leverage recent advances
in generative adversarial networks to build a realistic generative model of
face images that affords controlled manipulation of specific image
characteristics. We then introduce a set of metrics that measure the effect of
manipulating a specific property on the output of the trained classifier.
Empirically, we find several different factors of variation that affect the
predictions of the smiling classifier. This proof-of-concept demonstrates
potential ways generative models can be leveraged for fine-grained analysis of
bias and fairness.
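The core idea described above (shift one characteristic in a generative model's latent space, regenerate the image, and compare the classifier's output before and after) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: `toy_generator`, `toy_classifier`, the attribute `direction` vectors, and the two summary statistics (mean score change and decision flip rate) are all hypothetical stand-ins chosen for this example, and the abstract does not specify the actual metrics.

```python
import numpy as np


def counterfactual_sensitivity(generator, classifier, latents, direction,
                               step=1.0, threshold=0.5):
    """Compare classifier scores on generated images before and after
    shifting the latent codes along one attribute direction.

    All arguments are illustrative assumptions:
      generator  : maps latent codes (n, d) to images
      classifier : maps images to scores in [0, 1], shape (n,)
      direction  : latent-space vector (d,) assumed to control one attribute
    """
    unit = direction / np.linalg.norm(direction)
    base_scores = classifier(generator(latents))
    shifted_scores = classifier(generator(latents + step * unit))

    # Average change in the classifier's score caused by the manipulation.
    mean_change = float(np.mean(shifted_scores - base_scores))
    # Fraction of samples whose thresholded decision changes.
    flips = (base_scores >= threshold) != (shifted_scores >= threshold)
    return {"mean_score_change": mean_change, "flip_rate": float(np.mean(flips))}


# Toy stand-ins (hypothetical): a real analysis would use a trained GAN
# generator and the trained smiling classifier instead.
def toy_generator(z):
    return np.tanh(z)  # "image" with one pixel per latent dimension


def toy_classifier(x):
    return 1.0 / (1.0 + np.exp(-2.0 * x[:, 0]))  # depends only on pixel 0


rng = np.random.default_rng(0)
latents = rng.normal(size=(256, 4))

relevant = counterfactual_sensitivity(
    toy_generator, toy_classifier, latents, np.array([1.0, 0.0, 0.0, 0.0]))
irrelevant = counterfactual_sensitivity(
    toy_generator, toy_classifier, latents, np.array([0.0, 1.0, 0.0, 0.0]))
print(relevant, irrelevant)
```

In this toy setup the classifier reads only the first pixel, so shifting along the first latent axis changes its scores while shifting along the second does not. In a bias analysis, a large change along a direction that should be irrelevant to smiling would be the signal worth investigating.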
Keywords
Generative model, Classifier (UML), Counterfactual thinking, Counterfactual conditional, Machine learning, Generative grammar, Computer science, Artificial intelligence