Actionable Auditing Revisited: Investigating the Impact of Publicly Naming Biased Performance Results of Commercial AI Products.

AIES '19: PROCEEDINGS OF THE 2019 AAAI/ACM CONFERENCE ON AI, ETHICS, AND SOCIETY (2023)

Abstract
Although algorithmic auditing has emerged as a key strategy to expose systematic biases embedded in software platforms, we struggle to understand the real-world impact of these audits and continue to find it difficult to translate such independent assessments into meaningful corporate accountability. To analyze the impact of publicly naming and disclosing performance results of biased AI systems, we investigate the commercial impact of Gender Shades, the first algorithmic audit of gender- and skin-type performance disparities in commercial facial analysis models. This paper (1) outlines the audit design and structured disclosure procedure used in the Gender Shades study, (2) presents new performance metrics from targeted companies such as IBM, Microsoft, and Megvii (Face++) on the Pilot Parliaments Benchmark (PPB) as of August 2018, (3) provides performance results on PPB by non-target companies such as Amazon and Kairos, and (4) explores differences in company responses as shared through corporate communications that contextualize differences in performance on PPB. Within 7 months of the original audit, we find that all three targets released new application program interface (API) versions. All targets reduced accuracy disparities between males and females and darker- and lighter-skinned subgroups, with the most significant update occurring for the darker-skinned female subgroup that underwent a 17.7–30.4% reduction in error between audit periods. Minimizing these disparities led to a 5.72–8.3% reduction in overall error on the Pilot Parliaments Benchmark (PPB) for target corporation APIs. The overall performance of non-targets Amazon and Kairos lags significantly behind that of the targets, with error rates of 8.66% and 6.60% overall, and error rates of 31.37% and 22.50% for the darker female subgroup, respectively.
This is an expanded version of an earlier publication of these results, revised for a more general audience, and updated to include commentary on further developments.
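
The subgroup error rates and accuracy disparities summarized in the abstract can be illustrated with a minimal sketch. It is not the audit's actual code: the function names, the record layout, and the toy records are all hypothetical and carry no real audit results. It assumes each benchmark image is reduced to a (subgroup, true label, predicted label) triple, and that a disparity is measured as the gap in percentage points between the worst and best subgroup.

```python
from collections import defaultdict


def subgroup_error_rates(records):
    """Return the error rate (in percent) for each subgroup.

    `records` is an iterable of (subgroup, true_label, predicted_label)
    triples. This layout is an assumption made for illustration.
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for subgroup, truth, predicted in records:
        totals[subgroup] += 1
        if truth != predicted:
            errors[subgroup] += 1
    return {group: 100.0 * errors[group] / totals[group] for group in totals}


def max_disparity(rates):
    """Gap in percentage points between the worst and best subgroup."""
    return max(rates.values()) - min(rates.values())


# Invented toy records, NOT data from the audit.
toy_records = [
    ("darker_female", "F", "M"),   # one misclassification
    ("darker_female", "F", "F"),
    ("darker_female", "F", "F"),
    ("darker_female", "F", "F"),
    ("lighter_male", "M", "M"),
    ("lighter_male", "M", "M"),
    ("lighter_male", "M", "M"),
    ("lighter_male", "M", "M"),
]

rates = subgroup_error_rates(toy_records)
gap = max_disparity(rates)
print(rates, gap)
```

Running the same computation on an API's predictions from two audit periods and comparing the per-subgroup rates would give one way to express a reduction in error over time, under the assumptions stated above.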
Keywords
Ethics, Machine Learning, Artificial Intelligence, Facial Recognition, Commercial Applications, Fairness, Computer Vision