The Impact of Differential Feature Under-reporting on Algorithmic Fairness
CoRR (2024)
Abstract
Predictive risk models in the public sector are commonly developed using
administrative data that is more complete for subpopulations that rely more
heavily on public services. In the United States, for instance, information on
health care utilization is routinely available to government agencies for
individuals supported by Medicaid and Medicare, but not for the privately
insured. Critiques of public sector algorithms have identified such
differential feature under-reporting as a driver of disparities in algorithmic
decision-making. Yet this form of data bias remains understudied from a
technical viewpoint. While prior work has examined the fairness impacts of
additive feature noise and features that are clearly marked as missing, the
setting of data missingness without explicit missingness indicators (i.e.,
differential feature under-reporting) has received little research attention. In this work, we
present an analytically tractable model of differential feature under-reporting
which we then use to characterize the impact of this kind of data bias on
algorithmic fairness. We demonstrate how standard missing data methods
typically fail to mitigate bias in this setting, and propose a new set of
methods specifically tailored to differential feature under-reporting. Our
results show that, in real-world data settings, under-reporting typically leads
to increased disparities. The proposed solution methods succeed in mitigating
these increases in unfairness.
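
To make the setting concrete, the sketch below simulates differential feature under-reporting under a hypothetical data-generating process (the group split, coefficients, and the 0.6 under-reporting rate are illustrative assumptions, not values from the paper): one feature is silently recorded as zero for part of one group, with no missingness indicator, and a downstream risk model is trained on the contaminated data.

```python
# Minimal simulation sketch of differential feature under-reporting
# (hypothetical data-generating process; not the paper's exact model).
# One feature is recorded as zero, with no missingness indicator, for
# part of the disadvantaged group.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 20_000

# Group membership: group 1 is the subpopulation with complete records.
g = rng.binomial(1, 0.5, size=n)

# Two true risk factors; x2 plays the role of "health care utilization".
x1 = rng.normal(size=n)
x2 = rng.poisson(lam=2.0, size=n).astype(float)

# Outcome depends on the features, not on group membership directly.
logits = -1.0 + 0.8 * x1 + 0.5 * x2
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logits)))

# Differential under-reporting: for group 0, x2 is silently replaced by 0
# with probability 0.6; no flag marks these entries as missing.
underreported = (g == 0) & (rng.random(n) < 0.6)
x2_obs = np.where(underreported, 0.0, x2)

X_true = np.column_stack([x1, x2])
X_obs = np.column_stack([x1, x2_obs])

for name, X in [("fully observed", X_true), ("under-reported", X_obs)]:
    model = LogisticRegression().fit(X, y)
    scores = model.predict_proba(X)[:, 1]
    # Compare mean predicted risk by group as a simple disparity measure.
    gap = scores[g == 1].mean() - scores[g == 0].mean()
    print(f"{name:>15}: mean-score gap (group 1 - group 0) = {gap:.3f}")
```

Because the under-reported entries look like legitimate zeros, imputation methods that rely on a missingness indicator have nothing to key on, which is the core difficulty the abstract highlights.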