Balancing Fairness and Accuracy in Data-Restricted Binary Classification
arXiv (2024)
Abstract
Applications that deal with sensitive information may have restrictions
placed on the data available to a machine learning (ML) classifier. For
example, in some applications, a classifier may not have direct access to
sensitive attributes, affecting its ability to produce accurate and fair
decisions. This paper proposes a framework that models the trade-off between
accuracy and fairness under four practical scenarios that dictate the type of
data available for analysis. Prior works examine this trade-off by analyzing
the outputs of a scoring function that has been trained to implicitly learn the
underlying distribution of the feature vector, class label, and sensitive
attribute of a dataset. In contrast, our framework directly analyzes the
behavior of the optimal Bayesian classifier on this underlying distribution by
constructing a discrete approximation of it from the dataset itself. This approach
enables us to formulate multiple convex optimization problems, which allow us
to answer the question: How is the accuracy of a Bayesian classifier affected
in different data restricting scenarios when constrained to be fair? Analysis
is performed on a set of fairness definitions that include group and individual
fairness. Experiments on three datasets demonstrate the utility of the proposed
framework as a tool for quantifying the trade-offs among different fairness
notions and their distributional dependencies.
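The core idea of estimating the underlying discrete distribution directly from data, then evaluating the Bayes-optimal classifier on it, can be illustrated with a minimal sketch. This is not the paper's actual formulation (which poses convex optimization problems over the classifier under fairness constraints); all function names and the toy dataset below are illustrative assumptions. The sketch builds the empirical joint distribution p(x, y, a) of feature, label, and sensitive attribute by counting, derives the Bayes classifier h(x) = argmax_y p(y | x), and measures both its accuracy and a demographic-parity gap on that distribution:

```python
from collections import Counter

def empirical_joint(samples):
    """Discrete approximation of p(x, y, a) from (x, y, a) tuples."""
    counts = Counter(samples)
    n = sum(counts.values())
    return {xya: c / n for xya, c in counts.items()}

def bayes_classifier(p):
    """Bayes-optimal rule: h(x) = argmax_y p(x, y), marginalizing out a."""
    pxy = Counter()
    for (x, y, a), mass in p.items():
        pxy[(x, y)] += mass
    xs = {x for (x, _) in pxy}
    return {x: max((m, y) for (xx, y), m in pxy.items() if xx == x)[1]
            for x in xs}

def accuracy(p, h):
    """P(h(x) = y) under the empirical distribution."""
    return sum(m for (x, y, a), m in p.items() if h[x] == y)

def demographic_parity_gap(p, h):
    """|P(h(x)=1 | a=0) - P(h(x)=1 | a=1)|, a group-fairness measure."""
    pos, tot = Counter(), Counter()
    for (x, y, a), m in p.items():
        tot[a] += m
        if h[x] == 1:
            pos[a] += m
    return abs(pos[0] / tot[0] - pos[1] / tot[1])

# Toy dataset (hypothetical): x, y, a all binary; x is perfectly
# correlated with the sensitive attribute a.
data = ([(0, 0, 0)] * 4 + [(0, 1, 0)] * 1 +
        [(1, 1, 1)] * 3 + [(1, 0, 1)] * 2)
p = empirical_joint(data)
h = bayes_classifier(p)
print(accuracy(p, h))               # unconstrained Bayes accuracy
print(demographic_parity_gap(p, h))  # fairness violation of that rule
```

On this toy distribution the unconstrained Bayes classifier attains accuracy 0.7 but a demographic-parity gap of 1.0, which is exactly the kind of trade-off the paper's constrained optimization problems quantify: imposing a fairness constraint forces the classifier away from this optimum, and the framework measures how much accuracy must be sacrificed under each data-restriction scenario.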