Counterfactual Fairness in Text Classification through Robustness
National Conference on Artificial Intelligence, 2019.
In this paper, we study counterfactual fairness in text classification, which asks the question: How would the prediction change if the sensitive attribute discussed in the example were something else? We offer a heuristic for measuring this particular form of fairness in text classifiers by substituting individual tokens pertaining to at...
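The token-substitution idea can be illustrated with a minimal sketch. It is not the paper's implementation: the function names (`counterfactuals`, `max_prediction_gap`), the placeholder terms `groupA` and `groupB`, and the toy scoring function are all assumptions made for illustration. The sketch swaps each listed term for every other listed term and reports the largest absolute change in the classifier's score.

```python
from typing import Callable, List, Sequence


def counterfactuals(tokens: List[str], terms: Sequence[str]) -> List[List[str]]:
    """Return copies of `tokens` in which one occurrence of a listed term
    is replaced by a different listed term (one substitution per copy)."""
    variants = []
    for i, tok in enumerate(tokens):
        if tok in terms:
            for alt in terms:
                if alt != tok:
                    variants.append(tokens[:i] + [alt] + tokens[i + 1:])
    return variants


def max_prediction_gap(
    score: Callable[[List[str]], float],
    tokens: List[str],
    terms: Sequence[str],
) -> float:
    """Largest absolute score change over all single-token substitutions.
    Returns 0.0 when the example contains none of the listed terms."""
    base = score(tokens)
    gaps = (abs(score(v) - base) for v in counterfactuals(tokens, terms))
    return max(gaps, default=0.0)


def toy_score(tokens: List[str]) -> float:
    """Hypothetical stand-in for a trained classifier: a capped sum of
    hand-picked token weights. A real model would go here instead."""
    weights = {"awful": 0.5, "groupA": 0.25}
    return min(1.0, sum(weights.get(t, 0.0) for t in tokens))


# Placeholder terms; a real study would use a curated term list.
terms = ["groupA", "groupB"]
example = ["some", "people", "are", "groupA"]
gap = max_prediction_gap(toy_score, example, terms)
```

A gap of zero means the score did not change under any of the substitutions tried; a larger gap means the toy scorer treats the swapped terms differently. Whether the paper aggregates gaps by maximum, mean, or another statistic is not stated in the text above, so the maximum used here is an assumption.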