COFFEE: Counterfactual Fairness for Personalized Text Generation in Explainable Recommendation.


Personalized text generation has broad industrial applications, such as explanation generation for recommendations and conversational systems. Personalized text generators are usually trained on user-written text, e.g., reviews collected on e-commerce platforms. However, for historical, social, or behavioral reasons, this text may carry bias that associates its linguistic quality with users' protected attributes, such as gender or race. Generators can identify and inherit these correlations and produce text that discriminates with respect to users' protected attributes. Without proper intervention, such bias can adversely affect users' trust in and reliance on the system. From a broader perspective, bias in auto-generated content can, through interactions with users, reinforce social stereotypes about how online users write. In this work, we investigate the fairness of personalized text generation in the setting of explainable recommendation. We develop a general framework for achieving measure-specific counterfactual fairness in the linguistic quality of personalized explanations. We propose learning disentangled representations for counterfactual inference and develop a novel policy learning algorithm with carefully designed rewards for fairness optimization. The framework can be applied to achieve fairness under any given specification of linguistic quality measures and can be adapted to most existing models and real-world settings. Extensive experiments demonstrate the superior ability of our method to achieve fairness while maintaining high generation performance.