Understanding and Mitigating Spurious Correlations in Text Classification with Neighborhood Analysis
arXiv (Cornell University) (2023)
Abstract
Recent research has revealed that machine learning models have a tendency to
leverage spurious correlations that exist in the training set but may not hold
true in general circumstances. For instance, a sentiment classifier may
erroneously learn that the token "performances" is commonly associated with
positive movie reviews. Relying on these spurious correlations degrades the
classifier's performance when it is deployed on out-of-distribution data. In this
paper, we examine the implications of spurious correlations through a novel
perspective called neighborhood analysis. The analysis uncovers how spurious
correlations lead unrelated words to erroneously cluster together in the
embedding space. Driven by the analysis, we design a metric to detect spurious
tokens and propose a family of regularization methods, NFL (doN't Forget
your Language), to mitigate spurious correlations in text classification.
Experiments show that NFL can effectively prevent erroneous clusters and
significantly improve the robustness of classifiers without auxiliary data. The
code is publicly available at
https://github.com/oscarchew/doNt-Forget-your-Language.
Keywords
neighborhood analysis, spurious correlations, classification
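
The abstract describes neighborhood analysis only at a high level: a spurious token's nearest neighbors in the embedding space change after fine-tuning, so that unrelated words cluster together. The snippet below is a minimal sketch of that idea, not the paper's metric or the NFL implementation (see the linked repository for those). It assumes cosine similarity, a Jaccard overlap score, and a hand-made two-dimensional toy vocabulary; the function names `nearest_neighbors` and `neighborhood_overlap` are hypothetical.

```python
import numpy as np


def nearest_neighbors(emb, idx, k):
    """Indices of the k tokens most cosine-similar to token idx (itself excluded)."""
    unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sims = unit @ unit[idx]
    sims[idx] = -np.inf  # never count the token as its own neighbor
    return set(np.argsort(-sims)[:k].tolist())


def neighborhood_overlap(emb_before, emb_after, idx, k=2):
    """Jaccard overlap of a token's k-nearest-neighbor sets before and after fine-tuning.

    A low value means the token's neighborhood changed a lot. Treating that as a
    sign of a spurious token is an assumption of this sketch.
    """
    before = nearest_neighbors(emb_before, idx, k)
    after = nearest_neighbors(emb_after, idx, k)
    return len(before & after) / len(before | after)


# Toy data (made up for illustration): axis 0 ~ "stage/film" topic, axis 1 ~ "positive".
vocab = ["performances", "acting", "shows", "great", "wonderful"]
emb_before = np.array([[1.0, 0.1], [1.0, 0.0], [0.9, 0.2], [0.0, 1.0], [0.1, 1.0]])
# After fine-tuning, "performances" has drifted toward the positive-sentiment words.
emb_after = np.array([[0.05, 1.0], [1.0, 0.0], [0.9, 0.2], [0.0, 1.0], [0.1, 1.0]])

for i, token in enumerate(vocab):
    print(token, neighborhood_overlap(emb_before, emb_after, i))
```

In this toy setup, "performances" starts next to "acting" and "shows" and ends next to "great" and "wonderful", so its overlap score is zero. With real models, the two matrices would be the token embeddings of the pretrained and the fine-tuned classifier.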