Predictors from causal features do not generalize better to new domains
CoRR (2024)
Abstract
We study how well machine learning models trained on causal features
generalize across domains. We consider 16 prediction tasks on tabular datasets
covering applications in health, employment, education, social benefits, and
politics. Each dataset comes with multiple domains, allowing us to test how
well a model trained in one domain performs in another. For each prediction
task, we select features that have a causal influence on the target of
prediction. Our goal is to test the hypothesis that models trained on causal
features generalize better across domains. Without exception, we find that
predictors using all available features, regardless of causality, have better
in-domain and out-of-domain accuracy than predictors using causal features.
Moreover, even the absolute drop in accuracy from one domain to the other is no
better for causal predictors than for models that use all features. If the goal
is to generalize to new domains, practitioners might as well train the best
possible model on all available features.
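The evaluation protocol described above can be sketched on synthetic data: train one predictor on all features and one on only the "causal" subset, then compare in-domain and out-of-domain accuracy. This is not the authors' code; the data-generating process, the spurious feature, and the least-squares classifier are illustrative stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_domain(n, shift):
    """Synthetic domain: two causal features, one spurious feature whose
    correlation with the label depends on the domain-specific `shift`."""
    causal = rng.normal(size=(n, 2))
    y = (causal @ np.array([1.5, -1.0]) + rng.normal(0, 0.5, n) > 0).astype(float)
    spurious = y[:, None] * shift + rng.normal(size=(n, 1))
    bias = np.ones((n, 1))
    return np.hstack([causal, spurious, bias]), y

def fit_predict(X_train, y_train, X_test):
    # Linear probability model fit by least squares, thresholded at 0.5.
    w, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)
    return (X_test @ w > 0.5).astype(float)

X_src, y_src = make_domain(4000, shift=2.0)  # training domain
X_tgt, y_tgt = make_domain(4000, shift=0.5)  # shifted target domain

feature_sets = {
    "all features": [0, 1, 2, 3],   # causal + spurious + bias
    "causal only": [0, 1, 3],       # assumed causal subset + bias
}

results = {}
for name, cols in feature_sets.items():
    preds_in = fit_predict(X_src[:, cols], y_src, X_src[:, cols])
    preds_out = fit_predict(X_src[:, cols], y_src, X_tgt[:, cols])
    results[name] = ((preds_in == y_src).mean(), (preds_out == y_tgt).mean())

for name, (in_acc, out_acc) in results.items():
    print(f"{name}: in-domain={in_acc:.3f}, out-of-domain={out_acc:.3f}")
```

In this toy setup the spurious feature boosts in-domain accuracy; how much it hurts out of domain depends on how strongly the correlation shifts, which is exactly the trade-off the paper measures on real tabular benchmarks.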