Prioritizing Causation in Decision Trees: A Framework for Interpretable Modeling

Songming Zhang, Xiaofeng Chen, Xuming Ran, Zhongshan Li, Wenming Cao

Engineering Applications of Artificial Intelligence (2024)

Abstract
Decision trees are a popular machine learning model that classifies and generalizes well, but they face two challenges in engineering applications: (1) they are sensitive to perturbations and lack interpretability because they rely on correlations; (2) the stopping criterion is set manually, is unrelated to correlation strength, and easily leads to over-partitioning. To address these two challenges, we first theoretically analyze what leads to sub-optimal decision trees. By incorporating causal discovery, this limitation can be attributed to the fact that trees grown on spurious correlations often fall into sub-optimal solutions that lead to overfitting and unfair behavior. This neglect of causality motivates us to develop a 'better' tree with low Kolmogorov complexity and high generalization capability. We then propose a causality decision tree framework, CausalDT, based on our theoretical expectation, in which the Hilbert-Schmidt independence criterion (HSIC) serves as a baseline. Unlike previous approaches that prioritize relevance, our framework determines branch nodes based on causation between features, and the significance level determines whether the tree should be expanded further. Experimental results demonstrate that our model maintains performance while reducing average tree depth by 35% on various datasets. Furthermore, our model enhances decision fairness and interpretability.
Keywords
Decision tree, Causal discovery, Spurious correlation, Interpretability
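
The abstract names the Hilbert-Schmidt independence criterion as a baseline and says a significance level decides whether the tree is expanded further. The following is a minimal sketch of that general idea only, not the authors' CausalDT implementation: it computes the standard biased HSIC estimate between a feature and the target and uses a permutation test to decide whether a split is statistically warranted. The function names (gaussian_gram, hsic_pvalue, should_split), the Gaussian kernel with a median-heuristic bandwidth, the permutation test, and the default alpha of 0.05 are all assumptions made for illustration. HSIC measures statistical dependence; the causal criterion the paper builds on top of it is not described in the abstract and is not reproduced here.

```python
import numpy as np


def gaussian_gram(x, bandwidth=None):
    """Gaussian kernel Gram matrix; bandwidth defaults to a median heuristic (assumption)."""
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    sq = np.sum(x ** 2, axis=1)
    # Pairwise squared Euclidean distances, clipped to avoid tiny negatives.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (x @ x.T), 0.0)
    if bandwidth is None:
        positive = d2[d2 > 0]
        med = np.median(positive) if positive.size else 1.0
        bandwidth = np.sqrt(0.5 * med)
    return np.exp(-d2 / (2.0 * bandwidth ** 2))


def hsic_pvalue(x, y, n_perm=200, seed=0):
    """Biased HSIC estimate trace(K H L H) / n^2 and a permutation p-value."""
    rng = np.random.default_rng(seed)
    n = len(x)
    K, L = gaussian_gram(x), gaussian_gram(y)
    H = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    Kc = H @ K @ H                           # centered Gram matrix of x
    observed = np.sum(Kc * L) / n ** 2       # equals trace(K H L H) / n^2
    exceed = 0
    for _ in range(n_perm):
        idx = rng.permutation(n)
        # Permuting y breaks any dependence while keeping both marginals fixed.
        permuted = np.sum(Kc * L[np.ix_(idx, idx)]) / n ** 2
        if permuted >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)


def should_split(x, y, alpha=0.05):
    """Hypothetical stopping rule: expand the node only if dependence is significant."""
    _, p_value = hsic_pvalue(x, y)
    return p_value < alpha
```

In a tree learner, a rule like should_split(feature_column, target) could be evaluated on the samples reaching a node, so that growth stops once no candidate feature shows significant dependence on the target, rather than at a manually chosen depth. How CausalDT actually selects and orders branch nodes is not specified in the abstract.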