Causal K-Means Clustering
arXiv (2024)
Abstract
Causal effects are often characterized with population summaries. These might
provide an incomplete picture when there are heterogeneous treatment effects
across subgroups. Since the subgroup structure is typically unknown, it is more
challenging to identify and evaluate subgroup effects than population effects.
We propose a new solution to this problem: Causal k-Means Clustering, which
harnesses the widely used k-means clustering algorithm to uncover the unknown
subgroup structure. Our problem differs significantly from the conventional
clustering setup since the variables to be clustered are unknown counterfactual
functions. We present a plug-in estimator which is simple and readily
implementable using off-the-shelf algorithms, and study its rate of
convergence. We also develop a new bias-corrected estimator based on
nonparametric efficiency theory and double machine learning, and show that this
estimator achieves fast root-n rates and asymptotic normality in large
nonparametric models. Our proposed methods are especially useful for modern
outcome-wide studies with multiple treatment levels. Further, our framework is
extensible to clustering with generic pseudo-outcomes, such as partially
observed outcomes or otherwise unknown functions. Finally, we explore finite
sample properties via simulation, and illustrate the proposed methods in a
study of treatment programs for adolescent substance abuse.
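
To make the plug-in idea concrete, the following is a minimal sketch of one reading of it, not the authors' implementation. It assumes the counterfactual mean outcome at each treatment level can be estimated by regressing the outcome on covariates within that treatment arm (which requires no unmeasured confounding), and that units are then clustered on the vector of estimated means. The per-arm linear regression, the hand-written Lloyd iteration, the synthetic data, and every name here (fit_line, plug_in_counterfactual_means, k_means) are hypothetical stand-ins; in practice any off-the-shelf regressor and k-means routine could take their place.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = b0 + b1 * x with a single covariate."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return my - b1 * mx, b1

def plug_in_counterfactual_means(x, a, y, levels):
    """Estimate mu_a(x_i) = E[Y | X = x_i, A = a] for every unit and every level.

    Assumption: one regression is fit per treatment arm and then evaluated
    at the covariates of all units, treated or not.
    """
    fits = {}
    for lvl in levels:
        xs = [xi for xi, ai in zip(x, a) if ai == lvl]
        ys = [yi for yi, ai in zip(y, a) if ai == lvl]
        fits[lvl] = fit_line(xs, ys)
    return [[fits[l][0] + fits[l][1] * xi for l in levels] for xi in x]

def sq_dist(u, v):
    return sum((ui - vi) ** 2 for ui, vi in zip(u, v))

def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in points]

def k_means(points, k, n_iter=50, seed=0):
    """Plain Lloyd iteration; initial centers are k randomly sampled points."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(n_iter):
        labels = assign(points, centers)
        new_centers = []
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                new_centers.append([sum(c) / len(members) for c in zip(*members)])
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assign(points, centers)

# Synthetic data (illustrative only): two latent subgroups with separated
# covariates, a binary treatment, and an effect that grows with the covariate.
rng = random.Random(1)
n = 200
group = [i % 2 for i in range(n)]
x = [rng.random() + 4 * g for g in group]
a = [rng.randrange(2) for _ in range(n)]
y = [2 * xi * ai + rng.gauss(0, 0.1) for xi, ai in zip(x, a)]

# Step 1: plug-in estimates of the counterfactual means under A = 0 and A = 1.
mu_hat = plug_in_counterfactual_means(x, a, y, levels=[0, 1])
# Step 2: k-means on the estimated counterfactual mean vectors.
centers, labels = k_means(mu_hat, k=2)
```

Each row of mu_hat holds one unit's estimated mean outcome under each treatment level, so the clusters group units by their estimated response profile rather than by raw covariates. This sketch covers only the plug-in step; it does not attempt the bias-corrected estimator described in the abstract.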