Comparing the performance of eight imputation methods for propensity score matching in missing data problem

Journal of Statistics and Management Systems (2023)

Abstract
Propensity score (PS) analysis is a popular way to control for covariates in observational studies. A challenge in PS analyses is missing values in the covariates. This study investigates how different methods of imputing missing covariate values in a PS analysis affect estimates of the average treatment effect on the treated (ATT). The imputation methods were evaluated on simulated data sets whose covariates had low, medium, or high correlation with each other (r = 0.10, 0.50, 0.85), with n = 200 units and 1000 simulation replications. Missing data were generated under the missing at random (MAR) mechanism at different missing rates. The missing values were then imputed separately by eight methods: mean, median, mode, hot deck, last observation carried forward (LOCF), next observation carried backward (NOCB), regression, and predictive mean matching (PMM). PS nearest neighbor matching was applied to each imputed data set to obtain ATT estimates. The predictive performance of the imputation methods was compared on these ATT estimates by hierarchical cluster analysis with Euclidean distance and complete linkage. The ATT estimates from regression and PMM were close to each other, and these two methods showed the best predictive performance. With larger amounts of missing data, PMM was the best choice. Ignoring missing covariate values in PS analyses causes substantial information loss, which grows as the missing rate increases, and the resulting analyses may also be biased. To prevent this information loss and bias, PS analyses should be performed after covariate values missing under the MAR mechanism have been imputed by regression or PMM, which were statistically superior to the other methods in this study.
Keywords
Missing data, Imputation, Simulation, Propensity score, Hierarchical clustering
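
To make the pipeline described in the abstract concrete, here is a minimal sketch of four of the simpler imputation methods it names (mean, median, mode via a shared helper, plus LOCF and NOCB), followed by a one-to-one nearest neighbor match on the propensity score to compute an ATT. It is an illustration only, not the authors' implementation. The function names are hypothetical, missing values are represented as `None`, the propensity scores are assumed to be already estimated, and matching with replacement and without a caliper is an assumption, since the abstract does not state these details. Hot deck, regression, and PMM imputation are not shown.

```python
from statistics import mean, median, mode


def impute_constant(values, stat):
    """Replace each None with one summary statistic of the observed values.

    Pass statistics.mean, statistics.median, or statistics.mode as `stat`.
    """
    observed = [v for v in values if v is not None]
    fill = stat(observed)
    return [fill if v is None else v for v in values]


def impute_locf(values):
    """Last observation carried forward; leading gaps remain None."""
    out, last = [], None
    for v in values:
        if v is not None:
            last = v
        out.append(last)
    return out


def impute_nocb(values):
    """Next observation carried backward; trailing gaps remain None."""
    return impute_locf(values[::-1])[::-1]


def att_nearest_neighbor(ps, treated, outcome):
    """ATT from 1:1 nearest neighbor matching on the propensity score.

    Assumption: controls are matched with replacement and no caliper.
    """
    controls = [i for i, t in enumerate(treated) if not t]
    diffs = []
    for i, t in enumerate(treated):
        if t:
            # Closest control unit by absolute propensity score distance.
            j = min(controls, key=lambda c: abs(ps[c] - ps[i]))
            diffs.append(outcome[i] - outcome[j])
    return mean(diffs)


# Toy covariate with two missing entries.
x = [2.0, None, 4.0, 4.0, None, 9.0]
print(impute_constant(x, mean))    # [2.0, 4.75, 4.0, 4.0, 4.75, 9.0]
print(impute_constant(x, median))  # [2.0, 4.0, 4.0, 4.0, 4.0, 9.0]
print(impute_locf(x))              # [2.0, 2.0, 4.0, 4.0, 4.0, 9.0]
print(impute_nocb(x))              # [2.0, 4.0, 4.0, 4.0, 9.0, 9.0]

# Toy matching example: two treated units, three controls.
ps = [0.8, 0.6, 0.75, 0.55, 0.3]
treated = [1, 1, 0, 0, 0]
outcome = [10.0, 8.0, 7.0, 6.0, 5.0]
print(att_nearest_neighbor(ps, treated, outcome))  # 2.5
```

In a study like the one summarized here, each imputed data set would feed a propensity score model before matching, and the resulting ATT values per method could then be compared, for example with complete-linkage hierarchical clustering.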