Leveraging machine learning methods to predict COVID-19 vulnerability in U.S. counties based on socioeconomic factors

Katharine Emily Lee, Cynthia Denise Lo, William Ren Xu, Robert Ye, Christy Tomkins-Lane

STEM Fellowship Journal (2022)

Abstract
As COVID-19 gained pandemic status, the number of confirmed cases in the US surpassed that of all other countries. Although the virus spread throughout the US, not all areas were affected equally. This retrospective study explores these inequalities through pre-pandemic socioeconomic characteristics by attempting to build a predictive model of COVID-19 vulnerability at the county level. A total of 103 socioeconomic features for 2610 US counties (out of a total of 3007) were sourced from online databases including the US Census Bureau, the US Department of Agriculture, and the Association of American Medical Colleges. To quantify each county’s COVID-19 vulnerability, we defined three custom measures: incidence, mortality, and case fatality. These measures were calculated from case and death data taken 29 days after each county’s first case. Machine learning classification algorithms, including random forest, a multi-layer perceptron neural network, and XGBoost, were then used to predict the incidence, mortality, and case fatality of US counties. Using only pre-pandemic socioeconomic factors, the models predicted a county’s COVID-19 incidence with ~47% accuracy, mortality with ~59% accuracy, and case fatality with ~61% accuracy. For each vulnerability measure, a list of important features was extracted using a built-in XGBoost function. Many of these features are typically associated with pandemic spread (e.g., population density and medical infrastructure), while others were unexpected (e.g., education) and warrant further study to identify their role in disease propagation. Furthermore, the difficulties our model experienced support the notion that region-specific policies play an important role in successfully mitigating this crisis. The moderate success achieved in this study demonstrates the feasibility of using classifiers as a pandemic preparedness evaluation tool.
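
The abstract names three vulnerability measures but does not give their formulas. The sketch below illustrates one plausible reading, assuming the conventional epidemiological definitions: incidence as cases per 100,000 residents, mortality as deaths per 100,000 residents, and case fatality as deaths divided by cases. The `CountySnapshot` type, the `vulnerability_measures` function, the per-100,000 scaling, and the example numbers are all hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class CountySnapshot:
    """Cumulative counts for one county (hypothetical structure).

    In the study's setup the counts would be taken 29 days after
    the county's first confirmed case.
    """
    population: int
    cases: int
    deaths: int


def vulnerability_measures(snap: CountySnapshot, per: int = 100_000) -> dict:
    """Return the three measures under the ASSUMED standard definitions."""
    # Incidence: cumulative cases scaled to a common population base.
    incidence = snap.cases * per / snap.population
    # Mortality: cumulative deaths scaled to the same base.
    mortality = snap.deaths * per / snap.population
    # Case fatality: share of confirmed cases that ended in death.
    # Guard against division by zero for counties with no cases.
    case_fatality = snap.deaths / snap.cases if snap.cases else 0.0
    return {
        "incidence": incidence,
        "mortality": mortality,
        "case_fatality": case_fatality,
    }


# Made-up county used only to exercise the function.
example = CountySnapshot(population=50_000, cases=100, deaths=5)
measures = vulnerability_measures(example)
```

Because the study treats prediction as a classification problem, continuous values like these would still need to be grouped into classes before training; the abstract does not say how that grouping was done, so it is left out here.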