Statistical Analytics and Regional Representation Learning for COVID-19 Pandemic Understanding (Preprint)

JMIR Preprints (2020)

Abstract
<title>BACKGROUND</title> <p>The rapid spread of the novel coronavirus disease (COVID-19) has severely affected almost every country in the world. It has not only placed a tremendous burden on health care providers but has also had severe effects on the economy and social life. Reliable data and in-depth statistical analyses give researchers and policymakers invaluable information for understanding this pandemic and its growth pattern more clearly.</p> <title>OBJECTIVE</title> <p>This study aims to gather an extensive collection of regional features along with COVID-19 pandemic patterns across the United States, and to design statistical methodologies and machine learning pipelines that support a more thorough understanding of these patterns and enable the use of artificial intelligence.</p> <title>METHODS</title> <p>This paper processes and combines an extensive collection of publicly available datasets to provide a unified information source for representing geographical regions with regard to their pandemic-related behavior. The features are grouped into categories to account for their impact based on the higher-level concepts associated with them. This work uses several correlation analysis techniques to observe value and order relationships between features, feature groups, and COVID-19 occurrences. Dimensionality reduction techniques and projection methodologies are used to elaborate on the individual and group importance of these representative features. In addition, a recurrent neural network (RNN)-based inference pipeline called DoubleWindowLSTM-CP is designed in this work for predictive event modeling, utilizing sequential patterns while enabling concise record representation.</p> <title>RESULTS</title> <p>The primarily quantitative results of our statistical analytics revealed critical patterns that reflect many of the expected collective behaviors and their associated outcomes. As an example, a Pearson correlation of 41% indicates a well-defined relationship between the proportion of public transit among the methods of commuting to work and the daily number of confirmed cases. Regarding deep learning, our DoubleWindowLSTM-CP instance with a time window of t=10 days exhibits clear training convergence and efficient prediction results.</p> <title>CONCLUSIONS</title> <p>Regional features can be leveraged along with pandemic patterns to enable efficient predictive modeling and thus help researchers and policymakers gain a more in-depth knowledge of the pandemic patterns. We have made the results of this study publicly available, along with our code and platform, to help expedite research in this area.</p>
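The "value and order relationships" mentioned in the abstract are commonly measured with the Pearson coefficient (value) and the Spearman rank coefficient (order). The following is a minimal sketch of both, assuming that pairing; it is not the authors' pipeline. The function names and the toy numbers are hypothetical and serve only to illustrate the computation.

```python
import math


def pearson(x, y):
    """Pearson correlation: linear (value) relationship between x and y."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (norm_x * norm_y)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks (order)."""
    return pearson(ranks(x), ranks(y))


# Hypothetical toy values, NOT taken from the study's data:
# per-region share of commuters using public transit, and daily confirmed cases.
transit_share = [0.01, 0.03, 0.05, 0.10, 0.30]
daily_cases = [2, 5, 4, 20, 90]

print("Pearson: ", round(pearson(transit_share, daily_cases), 3))
print("Spearman:", round(spearman(transit_share, daily_cases), 3))
```

Reporting both coefficients helps separate a roughly linear association from one that is merely monotonic, which matters when a few regions have extreme values.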
Keywords
COVID-19,Dimensionality reduction,Pandemics,Statistical analysis,Pipelines,Medical services,Predictive models