A Deep Learning Approach to Forecast Short-Term COVID-19 Cases and Deaths in the US

medrxiv(2022)

引用 0|浏览2
暂无评分
摘要
Since the US reported its first COVID-19 case on January 21, 2020, the science community has been applying various techniques to forecast incident cases and deaths. To date, providing an accurate and robust forecast at a high spatial resolution has proved challenging, even in the short term. Here we present a novel multi-stage deep learning model to forecast the number of COVID-19 cases and deaths for each US state at a weekly level for a forecast horizon of 1 to 4 weeks. The model is heavily data driven, and relies on epidemiological, mobility, survey, climate, and demographic. We further present results from a case study that incorporates SARS-CoV-2 genomic data (i.e. variant cases) to demonstrate the value of incorporating variant cases data into model forecast tools. We implement a rigorous and robust evaluation of our model – specifically we report on weekly performance over a one-year period based on multiple error metrics, and explicitly assess how our model performance varies over space, chronological time, and different outbreak phases. The proposed model is shown to consistently outperform the CDC ensemble model for all evaluation metrics in multiple spatiotemporal settings, especially for the longer-term (3 and 4 weeks ahead) forecast horizon. Our case study also highlights the potential value of virus genomic data for use in short-term forecasting to identify forthcoming surges driven by new variants. Based on our findings, the proposed forecasting framework improves upon the available forecasting tools currently used to support public health decision making with respect to COVID-19 risk. Evidence before this study A systematic review of the COVID-19 forecasting and the EPIFORGE 2020 guidelines reveal the lack of consistency, reproducibility, comparability, and quality in the current COVID-19 forecasting literature. To provide an updated survey of the literature, we carried out our literature search on Google Scholar, PubMed, and medRxi , using the terms “Covid-19,” “SARS-CoV-2,” “coronavirus,” “short-term,” “forecasting,” and “genomic surveillance.” Although the literature includes a significant number of papers, it remains lacking with respect to rigorous model evaluation, interpretability and translation. Furthermore, while SARS-CoV-2 genomic surveillance is emerging as a vital necessity to fight COVID-19 (i.e. wastewater sampling and airport screening), to our knowledge, no published forecasting model has illustrated the value of virus genomic data for informing future outbreaks. Added value of this study We propose a multi-stage deep learning model to forecast COVID-19 cases and deaths with a horizon window of four weeks. The data driven model relies on a comprehensive set of input features, including epidemiological, mobility, behavioral survey, climate, and demographic. We present a robust evaluation framework to systematically assess the model performance over a one-year time span, and using multiple error metrics. This rigorous evaluation framework reveals how the predictive accuracy varies over chronological time, space, and outbreak phase. Further, a comparative analysis against the CDC ensemble, the best performing model in the COVID-19 ForecastHub, shows the model to consistently outperform the CDC ensemble for all evaluation metrics in multiple spatiotemporal settings, especially for the longer forecasting windows. We also conduct a feature analysis, and show that the role of explanatory features changes over time. Specifically, we note a changing role of climate variables on model performance in the latter half of the study period. Lastly, we present a case study that reveals how incorporating SARS-CoV-2 genomic surveillance data may improve forecasting accuracy compared to a model without variant cases data. Implications of all the available evidence Results from the robust evaluation analysis highlight extreme model performance variability over time and space, and suggest that forecasting models should be accompanied with specifications on the conditions under which they perform best (and worst), in order to maximize their value and utility in aiding public health decision making. The feature analysis reveals the complex and changing role of factors contributing to COVID-19 transmission over time, and suggests a possible seasonality effect of climate on COVID-19 spread, but only after August 2021. Finally, the case study highlights the added value of using genomic surveillance data in short-term epidemiological forecasting models, especially during the early stage of new variant introductions. ### Competing Interest Statement The authors have declared no competing interest. ### Funding Statement This work was funded the NSF Rapid Response Research (RAPID) grant Award ID 2108526 and the CDC Contract #75D30120C09570. ### Author Declarations I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals. Yes I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes All data produced in the present study are available upon reasonable request to the authors.
更多
查看译文
关键词
deep learning,deep learning approach,forecast,short-term
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要