COVID-19

The Interplay of Demographic Variables and Social Distancing Scores in Deep Prediction of U.S. COVID-19 Cases

Pages 492-506 | Received 04 Jan 2021, Accepted 05 Mar 2021, Published online: 27 Apr 2021
 

Abstract

Given the severity of the COVID-19 outbreak, we characterize the growth trajectories of counties in the United States using a novel combination of spectral clustering and the correlation matrix. As the United States and the rest of the world continue to suffer from the effects of the virus, assigning growth membership to counties and understanding the determinants of that growth have become increasingly important. We cluster the counties into two communities (faster versus slower growth trajectories); the average between-group correlation is 88.4%, whereas the average within-group correlations are 95.0% and 93.8%. The average growth rate is 0.1589 for one group and 0.1704 for the other, further suggesting that our methodology captures meaningful differences in the nature of growth across counties. Subsequently, we select the demographic features that are most statistically significant in distinguishing the communities: number of grocery stores, number of bars, Asian population, White population, median household income, number of people with a bachelor’s degree, and population density. Lastly, we effectively predict the future growth of a given county with a long short-term memory (LSTM) recurrent neural network using three social distancing scores. The best-performing model achieves a median out-of-sample R² of 0.6251 for a four-day-ahead prediction, and we find that the number of communities and the social distancing features play an important role in producing more accurate forecasts. This comprehensive study captures the nature of counties’ growth in cases at a micro level using growth communities, demographic factors, and social distancing performance, helping government agencies use known information to decide which counties to target with resources and funding.
Supplementary materials for this article, including a standardized description of the materials available for reproducing the work, are available as an online supplement.
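
To make the clustering step in the abstract concrete, the following is a minimal sketch of a two-way spectral partition that uses a correlation matrix as the affinity, followed by the within-group and between-group average correlations. It is not the authors' algorithm, which is given in the paper and its supplement. The function names `spectral_bipartition` and `mean_correlations`, the normalized-Laplacian formulation, and the choice to clip negative correlations to zero are all assumptions made for illustration.

```python
import numpy as np


def spectral_bipartition(growth):
    """Split the rows of `growth` (one time series per county) into two groups.

    Hypothetical helper: uses the pairwise correlation matrix as the affinity
    and the sign of the second eigenvector of the normalized Laplacian.
    """
    corr = np.corrcoef(growth)              # (n, n) pairwise correlations
    # Assumption: negative correlations are treated as "no similarity".
    affinity = np.clip(corr, 0.0, None)     # new array; corr is left untouched
    np.fill_diagonal(affinity, 0.0)         # no self-loops

    degree = affinity.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(degree, 1e-12))
    # Symmetric normalized Laplacian: I - D^{-1/2} A D^{-1/2}
    laplacian = (np.eye(len(corr))
                 - d_inv_sqrt[:, None] * affinity * d_inv_sqrt[None, :])

    # eigh returns eigenvalues in ascending order; column 1 is the
    # eigenvector of the second-smallest eigenvalue (the Fiedler vector).
    _, eigvecs = np.linalg.eigh(laplacian)
    labels = (eigvecs[:, 1] > 0).astype(int)
    return labels, corr


def mean_correlations(corr, labels):
    """Return (within group 0, within group 1, between groups) mean correlations.

    Assumes each group contains at least two series.
    """
    a = np.flatnonzero(labels == 0)
    b = np.flatnonzero(labels == 1)

    def within(idx):
        block = corr[np.ix_(idx, idx)]
        upper = np.triu_indices(len(idx), k=1)  # distinct pairs only
        return block[upper].mean()

    between = corr[np.ix_(a, b)].mean()
    return within(a), within(b), between
```

Under these assumptions, `growth` would be an array with one row per county and one column per day. A partition is informative when both within-group averages exceed the between-group average, which is the pattern the abstract reports (95.0% and 93.8% versus 88.4%).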

Supplementary Material

The supplementary material provides the algorithm and results for clustering using the regular adjacency matrices as described in Section 2.2.

Acknowledgments

We are grateful to the editor, the associate editor, and the anonymous reviewers for their insightful comments, which have greatly improved the scope and quality of the paper. We would like to thank Unacast Inc. for providing us with their extensive social distancing data.

Additional information

Funding

Tang was supported by NSF Grant DMS-1712591. Feng was supported by NSF Grants DMS-2013789 and DMS-2034022. Fan was supported by NIH funding: Grant 5R01-GM072611-16.
