Original Research

Beyond Predicting the Number of Infections: Predicting Who is Likely to Be COVID Negative or Positive

Pages 2811-2818 | Published online: 03 Dec 2020

Abstract

Background

This study aims to identify individuals’ likelihood of being COVID negative or positive, enabling more targeted infectious disease prevention and control when there is a shortage of COVID-19 testing kits.

Methods

We conducted a primary survey of 521 adults on April 1–10, 2020 in Iran, where 3% reported being COVID-19 positive and 15% were unsure whether they were infected. This relatively high positive rate enabled us to conduct the analysis at the 5% significance level.

Results

Adults who exercised more were more likely to be COVID-19 negative. Each additional hour of exercise per day predicted a 78% increase in the likelihood of being COVID-19 negative. Adults with chronic health issues were 48% more likely to be COVID-19 negative. Those working from home were the most likely to be COVID-19 negative, and those who had stopped working due to the pandemic were the most likely to be COVID-19 positive. Adults employed in larger organizations were less likely to be COVID-19 positive.

Conclusion

This study enables more targeted infectious disease prevention and control by identifying the risk factors of COVID-19 infections from a set of readily accessible information. We hope this research opens a new research avenue to predict the individual likelihood of COVID-19 infection by risk factors.

Introduction

COVID-19 (coronavirus disease 2019) is overwhelming clinical capacities in many countries. To contain the spread of COVID-19, we need to increase efforts to identify and isolate people who are more likely to be infected early on. The Director-General of WHO had a simple message for all countries: “test, test, test”.Citation1 However, even many developed countries are experiencing severe shortages of test kits.Citation2 One way to overcome limited testing capacity is to supplement it with other techniques that identify the groups of people at greater risk of contracting COVID-19. Early identification based on readily accessible information is of utmost importance to enable more targeted infectious disease prevention, communication, testing, and control. Identifying higher risk groups can reduce the risk of spreading the virus not only to these individuals but also to the medical system and society at large.

Unfortunately, we have limited knowledge about the predictors of who is at greater risk of COVID-19 infection. Many models have been published to predict the number of people infected by COVID-19,Citation3–Citation5 but not who is more likely to be infected. Other research by the US CDC identified who would be more likely to develop severe symptoms once they contract COVID-19.Citation6 This information has already prompted governments and NGOs to take preventive measures tailored to those identified groups of people, such as older people and people with chronic disease.Citation7,Citation8 New evidence has emerged that homeless people and people in care homes are at greater risk of contracting COVID-19.Citation8,Citation9 Had this information been known earlier, more lives could have been saved by more targeted preventive measures.Citation10 Therefore, healthcare services want and need information on the risk predictors of people who are at greater risk of contracting COVID-19.

The purpose of the study is to further our knowledge of the risk factors to allow early identification of individuals more susceptible to COVID-19 infection. These predictors can help target infectious disease prevention and control towards higher risk groups. This knowledge is especially critical to countries experiencing a shortage of test kits. Where physical testing is facing greater shortages, identification of risk via informatics becomes more critical. To identify such predictors, we conducted this study in Iran, a hotspot of COVID-19 with a shortage of test kits.Citation11 We predict individuals’ likelihood of COVID-19 infection based on: (1) demographic variables, including chronic medical conditions and exercise hours;Citation12,Citation13 (2) a set of employment status and work situation variables which can affect individuals’ daily routines and movements and hence the risk of contracting the disease;Citation13,Citation14 and (3) a pair of psychiatric variables on depression and anxiety, due to their effect on how people cope with adversity.Citation15,Citation16 The information on the predictors opens a new research avenue to identify individuals at greater risk of contracting COVID-19 to manage the pandemic with a shortage of test kits.

Methods

COVID-19 hit Iran early and hard, and Iran has been one of the countries most affected by COVID-19 since March 2020. Healthcare modeling in late March, when we designed the study, estimated the COVID-19 crisis in Iran would reach its peak in the first week of April. Accordingly, we surveyed adults in Iran on April 1–10, 2020. On April 1, official statistics reported 47,593 confirmed cases and 3036 deaths with COVID-19. On April 10, official statistics reported 68,192 confirmed cases and 4232 deaths with COVID-19. Overall, 0.08% of Iranians were positive for COVID-19 on April 10.
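The 0.08% figure can be reproduced from the reported case count. The short sketch below assumes a national population of roughly 83 million; that number is not stated in the article and is used here only for illustration.

```python
# Reported in the text: confirmed cases in Iran on April 10, 2020.
confirmed_cases = 68_192

# ASSUMPTION: approximate population of Iran in 2020 (not given in the article).
assumed_population = 83_000_000

# Share of the population with a confirmed infection, in percent.
prevalence_pct = confirmed_cases / assumed_population * 100
print(f"Confirmed cases as a share of population: {prevalence_pct:.2f}%")
```

Any population figure in the low 80 millions gives the same result after rounding to two decimals.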

Survey participants reported their individual COVID-19 infection status as negative; unsure; or positive. Participants also reported whether they had chronic health issues (no; unsure; yes), their exercise hours per day in the past week, and their working situation (worked from home; worked in workplace; stopped work due to COVID-19; unemployed). They completed the Patient Health Questionnaire 4-item (PHQ-4) scale, which measures depression and anxiety, and reported demographic variables such as gender, age, and the size of their work organization (0 for the unemployed). We included organization size because large organizations were deemed to offer better healthcare coverage for their employees in Iran. The survey items can be found in Appendix A.

Participation was entirely voluntary and anonymous, and we distributed the survey through social media (Telegram, Instagram and WhatsApp) given the lockdown. Two of the authors, one from the Faculty of Sport Sciences at Shahid Rejaee University and another from the Faculty of Sport Sciences at Alzahra University, applied for ethics approval from the National Sports Science Research Institute in Iran. The survey was approved (IR.SSRI.REC.1389.685) by the ethics committee of the Sports Science Research Institute in Iran, and its implementation followed the Checklist for Reporting Results of Internet E-Surveys (CHERRIES) (Appendix B). Informed consent was obtained electronically from each participant on the cover page of the online survey.

Statistical Analysis

We analyzed the data using STATA 16.0 at the 5% significance level (95% confidence intervals). Because the outcome variable of individual COVID-19 infection is ordinal (negative; unsure; positive), we predicted it with ordered logistic regressions using the STATA command gologit2. Accordingly, the predictions of individual COVID-19 infection are reported as odds ratios (ORs).
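To illustrate how an ordered logistic model turns a linear predictor into probabilities for the three outcome categories, here is a minimal Python sketch of the standard proportional-odds form. It is not the authors' STATA code: gologit2 fits a generalized ordered logit that can relax the proportional-odds assumption, and the cutpoints and linear predictor below are hypothetical values chosen for illustration, not estimates from the study.

```python
import math


def logistic(z: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def ordered_logit_probs(xb: float, cutpoints: list[float]) -> list[float]:
    """Category probabilities under proportional odds.

    Uses P(Y <= j) = logistic(cutpoint_j - xb), with cutpoints in
    increasing order; the last category absorbs the remaining mass.
    """
    cumulative = [logistic(c - xb) for c in cutpoints] + [1.0]
    probs = []
    previous = 0.0
    for cum in cumulative:
        probs.append(cum - previous)
        previous = cum
    return probs


# HYPOTHETICAL cutpoints and linear predictor (not taken from the article).
cutpoints = [1.5, 3.0]
probs = ordered_logit_probs(0.0, cutpoints)

for status, p in zip(["negative", "unsure", "positive"], probs):
    print(f"P({status}) = {p:.3f}")
```

A larger linear predictor shifts probability mass toward the higher categories (unsure, then positive), which is why a single coefficient can be summarized as one odds ratio across both cutpoints.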

Results

Descriptive Findings

Table 1 contains the descriptive findings. Of the 521 adults who completed the survey, about half were female (51%). The average age was 43.9 years (st.d. 11.7; min: 20; max: 79). At the time of the survey, 44% of the adults worked from home; 26% still went to work in their workplaces; 27% had stopped working due to COVID-19; and 3% were unemployed. The median number of employees in a workplace was 28 (mean: 201.5; st.d. 456.8). Most participants (87%) did not have chronic medical issues; 3% were unsure whether they had chronic medical issues, and the remaining 10% had chronic medical issues. In terms of exercise hours per day in the past week, 56%, 37%, 4%, 1%, and 2% of participants exercised 0, 1, 2, 3, and 4 or more hours per day, respectively. The mean scores on the PHQ-4 for depression and anxiety were 1.7 (st.d. 1.4) and 1.6 (st.d. 1.5) respectively, meaning 22.3% and 21.5% surpassed the cutoff levels of psychiatric screening for depression and anxiety disorders, respectively. In terms of COVID-19, 82% of the participants indicated they did not have COVID-19, 15% were unsure, and 3% reported they were infected by COVID-19.
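For a sense of scale, the reported percentages can be converted back to approximate head counts. The sketch below is only an approximation: the article gives rounded percentages, so the true counts may differ by a participant or two.

```python
# Sample size and rounded percentages as reported in the text.
n = 521
reported_pct = {"negative": 82, "unsure": 15, "positive": 3}

# Approximate counts implied by the rounded percentages.
approx_counts = {
    status: round(n * pct / 100) for status, pct in reported_pct.items()
}

for status, count in approx_counts.items():
    print(f"{status}: about {count} participants")
```

Roughly 16 self-reported positive cases is a small base for regression estimates, which is worth keeping in mind when reading the confidence intervals for the COVID-19 positive outcome.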

Table 1 Demographic Characteristics and COVID-19 Status of the Participants (n=521)

Risk Predictors of Individual COVID-19 Infection Status

Table 2 shows the ordered logistic regression analysis predicting the likelihood of being COVID-19 negative from the alternatives (ie, being unsure or positive). Adults with chronic medical issues were 48% more likely to be COVID-19 negative (OR: 1.48; 95% CI: 1.06 to 2.08; p = 0.023), possibly because they were more cautious, suggesting people had taken seriously the information on the higher fatality rate of people who had comorbidities. Adults who exercised more hours per day were more likely to be COVID-19 negative (OR: 1.78; 95% CI: 1.21 to 2.62; p = 0.003). Each additional hour of exercise per day predicted a 78% increase in the likelihood of being COVID-19 negative.
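The reported odds ratios, confidence intervals, and p-values can be cross-checked against one another. The sketch below assumes the intervals are Wald intervals that are symmetric on the log-odds scale with a critical value of 1.96; this is a common convention, but the article does not state it.

```python
import math


def wald_p_from_or(odds_ratio: float, ci_low: float, ci_high: float,
                   z_crit: float = 1.96) -> tuple[float, float, float]:
    """Recover the log-odds coefficient, its standard error, and a
    two-sided p-value from an OR and its 95% CI.

    ASSUMPTION: the CI is a Wald interval, symmetric on the log scale.
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se
    # Two-sided normal p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return beta, se, p


# Values reported in the text.
_, _, p_chronic = wald_p_from_or(1.48, 1.06, 2.08)
_, _, p_exercise = wald_p_from_or(1.78, 1.21, 2.62)

print(f"chronic issues: p = {p_chronic:.3f}")
print(f"exercise hours: p = {p_exercise:.3f}")
```

Under these assumptions the recomputed p-values come out close to the reported 0.023 and 0.003, so the three reported quantities are mutually consistent.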

Table 2 Ordered Logistic Regression Results Predicting Individuals’ Likelihood of Being COVID-19 Positive or Negative (n=521)

As we captured four work situations (worked from home, worked at workplace, stopped work, and unemployed), we introduced each work situation into the regression as a reference group one by one to conduct a pairwise comparison. Compared with those who worked from home, those who worked at their workplace or had stopped work were, respectively, 69% (OR: 0.31; 95% CI: 0.17 to 0.56; p = 0.000) and 54% (OR: 0.46; 95% CI: 0.25 to 0.84; p = 0.012) less likely to be COVID-19 negative. In other words, those who worked from home were more likely to be COVID-19 negative than those who went to work at their workplace or had stopped working.
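The "69% less likely" and "54% less likely" figures follow directly from the odds ratios. A minimal sketch of the conversion is below; strictly, the percentages describe a change in odds rather than in probability.

```python
def pct_change_in_odds(odds_ratio: float) -> float:
    """Percent change in odds relative to the reference group."""
    return (odds_ratio - 1.0) * 100.0


# Odds ratios reported in the text (reference group: worked from home).
reported = {
    "worked at workplace": 0.31,
    "stopped work": 0.46,
}

for group, odds_ratio in reported.items():
    change = pct_change_in_odds(odds_ratio)
    print(f"{group}: {change:+.0f}% odds of being COVID-19 negative")
```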

We further performed ordered logistic regression analysis to predict the likelihood of being COVID-19 positive from the alternatives. As expected, depression was positively associated with the likelihood of being COVID-19 positive (OR: 6.51; 95% CI: 2.16 to 19.65; p = 0.001), but the association does not imply causality. The pairwise comparison by work situation revealed that the likelihood of being COVID-19 positive among those who had stopped working was 31.15 times that of those who worked from home (OR: 31.15; 95% CI: 1.30 to 743.91; p = 0.034) and 65.79 times that of those who were unemployed (OR: 65.79; 95% CI: 1.41 to 3069.98; p = 0.033). The p-values were significant, but the confidence intervals were wide because few participants reported being COVID-19 positive. The results imply that those who had stopped work had a higher infection rate, perhaps because they were agitated or restless without work, or because they had riskier jobs to begin with and had to stop working. The size of the work organization by number of employees negatively predicted the likelihood of being COVID-19 positive (OR: 0.99; 95% CI: 0.995 to 1.000; p = 0.025), suggesting those who worked in larger organizations were safer.
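One way to see how imprecise the work-situation estimates are is to look at the intervals on the multiplicative (log) scale. The sketch below assumes the intervals are symmetric on the log scale, in which case the geometric midpoint of the bounds should recover the odds ratio; the fold-width (upper bound divided by lower bound) then summarizes the uncertainty.

```python
import math


def log_scale_summary(ci_low: float, ci_high: float) -> tuple[float, float]:
    """Geometric midpoint and fold-width of a confidence interval.

    ASSUMPTION: the interval is symmetric on the log scale.
    """
    midpoint = math.sqrt(ci_low * ci_high)
    fold_width = ci_high / ci_low
    return midpoint, fold_width


# Intervals reported in the text for "stopped work" versus each reference group.
mid_home, width_home = log_scale_summary(1.30, 743.91)
mid_unemployed, width_unemployed = log_scale_summary(1.41, 3069.98)

print(f"vs worked from home: midpoint {mid_home:.1f}, {width_home:.0f}-fold wide")
print(f"vs unemployed: midpoint {mid_unemployed:.1f}, {width_unemployed:.0f}-fold wide")
```

The midpoints land near the reported odds ratios of 31.15 and 65.79, while the intervals span several hundred to a few thousand fold, so the point estimates should be read as indicating direction rather than magnitude.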

It is worth noting that the predictors of being COVID-19 positive differed from the predictors of being COVID-19 negative. Moreover, variables for age, gender, and anxiety did not predict individual COVID-19 infection status at the time of the survey.

Predicted Likelihood of COVID-19 Status by Work Situation

We also report the predicted likelihood of being COVID-19 negative, unsure, or positive by an individual’s work situation, holding the other variables constant (Figure 1). Individuals who worked from home had an 89.5% (probability: 0.895; 95% CI: 0.856 to 0.933; p = 0.000) likelihood of being COVID-19 negative, 0.6% (probability: 0.006; 95% CI: −0.069 to 0.080; p = 0.878) likelihood of being unsure, and 9.9% (probability: 0.099; 95% CI: 0.025 to 0.173; p = 0.009) likelihood of being COVID-19 positive. Overall, those who worked from home were relatively aware of their COVID-19 infection status, whether positive or negative.

Figure 1 (A) Predicted likelihood of being COVID-19 negative. (B) Predicted likelihood of being COVID-19 positive.

Individuals who worked at their workplace had a 73.4% (probability: 0.734; 95% CI: 0.659 to 0.809; p = 0.000) likelihood of being COVID-19 negative, 20.4% (probability: 0.204; 95% CI: 0.130 to 0.279; p = 0.000) likelihood of being unsure, and 6.1% (probability: 0.061; 95% CI: 0.022 to 0.101; p = 0.002) likelihood of being COVID-19 positive. Hence, over 20% of those who worked at their workplace were unsure of their COVID-19 infection status, suggesting this group was likely in a state of uncertainty.

Individuals who had stopped work had an 80.1% (probability: 0.801; 95% CI: 0.735 to 0.867; p = 0.000) likelihood of being COVID-19 negative, 18.6% (probability: 0.186; 95% CI: 0.120 to 0.252; p = 0.000) likelihood of being unsure, and 1.3% (probability: 0.013; 95% CI: −0.007 to 0.032; p = 0.196) likelihood of being COVID-19 positive. A significant proportion of these individuals were also unsure of their COVID-19 infection status.

Individuals who were unemployed had a 74.2% (probability: 0.742; 95% CI: 0.522 to 0.962; p = 0.000) likelihood of being COVID-19 negative, 11.0% (probability: 0.110; 95% CI: −0.129 to 0.348; p = 0.367) likelihood of being unsure, and 14.9% (probability: 0.149; 95% CI: −0.039 to 0.336; p = 0.121) likelihood of being COVID-19 positive.
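Because the three outcomes are exhaustive, the predicted likelihoods for each work situation should sum to 100% up to rounding. The sketch below runs that check on the point estimates reported above; the tolerance of 0.002 is an assumption chosen to allow for three-decimal rounding.

```python
# Predicted likelihoods reported in the text: (negative, unsure, positive).
predicted = {
    "worked from home": (0.895, 0.006, 0.099),
    "worked at workplace": (0.734, 0.204, 0.061),
    "stopped work": (0.801, 0.186, 0.013),
    "unemployed": (0.742, 0.110, 0.149),
}

# ASSUMPTION: allow a small deviation from 1 due to three-decimal rounding.
tolerance = 0.002

totals = {group: sum(values) for group, values in predicted.items()}
for group, total in totals.items():
    status = "ok" if abs(total - 1.0) <= tolerance else "check"
    print(f"{group}: total = {total:.3f} ({status})")
```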

Discussion

COVID-19 test kits have been in short supply since the beginning of the COVID-19 outbreak and continue to be in critical shortage in many countries as the pandemic continues to develop. Given the insufficient testing capacity, we identify a novel approach to predict the likelihood of COVID-19 infection by individual risk factors, as a supplementary approach to identify clusters of individuals with more or less risk of contracting the virus – a critical piece of information to enable more targeted social distancing and isolation practices to contain the virus infection, especially in areas with insufficient testing.

The empirical setting of Iran had a high population-wide COVID-19 infection rate of 0.08% in early April. In our sample of over 500 adults, 3% reported being COVID-19 positive and 15% were unsure of their status. These relatively high rates enable us to conduct the analysis.

First, the results on the predictors of being COVID-19 negative reveal that two groups were more likely to be COVID-19 negative: people who exercised more and people who had chronic medical issues. While it may appear counterintuitive that those who had chronic medical issues were more likely to be COVID-19 negative, the finding is understandable, as people with chronic medical issues likely went out less and likely had taken more action to protect themselves against the COVID-19 disease due to their higher chance of becoming seriously ill or dying if they did get the virus. The exercise finding may reflect that healthier people are more likely to be able to exercise. The finding that those who worked from home had a higher chance of being COVID-19 negative supports the shelter-in or stay-at-home orders in many parts of the world during the pandemic.

Second, the results on the predictors of being COVID-19 positive reveal, somewhat surprisingly, that those who had stopped working had a significantly higher chance of being COVID-19 positive than those who worked from home or were unemployed. Unlike those who worked from home, or those who were unemployed and probably more used to not working, those whose work suddenly stopped due to COVID-19 might be more agitated or restless.Citation17,Citation18 With their daily work taken away and more spare time, they might have had risk exposures elsewhere. It is also possible that those whose work stopped had a riskier job to begin with and therefore had a lower chance of being COVID-19 negative due to their previous exposures at work. Nonetheless, the finding highlights that we should not assume those who have stopped working are safe. They remain at higher risk than the groups who worked from home or had not been employed for a long time (before and during the COVID-19 outbreak). People working in smaller organizations were at greater risk of being COVID-19 positive, suggesting epidemiological preventions could target employees of smaller organizations more.

Lastly, past reports have indicated older people and males were more likely to have COVID-19.Citation19,Citation20 Age and gender have been found to be useful predictors of the mental health of adults during the COVID-19 crisis;Citation16,Citation19 however, they failed to directly predict either COVID-19 negative or COVID-19 positive status in our analysis.

These findings identified a number of risk factors that could enable more targeted epidemiological preventions. The risk factors can help to identify people to prioritize for COVID-19 testing, should testing kits become more available, or in lieu, help implement more targeted social distancing and isolation measures, or conduct more specific communications on infectious disease prevention and control to high-risk groups.

This study has several limitations. While the severity of the COVID-19 crisis in Iran early on presents a setting to predict COVID-19 infection status, the number of COVID-19 positive cases in our sample remained relatively small. Similarly, the number of unemployed participants in the sample was too small to enable more analysis. While we aimed to cover a broad spectrum of adults in Iran, our sample should not be taken as a representative national sample. Also, the risk factors of COVID-19 infection are likely to differ across countries given different cultural and social practices. As the coronavirus remains relatively new, we do not know whether stigma around it leads people with the virus to underreport their status even in an anonymous survey, or how the model may work in other countries that face varying difficulties in dealing with the virus.Citation21 Lastly, our model is predictive, and we do not claim causality.

In summary, the smart use of data and information is key to successful responses to the COVID-19 crisis. Information has enabled the development of various predictive models on daily and total cases, the risk factors of severe illness and death, and fatality rates. This study provides the first attempt to identify individual information as risk factors of COVID-19 infection during the COVID-19 pandemic. We hope this research opens a new avenue of health informatics to identify relevant information as individual risk factors to enable more targeted infectious disease prevention, communication, testing, and control to curtail the pandemic and to complement the effort to expand testing capacity.

Transparency Declaration

The lead author Stephen X. Zhang affirms that this manuscript is an honest, accurate, and transparent account of the study being reported; that no important aspects of the study have been omitted; and that any discrepancies from the study as planned (and, if relevant, registered) have been explained.

Dissemination Declaration

We plan to disseminate the results to study participants.

Data Sharing Statement

No additional data are available.

Author Contributions

All authors made a significant contribution to the work reported, whether that is in the conception, study design, execution, acquisition of data, analysis and interpretation, or in all these areas; took part in drafting, revising or critically reviewing the article; gave final approval of the version to be published; have agreed on the journal to which the article has been submitted; and agree to be accountable for all aspects of the work.

Disclosure

All authors have completed the ICMJE uniform disclosure form at www.icmje.org/coi_disclosure.pdf and declare: no support from any organization for the submitted work; no financial relationships with any organizations that might have an interest in the submitted work in the previous three years; no other relationships or activities that could appear to have influenced the submitted work.

Additional information

Funding

This research was funded by the MOE Project of Key Research Institute of Humanities and Social Sciences at Universities (16JJD630005).

References