Random forest models of food safety behavior during the COVID-19 pandemic

ABSTRACT

Machine learning approaches are increasingly being adopted as data analysis tools for scientific behavioral prediction. This paper uses a machine learning approach, the random forest model, to determine the top predicting variables of food safety behavioral changes during the pandemic. Data were collected from U.S. consumers on risk perception of COVID-19 and foodborne illness (FBI), food safety practice behaviors, and demographics through online surveys at ten time points from April 2020 through May 2021, and at a post-pandemic time point in May 2022. Random forest models were used to predict 14 food safety-related behaviors. The models for predicting handwashing before cooking and handwashing before eating performed well, with F1-scores of 0.93 and 0.88, respectively. Attitude-related variables were found to be important in predicting food safety behaviors. The importance ranking of the predicting variables was found to change over time.

Introduction

Improper food handling in the home kitchen environment puts consumers at risk of foodborne illness (FBI). Research has shown that since the COVID-19 pandemic, some consumers have started to eat more often in their home kitchens (Pan and Rizov 2022), which may be a lasting change. Not only have eating habits changed, but consumers’ food handling and purchasing practices have also changed due to the COVID-19 pandemic (Altarrah et al. 2021; Thomas and Feng 2021b). One common way that researchers explore food safety behaviors is to use behavior change models. Behavior change models, such as the Theory of Planned Behavior (Ajzen 1991), are theoretical frameworks used to understand human behavior. These models use socio-cognitive variables, such as attitudes toward the behavior, subjective norms, and perceived behavioral control, to explain how intended behavior arises. In research, these constructs have often been interpreted as variables such as knowledge, risk perception, and attitudes that are associated with food safety behavior change (Young et al. 2015, 2019).

Some studies have assessed behavioral changes and predictors of food handling practices during the pandemic using traditional statistical tools or behavior change models. Notable among these are the studies by Liu et al. (2021) and Mucinhato et al. (2022), which reported that attitude significantly predicts the intention to handle food safely among consumers during COVID-19, using Structural Equation Modeling and an extended Theory of Planned Behavior, respectively. Using Protection Motivation Theory, Soon et al. (2022) also reported that perceived severity had a significant impact on hygienic practices such as handwashing after shopping: participants who perceived COVID-19 as a serious threat were more likely to wash their hands. However, these studies were predominantly conducted outside the United States, and there has been limited investigation into the predictors of changing food safety behaviors during the COVID-19 pandemic within the U.S. Despite varying beliefs about the virus, including perceptions of it being a hoax, a behavioral impact on consumer practices was observed (Imhoff and Lamberty 2020). Investigating these behaviors is crucial for developing effective risk communication strategies.

Behavior change models are not the only way to explore food safety behaviors. Artificial intelligence, especially machine learning algorithms, offers another way (Kudashkina et al. 2022). Machine learning algorithms that employ predictive modeling for data mining of large datasets can be used to further explore food safety behavior and predict possible outcomes. This approach is somewhat similar to behavior change models in its ability to identify relationships and provide insights into modeled outcomes (Johnson and Wichern 2007). For instance, the random forest model, a type of predictive modeling used in data mining, employs variable values to group datasets into similar categories (Breiman et al. 2017). Random forest has been applied in various studies to explore changes in physical activity (Mooney et al. 2017), health behaviors (Engchuan et al. 2019), and food safety compliance rates (Oldroyd et al. 2021). Compared to behavior change models, predictive modeling approaches such as random forest models can offer a more flexible and non-parametric means of investigating food safety behaviors (VanderWeele 2019). In addition, random forest models have been noted to be capable of handling different variables and modeling complex nonlinear relationships (Cheng et al. 2019), which can sometimes be a challenge with traditional behavioral frameworks. However, unlike behavior change models, which imply causality, predictive modeling methods are more exploratory. Nevertheless, they can uncover key independent variables influencing food safety behavior changes.

Identifying predictors of change, or variables that significantly impact consumer food safety behavior, is crucial for stakeholders planning risk communication for public health events like pandemics. At the time of writing, no study has used machine learning algorithms to identify predictors of food safety behavior changes during the pandemic. This paper therefore sought to address this gap by using a machine learning algorithm to explore food safety behaviors. The main objective of this study was to use random forest models of consumer food safety behaviors, built from U.S. consumer survey data collected during the COVID-19 pandemic, to identify the top predicting variables of food safety behaviors during the pandemic and to explore how they changed over time. The information derived from this study will give a better understanding of consumer behavior during the pandemic and highlight areas for improving consumer education as well as future research. These findings will assist regulatory and other related agencies in developing effective food safety communication strategies and establish a precedent for incorporating machine learning models into the prediction of food safety behaviors alongside traditional behavioral frameworks.

Methods

Data collection

An online survey was developed to collect data on U.S. consumer risk perception of COVID-19 and foodborne illness (FBI), practices of food safety behaviors, and demographic characteristics. A total of 44 questions were developed for this survey. The surveys were administered through Qualtrics XM (Qualtrics, Provo, UT) to the same population of participants at ten time points: monthly from April 2020 to October 2020, and then in January 2021, March 2021, and May 2021. Another round of the survey was administered in May 2022 to collect a post-pandemic time point. This resulted in each time point having at least 700 data points, for a total of 7355 data points. The protocols for the survey are reported in previous studies by the authors (Thomas and Feng 2021a, 2021b).

Data handling

Data were exported from Qualtrics XM (Qualtrics, Provo, UT) into IBM Statistical Package for Social Sciences v.27 (SPSS) (IBM, Armonk, NY). For data preparation before modeling, several modifications and exploratory analyses were conducted on the raw data, including (1) creating composite variables to reduce dimensions; (2) encoding categorical variables as multiple binary variables; (3) rescaling variable values to range from 0 to 1; and (4) removing highly correlated variables and non-pertinent variables (Hussein et al. 2021). These processes improve the likelihood of achieving a model with high performance. Supplementary Table S1 gives more details on how data variables were modified and handled.
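As an illustration only, preparation steps (2) to (4) can be sketched in a few lines of Python. This is not the SPSS workflow used in the study: the column names, the toy values, and the 0.9 correlation cut-off are assumptions made for the example, since the source does not state a threshold.

```python
from math import sqrt
from statistics import mean

def one_hot(values):
    """Encode a categorical column as one binary column per category."""
    categories = sorted(set(values))
    return {c: [1 if v == c else 0 for v in values] for c in categories}

def rescale_01(values):
    """Min-max rescale a numeric column to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length columns."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def drop_correlated(columns, threshold=0.9):
    """Keep a column only if |r| with every already-kept column is below threshold.

    The threshold value is an assumption for this sketch.
    """
    kept = {}
    for name, col in columns.items():
        if all(abs(pearson(col, other)) < threshold for other in kept.values()):
            kept[name] = col
    return kept

# Hypothetical toy survey columns (not data from the study)
gender = ["F", "M", "F", "M"]
concern = [1, 3, 5, 4]          # e.g. a 1-5 Likert item
concern_copy = [2, 6, 10, 8]    # perfectly correlated with `concern`

columns = {
    "concern": rescale_01(concern),
    "concern_copy": rescale_01(concern_copy),
}
columns.update({f"gender_{k}": v for k, v in one_hot(gender).items()})

prepared = drop_correlated(columns)
print(list(prepared))
```

In this toy case the duplicated item and the redundant second gender indicator are both dropped, because each is perfectly correlated (positively or negatively) with a column that was already kept.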

Model selection and food safety behaviors modeled

A random forest is an ensemble, or group, of hundreds of individual classification and regression tree (CART) models that averages the findings of the individual CART models to provide a final result (Fawagreh et al. 2014). Each CART model uses the values of the predicting variables to split the data set into groups, with the objective of creating groups with similar values of the predicted variable. This objective is achieved by maximizing the decrease in impurity at each split, which also corresponds to how important each predicting variable is. The measure of impurity depends on whether the tree is a classification or a regression tree: Gini impurity is used for binary variables (Zhang and Yao 2017) and the least squares deviation metric for continuous variables (IBM 2020). The impurity of the groups before and after splitting on a predicting variable’s value is compared to calculate the decrease. A randomly selected subset of all predicting variables is used to build each tree in a random forest, iteratively splitting on the predicting variable with the highest decrease in impurity.
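To make the notion of a decrease in impurity concrete, the following minimal sketch computes the Gini impurity of a binary outcome before and after a single split. The variable names and values are hypothetical, and the code illustrates the general CART calculation rather than the internals of IBM SPSS Modeler.

```python
def gini(labels):
    """Gini impurity of a list of binary (0/1) labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p1 = sum(labels) / n
    return 1.0 - p1 ** 2 - (1.0 - p1) ** 2

def impurity_decrease(values, labels, threshold):
    """Decrease in Gini impurity from splitting on `values <= threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    # Child impurities are weighted by the share of rows sent to each side.
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(labels) - weighted

# Hypothetical predictor (rescaled to 0-1) and binary behavior
risk_perception = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
washes_hands = [0, 0, 0, 1, 1, 1]

good_split = impurity_decrease(risk_perception, washes_hands, 0.5)
poor_split = impurity_decrease(risk_perception, washes_hands, 0.1)
print(good_split, poor_split)
```

A split that separates the two outcome groups cleanly removes all of the parent node's impurity, whereas a split that isolates a single row removes only a small part of it. Averaging such decreases over all trees for one variable gives the mean decrease in impurity.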

Models were created using IBM SPSS Modeler (IBM, Armonk, NY) to predict fourteen food safety-related behaviors from the survey. The behaviors fell into two categories: two were binary variables and twelve were continuous variables. Each behavior was modeled separately, resulting in fourteen random forest models for each of the ten data collection time points.

Predicting variables in the model were selected based on a statistical test adapted from previous studies that used importance ranking (IR) (Nguyen et al. 2015; Lee et al. 2023). IR is the ordinal ranking of the variable importance associated with each variable (Wei et al. 2015). Variable importance is defined as the strength of the dependence between the independent variables and the dependent variable, and is measured in random forests by averaging the decrease in impurity across all the trees for each predicting variable (i.e., the mean decrease in impurity) (Breiman et al. 2017). Variables that were statistically significant in the test were identified as the predicting variables for each model.

To identify the top predicting variables for each food safety behavior at each time point, the fifteen predicting variables with the largest mean decrease in impurity in each model iteration were ordinally ranked to create an IR. A lower IR (a rank closer to 1) was interpreted as indicating a more important variable in the model. The median IR for each variable was taken across the model iterations to create a single IR per food safety behavior and time point. This median IR was then used as the IR for subsequent analyses.
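The ranking and median steps can be sketched as follows. The variable names and importance values are hypothetical. How a variable that falls outside the top fifteen in some iterations should be handled is not stated in the source, so taking the median only over iterations in which it was ranked is an assumption of this sketch.

```python
from statistics import median

def importance_ranking(mean_decrease, top_n=15):
    """Rank variables 1..top_n by mean decrease in impurity (1 = most important)."""
    ordered = sorted(mean_decrease, key=mean_decrease.get, reverse=True)[:top_n]
    return {name: rank for rank, name in enumerate(ordered, start=1)}

def median_ir(rankings):
    """Median IR per variable across iterations.

    Assumption: a variable missing from an iteration's ranking is skipped
    for that iteration rather than assigned a penalty rank.
    """
    names = set().union(*rankings)
    return {n: median(r[n] for r in rankings if n in r) for n in names}

# Hypothetical mean decrease in impurity from three model iterations
iterations = [
    {"risk_covid": 0.30, "age": 0.10, "concern_food_safety": 0.20},
    {"risk_covid": 0.25, "age": 0.05, "concern_food_safety": 0.28},
    {"risk_covid": 0.40, "age": 0.12, "concern_food_safety": 0.22},
]

rankings = [importance_ranking(it) for it in iterations]
final_ir = median_ir(rankings)
print(final_ir)
```

Because only ranks are retained, the size of the gap between two importance values is lost at this step, which is why rank-based (non-parametric) methods are the natural choice for any later comparison.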

Model creation and evaluations

After the predicting variables were determined, random forest models were built. The datasets were split into training and testing datasets using a 70%/30% split adapted from previous studies (Hastie et al. 2009; Kuhn and Johnson 2019). The training dataset is the data used to create the models, while the testing dataset is the data used to measure the performance of the created models. Models were iterated three times, with the assignment of data points to the training and testing datasets randomized each time, as the models may change slightly depending on which data are used to build them. When creating the models, binary food safety behavior variables were balanced using the synthetic minority oversampling technique (SMOTE) (Chawla et al. 2002). Classifier forests were then built, with model parameters optimized using the rbfOpt optimization function (Diaz et al. 2017). Continuous food safety behavior variables were modeled with regression forests of P × 10 trees, where P is the number of predicting variables (Boehmke and Greenwell 2020).
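A minimal sketch of the resampling scheme described above is given below: three randomized 70/30 splits of the row indices and the P × 10 rule for the number of regression trees. The seeds, the row count of 700, and the function names are illustrative assumptions. SMOTE balancing and parameter optimization are not shown.

```python
import random

def split_70_30(n_rows, seed):
    """Randomly assign row indices to a 70% training and 30% testing set."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    cut = (7 * n_rows) // 10
    return indices[:cut], indices[cut:]

def n_trees(n_predictors):
    """Number of trees for a regression forest under the P x 10 rule."""
    return n_predictors * 10

# Three iterations, each with a different random assignment (seeds are arbitrary)
splits = [split_70_30(700, seed) for seed in (0, 1, 2)]

for train, test in splits:
    print(len(train), len(test))

print(n_trees(22), n_trees(49))
```

Repeating the split with different seeds gives a rough picture of how much the fitted model, and therefore its performance and importance rankings, depends on which rows happen to land in the training set.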

Models for binary food safety behavior variables were evaluated for performance using the F1-score (Hand et al. 2021):

$$F_1 = \frac{2 \times TP}{FN + FP + 2 \times TP}$$

where TP is the number of true positives (classifications of 1 when the actual value is 1), FN is the number of false negatives (classifications of 0 when the actual value is 1), and FP is the number of false positives (classifications of 1 when the actual value is 0), all counted on the testing dataset. The F1-score was averaged across model iterations in Excel (Microsoft, Redmond, WA) to create a mean F1-score. A two-sided t-test was conducted to test whether the mean F1-score from the models was statistically different from 1 at a significance level of 0.05.
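As a worked illustration of the F1-score definition, the sketch below counts TP, FP, and FN from hypothetical test-set labels and predictions for three model iterations and then averages the scores. The data are invented for the example. The study itself performed the averaging in Excel.

```python
from statistics import mean

def f1_score(actual, predicted):
    """F1 = 2*TP / (FN + FP + 2*TP) for binary (0/1) labels."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    denom = fn + fp + 2 * tp
    return 2 * tp / denom if denom else 0.0

# Hypothetical (actual, predicted) pairs from three iterations
iterations = [
    ([1, 1, 1, 1, 0, 0], [1, 1, 1, 0, 1, 0]),  # TP=3, FN=1, FP=1
    ([1, 1, 1, 1, 0, 0], [1, 1, 1, 1, 0, 0]),  # perfect classification
    ([1, 1, 1, 1, 0, 0], [1, 1, 0, 0, 0, 0]),  # TP=2, FN=2, FP=0
]

scores = [f1_score(a, p) for a, p in iterations]
mean_f1 = mean(scores)
print(scores, mean_f1)
```

Note that true negatives do not enter the formula, so the score reflects only how well the positive class (value 1) is recovered.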

Models for continuous food safety behavior variables were evaluated for performance using the root mean square error (RMSE). The RMSE was averaged across model iterations in Excel to create a mean RMSE. A one-tailed t-test was conducted to test whether the mean RMSE from the models was statistically greater than 0 at a significance level of 0.05.
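The RMSE and the test statistic of a one-sample t-test can be sketched as follows. The outcome values and the three per-iteration RMSE values are hypothetical, and the function names are chosen for this example. Only the t statistic is computed here; obtaining a p-value would additionally require the t distribution with n − 1 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

def rmse(actual, predicted):
    """Root mean square error between observed and predicted values."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(mean(squared_errors))

def one_sample_t(values, mu0):
    """t statistic for testing whether the mean of `values` differs from mu0."""
    n = len(values)
    return (mean(values) - mu0) / (stdev(values) / sqrt(n))

# Hypothetical rescaled (0-1) outcome values and model predictions
actual = [0.0, 0.5, 1.0, 0.5]
predicted = [0.1, 0.4, 0.8, 0.7]
error = rmse(actual, predicted)

# Hypothetical RMSE values from three model iterations
rmse_by_iteration = [0.16, 0.18, 0.17]
t_stat = one_sample_t(rmse_by_iteration, 0.0)
print(error, mean(rmse_by_iteration), t_stat)
```

Because the outcomes are rescaled to the 0 to 1 range, the RMSE can be read directly as a typical prediction error on that scale.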

Exploration of predicting variables

To simplify analysis, top-ranking predicting variables were grouped into conceptual clusters based on the type of construct being measured. These were attitude-related variables (such as risk perceptions and concerns), demographic-related variables (such as age and gender), and information-related variables (such as sources of information) (Table 1). The IRs of predicting variables were extracted from each model and compared. Further analysis in the form of a Kendall’s tau rank correlation was conducted on the IRs using SPSS to assess how the ranking order of IRs changed between time points (Allen 2017). Bar plots of the IRs at each time point were constructed to visualize the IRs for each predicting variable on selected food safety behaviors.
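For readers unfamiliar with Kendall’s tau, the sketch below compares the IRs of the same variables at two time points by counting concordant and discordant pairs. It implements the tau-b variant, which adjusts for ties; whether the SPSS analysis in the study used tau-b is an assumption, and the rankings shown are invented.

```python
from itertools import combinations
from math import sqrt

def kendall_tau_b(x, y):
    """Kendall's tau-b rank correlation between two equal-length sequences."""
    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(list(zip(x, y)), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue            # tied in both: excluded from every count
        elif dx == 0:
            ties_x += 1         # tied only in x
        elif dy == 0:
            ties_y += 1         # tied only in y
        elif dx * dy > 0:
            concordant += 1     # pair ordered the same way in both rankings
        else:
            discordant += 1     # pair ordered in opposite ways
    denom = sqrt((concordant + discordant + ties_x)
                 * (concordant + discordant + ties_y))
    return (concordant - discordant) / denom if denom else 0.0

# Hypothetical IRs of five variables at two time points
ir_time_1 = [1, 2, 3, 4, 5]
ir_time_2 = [1, 3, 2, 4, 5]   # two adjacent variables swapped

tau = kendall_tau_b(ir_time_1, ir_time_2)
print(tau)
```

A value of 1 means the two time points order the variables identically, a value of −1 means the order is fully reversed, and intermediate values reflect the share of variable pairs whose relative order changed.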

Table 1. Top predicting variables based on importance ranking clustered into 3 main areas.

Investigating changes in the IRs of predicting variables over time

To further explore why the IRs of the predicting variables changed over time, analysis was conducted on the IRs of the predicting variables that were found to be most important. The researchers explored the possible relationship between COVID-19 cases and the changing patterns in the IRs of predicting variables. Two composite variables were created for predicting variables related to COVID-19 and food safety. These were attitudes towards COVID-19 (A_C19), which consisted of (1) risk perception of getting COVID-19 from others, (2) risk perception of getting COVID-19 from food, and (3) belief that handwashing protects against COVID-19; and attitudes towards food safety (A_FBI), which consisted of (1) overall concern for food safety, (2) self-confidence in food safety practices, and (3) belief that handwashing protects against FBI. The median IR of the three variables in each group was used as the value of the corresponding composite variable.

At each time point and for each food safety behavior, the difference between the IR of attitudes related to FBI and the IR of attitudes related to COVID-19 was calculated to create a difference in IRs (D_A): $D_A = A_{FBI} - A_{C19}$. The median D_A was calculated across the food safety behaviors for each time point to create a single D_A at each time point.
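The composite variables and D_A can be illustrated with a short sketch for a single time point. The IR values and the dictionary layout are hypothetical. Because a lower IR means a more important variable, a positive D_A indicates that COVID-19 attitudes ranked as more important than FBI attitudes.

```python
from statistics import median

def composite_ir(irs):
    """Composite value: the median IR of the variables in one group."""
    return median(irs)

def d_a(a_fbi_irs, a_c19_irs):
    """D_A = A_FBI - A_C19, using the median IR of each group."""
    return composite_ir(a_fbi_irs) - composite_ir(a_c19_irs)

# Hypothetical IRs at one time point: three A_FBI and three A_C19 variables
# per food safety behavior
time_point = {
    "wash hands before cooking": {"A_FBI": [4, 6, 9], "A_C19": [1, 2, 5]},
    "wash hands before eating": {"A_FBI": [2, 3, 7], "A_C19": [5, 8, 10]},
    "FSB 2": {"A_FBI": [3, 5, 6], "A_C19": [2, 4, 9]},
}

d_a_by_behavior = {
    behavior: d_a(groups["A_FBI"], groups["A_C19"])
    for behavior, groups in time_point.items()
}
median_d_a = median(d_a_by_behavior.values())
print(d_a_by_behavior, median_d_a)
```

Using the median at both stages keeps the summary on the ordinal rank scale and limits the influence of a single behavior whose rankings differ sharply from the others.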

The median D_A values were compared to the change in U.S. COVID-19 case numbers from the previous month. U.S. COVID-19 case counts from March 2020 to May 2022 were extracted from the WHO website (WHO 2022). The cases reported within each month were summed to create the total number of U.S. COVID-19 cases in that month. The change in case number (D_case) was then calculated as the difference between the totals of consecutive months in chronological order, so that a negative D_case corresponds to a decrease in cases from the previous month. D_case was correlated with the median D_A using Spearman’s correlation test at a significance level of 0.05 (Chen and Popovich 2002).
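The month-over-month change and the rank correlation can be sketched as below. All numbers are invented. The sign convention (current month minus previous month, so that a negative D_case means fewer cases than the month before) is an assumption chosen to match how negative values are read in the Results.

```python
from math import sqrt
from statistics import mean

def monthly_change(monthly_totals):
    """D_case: each month's total minus the previous month's total."""
    return [cur - prev for prev, cur in zip(monthly_totals, monthly_totals[1:])]

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical monthly case totals and median D_A values
monthly_totals = [100, 150, 120, 200, 180]
d_case = monthly_change(monthly_totals)   # one value per month after the first
median_d_a = [2, -1, 3, -2]               # aligned with d_case

rho = spearman(d_case, median_d_a)
print(d_case, rho)
```

Spearman’s correlation is a natural fit here because D_A is built from ordinal ranks, so only the ordering of the values, not their spacing, carries meaning.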

Results

The current study built 420 random forest models (ten time points multiplied by fourteen food safety behaviors, multiplied by three iterations) to identify the top predicting variables and how they may have changed over the ten data collection time points. Depending on the food safety behavior, 22 to 49 predicting variables were identified for each model. For ease of interpretation, the predicting variables could be combined into one of nine categories (Supplementary Table S2).

Supplementary Table S3 shows the study participants’ demographics. Each month had responses from more than 700 participants, with the majority of the participants being White (non-Hispanic).

Performance evaluation of models

The F1-scores were close to 1, with a minimum value of 0.83 (Table 2), which indicates that the models created for the binary variables performed well. All the models at each time point, with the exception of April 2020, had an F1-score statistically less than 1 (p < 0.05). This indicates that, although the models were not perfect, they performed well. Comparing the different models, those predicting whether “consumers wash their hands before cooking” had a mean F1-score of 0.93, and those predicting whether “consumers wash their hands before eating” had a mean F1-score of 0.88, which suggests that the models for handwashing before cooking performed better.

Table 2. Average F1-score of models for each time point and binary food safety behavior variables.

All RMSE values were statistically greater than 0 (p < 0.05), suggesting that the models created in the current study were not perfect (Table 3). However, all RMSE values were less than 0.27 (close to 0), which indicates that the models performed well. Some food safety behaviors were predicted more accurately than others. As seen in Table 3, the models predicting “consumer frequency of handwashing during the COVID-19 pandemic” (FSB 2) and “consumer frequency of sanitizing kitchen surfaces during the COVID-19 pandemic” (FSB 12) were the most accurate on average, with a mean RMSE of 0.17 across time points. The least accurate models were those predicting “consumer frequency of meat thermometer usage before the COVID-19 pandemic”, with a mean RMSE of 0.22 across time points. In this paper, subsequent results are reported only for the models that performed better, that is, the models predicting whether consumers “wash their hands before cooking” and “wash their hands before eating”, and the models predicting “frequency of handwashing during the COVID-19 pandemic” (FSB 2) and “frequency of sanitizing kitchen surfaces during the COVID-19 pandemic” (FSB 12).

Table 3. Average mean RMSE of models for each time point and continuous food safety behavior variables.

IRs of predicting variables

The IRs of predicting variables were extracted from each model and compared. The top fifteen predicting variables in each model were further clustered into three major categories: attitude-related, demographic-related, and information-related predicting variables. Table 1 shows the specific variables for each category.

Among the three categories, the top predicting variables were the attitude-related predicting variables (Figure 1), which mainly had a median IR of 6 or lower across the different food safety behaviors, except in May 2020, August 2020, and January 2021 for the food safety behavior “Washing hands before cooking”. This suggests that attitude-related predicting variables on COVID-19 and food safety were substantially important when predicting food safety behaviors during the COVID-19 pandemic. The IRs of demographic-related and information-related predicting variables often overlapped. However, demographic-related predicting variables were frequently ranked as more important than information-related predicting variables.

Figure 1. The median ranks of attitudes-related predicting variables, demographic-related predicting variables, and information-related predicting variables for each time point and food safety behavior. The Y-axis is the median IR, and the X-axis is the time point subdivided by food safety behavior. Blue bars correspond to the median IR of attitudes-related predicting variables; orange bars correspond to the median IR of demographic-related predicting variables; grey bars correspond to information-related predicting variables.

Correlation in the order of IRs

The correlations of the IRs between time points varied considerably in all models for each food safety behavior variable, which suggests that the relationship between predicting variables can change over time when predicting food safety behaviors. Most models had strong correlations, except for the models predicting whether a consumer “washes their hands before cooking”. The IR correlations for the models of the food safety behavior with the lowest mean RMSE across time points, namely the models predicting the “frequency of handwashing with soap and water during the COVID-19 pandemic”, are displayed in Table 4. Tables showing correlations between time points for the other food safety behaviors can be found in Supplementary Tables S4-S6.

Table 4. Kendall’s τ rank correlations between time points for models predicting consumers’ frequency of handwashing with soap and water during the COVID-19 pandemic.

In Table 4, the models predicting a consumer’s frequency of handwashing with soap and water during the COVID-19 pandemic showed strong correlations (τ > 0.3) (Allen 2017). Despite this, some pairs of time points had considerably larger correlations than others, with some recording a high τ of 0.722 (p < 0.005) and others a low τ of 0.333 (p < 0.005). The changes in IR correlation between time points indicate that the relationship between factors and food safety behaviors was changing between time points.

Attitudes-related predicting variables: COVID-19 versus FBI

The median D_A and D_case values for each time point show a trend in which the two values tend to share the same sign, being either both positive or both negative (Table 5). The months of May 2020, August 2020, March 2021, and May 2021 all had negative values for both median D_A and D_case. This indicates that in these months there was a decrease in new COVID-19 cases, and attitudes related to foodborne illness were more important than attitudes related to COVID-19. The analysis of this trend using Spearman’s correlation yielded a moderate correlation (r_s = 0.66, p < 0.05) between the median D_A and D_case. As the monthly change in COVID-19 cases increased, the median IRs of attitude-related predicting variables for COVID-19 (A_C19) became smaller than the median IRs of attitude-related predicting variables for food safety (A_FBI), and vice versa. This suggests that when the monthly change in COVID-19 cases increases, attitudes related to COVID-19 are more important than attitudes related to FBI when predicting food safety behaviors, and vice versa.

Table 5. The D_case and median D_A at each time point.

This correlation suggests a relationship between D_case and the time points at which attitudes related to COVID-19 had a lower IR, and were therefore more important, than attitudes related to food safety. This relationship suggests that D_case is related to an external effect that may cause A_C19 to be more important than A_FBI at different time points. This in turn suggests that U.S. COVID-19 cases may influence which factors drove food safety behaviors during the pandemic, as a change in cases correlates with the months in which attitudes related to COVID-19 were more important than attitudes related to FBI, and vice versa.

Discussion

This study was the first to employ random forest models to predict fourteen food safety behaviors at ten different time points throughout the COVID-19 pandemic.

Importance of attitude-related variables

The findings from this study indicate that attitude-related variables were more important than other variables in predicting food safety behaviors during the pandemic. Some literature has supported the importance of attitudes with regard to food safety practices. One such study is by Mucinhato et al. (2022), who identified the individual’s attitude as one variable that significantly impacted consumers’ food safety behavior. Another study, by Lin and Roberts (2020), highlighted the individual’s attitude as one of the important behavioral constructs in the prediction of food safety intent. Several studies and models of behavior change have identified that a change in attitudes precedes positive changes in food safety behaviors (Ajzen 1991; Rimal and Real 2003; Kwol et al. 2020). However, Young et al. (2016) did point out that consumers are generally not concerned about food safety, which is a reflection of their attitude towards food safety. As such, attitudes, being important in shaping food safety behaviors, are a prime area needing improvement for behavioral change. It is recommended that educators and stakeholders embark on food safety programs that center on attitude change.

Findings further indicate that COVID-19-related attitudes were important alongside food safety-related attitudes. This is consistent with existing studies that highlighted how COVID-19 affected various food safety behaviors and values. Preventative behaviors such as handwashing increased during the COVID-19 pandemic (Thomas and Feng 2021b), and this may be attributable to COVID-19 risk perceptions. Mucinhato et al. (2022) identified a positive association between COVID-19 risk perception and household food safety practices. Meixner and Katt (2020) found that during the COVID-19 pandemic, the importance of food safety attributes to consumers when buying food increased substantially, despite COVID-19 not being a foodborne disease (Duda-Chodak et al. 2020). As such, more research is needed into the mechanistic overlap in human behavior between risk perception of COVID-19 or other illnesses and risk perception of foodborne illness, which could reveal novel approaches for educational programs and policy between the FDA and CDC, as this area remains relatively unexplored.

The findings from the current study differ from those of another study using random forest, which found injunctive norms, not attitude-related variables, to be the top predicting variables for behaviors such as handwashing (Van Lissa et al. 2022). However, the current study did not consider injunctive norms, as they were not measured in the dataset used. As such, a future study may compare these top predicting variables using a combined dataset.

Demographic-related variables were also identified as important in determining food safety behaviors. This finding further qualifies the findings of Thomas and Feng (2021a) that gender and income impact behaviors, albeit to a minor degree compared to other variables. Tailoring future educational efforts to the intended demographic might be one approach to effecting food safety behavioral changes.

Monthly change in COVID-19 cases and food safety-related behaviors

An increase in the monthly change in COVID-19 cases was correlated with attitudes related to COVID-19 being more important than attitudes related to FBI when predicting food safety-related behaviors. While this correlation is not causal, it points to a potential framework to explore: how effects related to COVID-19 mediated food safety behaviors. This finding is novel, and more research is needed to determine whether the relationship is causal, as no study has investigated how case numbers relate to risk perception of COVID-19 and foodborne illness.

The exact mechanism by which the number of COVID-19 cases might affect the overall behavioral framework is unclear. However, one potential mechanism that may partially explain the pattern is direct and indirect exposure to COVID-19 (Litwin et al. 2021). Several studies have shown that indirect and direct exposure leads to a higher risk perception of COVID-19 and an increase in protective behaviors that may be related to both COVID-19 and FBI, such as handwashing and kitchen cleaning (Dryhurst et al. 2020; Schneider et al. 2021). This relationship between exposure and risk perception may affect the models such that attitudes related to COVID-19 become more important in determining behaviors. However, it does not explain why attitudes related to FBI are more important than attitudes related to COVID-19 in predicting food safety behaviors when the monthly change in COVID-19 cases decreases. As such, more research may be needed to investigate these potential correlations and the possible mechanisms behind them. Investigating this mechanism might lead to new insights for risk communication and food safety behavior change, which may inform future policy.

COVID-19 had specific political challenges associated with risk communication that are not commonly associated with food safety. For example, many Americans believed that COVID-19 was a hoax, and this had a drastic negative impact on the effectiveness of communication (Tanase et al. 2022). Via the mechanism of indirect and direct exposure, the correlation between an increase in COVID-19 cases and an increase in the importance of COVID-19 risk perception for determining food safety behavior suggests that the illnesses of consumers and their loved ones were the biggest motivation for behavior change. This suggests a possible failure in current policy for consumer risk communication, as it took the deaths and illnesses of others before consumers were convinced. Further research is needed to confirm this claim. Regardless, efforts are needed to prevent risk messaging from becoming politicized.

Limitations

Despite a careful design process, the current study still has some limitations. Due to software limitations when identifying top predicting variables, IRs were used as the primary source of data instead of the actual measure of variable importance, which was unavailable. With ordinal ranking, information about the distribution of the variable importance is lost between models. As such, non-parametric methods were adopted when analyzing the IRs. While SPSS Modeler is convenient because it allows predictive modeling without writing code, future studies may consider alternative tools that enable more robust analyses. Lastly, due to data unavailability and program limitations, we were unable to evaluate important variable interactions and mediating variables for food safety behaviors, such as food safety knowledge. However, many of the variables included can capture relationships with knowledge, and the random forest methodology is capable of capturing most variable interactions (Breiman et al. 2017).

Conclusion

Our study investigated the top predicting variables of food safety behavioral change during the COVID-19 pandemic, using the random forest model, a machine learning algorithm, to analyze data collected from a longitudinal survey. The models’ top predicting variables suggest that attitudes toward food safety and COVID-19 are important for predicting food safety behaviors. This finding not only aligns with current research but also uniquely confirms the significance of both types of beliefs. Future research should explore the overlap between beliefs about COVID-19 and foodborne illness and their impact on food safety behavior, potentially leading to innovative food safety education methods. The analysis also revealed a novel finding: an increase in the monthly change in COVID-19 cases was correlated with attitudes related to COVID-19 being more important than attitudes related to FBI when predicting food safety-related behaviors. This may be due to indirect and direct exposure, suggesting that exposure to the illness is quite effective at causing people’s beliefs and behavior to change. This points to a shortfall in COVID-19 risk communication, likely exacerbated by politicization. Future policies must aim to prevent the politicization of risk communication. Another policy recommendation is to strengthen public health education by investing in campaigns that emphasize the importance of understanding and reacting appropriately to health risks. These campaigns should focus on building trust in health authorities and understanding the implications of diseases like COVID-19 for food safety and personal health. Understanding consumers’ perceptions and behavior changes during health crises like the COVID-19 pandemic will enhance the development of relevant food safety information resources.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Supplementary material

Supplemental data for this article can be accessed online at https://doi.org/10.1080/09603123.2024.2354441

Additional information

Funding

This study was partially supported by the National Institute of Food and Agriculture, U.S. Department of Agriculture (2021-70020-35663 and 2020-68012-31822) and Hatch project 1016049.

References

Ajzen I. 1991. The theory of planned behavior. Organ Behav Hum Decis Process. 50(2):179–211. doi: 10.1016/0749-5978(91)90020-T.
Allen M. 2017. The SAGE encyclopedia of communication research methods. SAGE Publications, Inc.; [accessed 2023 November 10]. doi: 10.4135/9781483381411.
Altarrah D, Alshami E, Alhamad N, Albesher F, Devarajan S. 2021. The impact of coronavirus COVID-19 pandemic on food purchasing, eating behavior, and perception of food safety in Kuwait. Sustainability. 13(16):8987. doi: 10.3390/su13168987.
Boehmke B, Greenwell B. 2020. Hands-on machine learning with R. CRC Press (The R Series). [accessed 2023 November 10]. https://www.routledge.com/Hands-On-Machine-Learning-with-R/Boehmke-Greenwell/p/book/9781138495685.
Breiman L, Friedman JH, Olshen RA, Stone CJ. 2017. Classification and regression trees. 1st ed. Routledge. [accessed 2023 November 10]. doi: 10.1201/9781315139470.
Chawla NV, Bowyer KW, Hall LO, Kegelmeyer WP. 2002. SMOTE: synthetic minority over-sampling technique. J Artif Intell Res. 16:321–357. doi: 10.1613/jair.953.
Cheng L, Chen X, De Vos J, Lai X, Witlox F. 2019. Applying a random forest method approach to model travel mode choice behavior. Travel Behav Soc. 14:1–10. doi: 10.1016/j.tbs.2018.09.002.
Chen YP, Popovich MP. 2002. Correlation. SAGE Publications, Inc.; [accessed 2023 November 10]. doi: 10.4135/9781412983808.
Diaz GI, Fokoue-Nkoutche A, Nannicini G, Samulowitz H. 2017. An effective algorithm for hyperparameter optimization of neural networks. IBM J Res Dev. 61(4/5):9–1. doi: 10.1147/JRD.2017.2709578.
Dryhurst S, Schneider CR, Kerr J, Freeman ALJ, Recchia G, van der Bles AM, Spiegelhalter D, van der Linden S. 2020. Risk perceptions of COVID-19 around the world. J Risk Res. 23(7–8):994–1006. doi: 10.1080/13669877.2020.1758193.
Duda-Chodak A, Lukasiewicz M, Zięć G, Florkiewicz A, Filipiak-Florkiewicz A. 2020. Covid-19 pandemic and food: present knowledge, risks, consumers fears and safety. Trends Food Sci Technol. 105:145–160. doi: 10.1016/j.tifs.2020.08.020.
Engchuan W, Dimopoulos AC, Tyrovolas S, Caballero FF, Sanchez-Niubo A, Arndt H, Ayuso-Mateos JL, Haro JM, Chatterji S, Panagiotakos DB. 2019. Sociodemographic indicators of health status using a machine learning approach and data from the English longitudinal study of aging (ELSA). Med Sci Monit. 25:1994–2001. doi: 10.12659/MSM.913283.
Fawagreh K, Gaber MM, Elyan E. 2014. Random forests: from early developments to recent advancements. Syst Sci Control Eng. 2(1):602–609. doi: 10.1080/21642583.2014.956265.
Hand DJ, Christen P, Kirielle N. 2021. F*: an interpretable transformation of the F-measure. Mach Learn. 110(3):451–456. doi: 10.1007/s10994-021-05964-1.
Hastie T, Tibshirani R, Friedman J. 2009. The elements of statistical learning: data mining, inference and prediction. 2nd ed. New York (NY): Springer.
Hussein AY, Falcarin P, Sadiq AT. 2021. Enhancement performance of random forest algorithm via one hot encoding for IoT IDS. Period Eng Nat Sci (PEN). 9(3):579–591. doi: 10.21533/PEN.V9I3.2204.
IBM. 2020. IBM SPSS Modeler (V18.2.2). Armonk, NY: IBM Corp.
Imhoff R, Lamberty P. 2020 Nov. A bioweapon or a hoax? The link between distinct conspiracy beliefs about the coronavirus disease (COVID-19) outbreak and pandemic behavior. Soc Psychol Personal Sci. 11(8):1110–1118. doi: 10.1177/1948550620934692.
Johnson W, Wichern D. 2007. Applied multivariate statistical analysis. 6th ed. Upper Saddle River, NJ: Pearson.
Kudashkina K, Corradini MG, Thirunathan P, Yada RY, Fraser ED. 2022. Artificial intelligence technology in food safety: a behavioral approach. Trends Food Sci Technol. 123:376–381. doi: 10.1016/j.tifs.2022.03.021.
Kuhn M, Johnson K. 2019. Applied predictive modeling. 1st ed. NY: Springer.
Kwol VS, Eluwole KK, Avci T, Lasisi TT. 2020. Another look into the knowledge attitude practice (KAP) model for food control: an investigation of the mediating role of food handlers’ attitudes. Food Control. 110:107025. doi: 10.1016/j.foodcont.2019.107025.
Lee J, Shin SY, Briand L, Nejati S. 2023. Probabilistic WCET estimation for weakly hard real-time systems. arXiv preprint arXiv:2302.10288.
Lin N, Roberts KR. 2020. Using the theory of planned behavior to predict food safety behavioral intention: a systematic review and meta-analysis. Int J Hosp Manag. 90:102612. doi: 10.1016/j.ijhm.2020.102612.
Litwin H, Levinsky M, Stanley JT. 2021. Network-exposure severity and self-protective behaviors: the case of COVID-19. Innov Aging. 5(2):1–11. doi: 10.1093/geroni/igab015.
Liu Z, Mutukumira AN, Shen C, Dragan D. 2021. Food safety knowledge, attitudes, and eating behavior in the advent of the global coronavirus pandemic. PLOS ONE. 16(12):e0261832. doi: 10.1371/JOURNAL.PONE.0261832.
Meixner O, Katt F. 2020. Assessing the impact of COVID-19 on consumer food safety perceptions—A choice-based willingness to pay study. Sustainability. 12(18):7270. doi: 10.3390/su12187270.
Mooney SJ, Joshi S, Cerda M, Kennedy GJ, Beard JR, Rundle AG. 2017. Contextual correlates of physical activity among older adults: neighborhood environment-wide association study (ne-was). Cancer Epidemiol Biomarkers Prev. 26(4):495–504. doi: 10.1158/1055-9965.EPI-16-0827.
Mucinhato RMD, da Cunha DT, Barros SCF, Zanin LM, Auad LI, Weis GCC, Saccol de ALF, Stedefeldt E. 2022. Behavioral predictors of household food-safety practices during the COVID-19 pandemic: extending the theory of planned behavior. Food Control. 134:108719. doi: 10.1016/J.FOODCONT.2021.108719.
Nguyen TT, Huang JZ, Nguyen TT. 2015. Unbiased feature selection in learning random forests for high-dimensional data. Sci World J. 2015:1–18. doi: 10.1155/2015/471371.
Oldroyd RA, Morris MA, Birkin M. 2021. Predicting food safety compliance for informed food outlet inspections: a machine learning approach. Int J Environ Res Public Health. 18(23):12635. doi: 10.3390/ijerph182312635.
Pan Y, Rizov M. 2022. Consumer behaviour in sourcing meals during COVID-19: implications for business and marketing. Sustainability. 14(21):13837. doi: 10.3390/su142113837.
Rimal RN, Real K. 2003. Perceived risk and efficacy beliefs as motivators of change: use of the risk perception attitude (RPA) framework to understand health behaviors. Hum Commun Res. 29(3):370–399. doi: 10.1111/j.1468-2958.2003.tb00844.x.
Schneider CR, Dryhurst S, Kerr J, Freeman ALJ, Recchia G, Spiegelhalter D, van der Linden S. 2021. COVID-19 risk perception: a longitudinal analysis of its predictors and associations with health protective behaviours in the United Kingdom. J Risk Res. 24(3–4):294–313. doi: 10.1080/13669877.2021.1890637.
Soon JM, Vanany I, Wahab IRA, Sani NA, Hamdan RH, Jamaludin MH. 2022. Protection motivation theory and consumers’ food safety behaviour in response to COVID-19. Food Control. 138:109029. doi: 10.1016/j.foodcont.2022.109029.
Tanase LM, Kerr J, Freeman ALJ, Schneider CR. 2022 Aug 3. COVID-19 risk perception and hoax beliefs in the US immediately before and after the announcement of president Trump’s diagnosis. R Soc Open Sci. 9(8):212013. doi: 10.1098/rsos.212013.
Thomas MS, Feng Y. 2021a. Consumer risk perception and trusted sources of food safety information during the COVID-19 pandemic. Food Control. 130:108279. doi: 10.1016/J.FOODCONT.2021.108279.
Thomas MS, Feng Y. 2021b. Food handling practices in the era of COVID-19: a mixed-method longitudinal needs assessment of consumers in the United States. J Food Protection. 84(7):1176–1187. doi: 10.4315/JFP-21-006.
VanderWeele TJ. 2019. Principles of confounder selection. Eur J Epidemiol. 34(3):211–219. doi: 10.1007/s10654-019-00494-6.
Van Lissa CJ, Stroebe W, van Dellen MR, Leander NP, Agostini M, Draws T, Grygoryshyn A, Gützgow B, Kreienkamp J, Vetter CS, et al. 2022. Using machine learning to identify important predictors of COVID-19 infection prevention behaviors during the early phase of the pandemic. Patterns. 3(4):100482. doi: 10.1016/J.PATTER.2022.100482.
Wei P, Lu Z, Song J. 2015. Variable importance analysis: a comprehensive review. Reliab Eng Syst Saf. 142:399–432. doi: 10.1016/J.RESS.2015.05.018.
WHO. 2022. WHO coronavirus (COVID-19) dashboard. World Health Organization. [accessed 2023 Sep 26]. https://covid19.who.int/.
Young I, Greig J, Wilhelm BJ, Waddell LA. 2019. Effectiveness of food handler training and education interventions: a systematic review and meta-analysis. J Food Protection. 82(10):1714–1728. doi: 10.4315/0362-028X.JFP-19-108.
Young I, Waddell L, Harding S, Greig J, Mascarenhas M, Sivaramalingam B, Pham MT, Papadopoulos A. 2015. A systematic review and meta-analysis of the effectiveness of food safety education interventions for consumers in developed countries. BMC Public Health. 15(1):822. doi: 10.1186/s12889-015-2171-x.
Young I, Waddell L, Nychas G-J. 2016. Barriers and facilitators to safe food handling among consumers: a systematic review and thematic synthesis of qualitative research studies. PLOS ONE. 11(12):e0167695. doi: 10.1371/JOURNAL.PONE.0167695.
Zhang Y, Yao JT. 2017. Gini objective functions for three-way classifications. Int J Approx Reason. 81:103–114. doi: 10.1016/j.ijar.2016.11.005.
