Applying machine learning methods to model social interactions in alcohol consumption among adolescents

Pages 436-443 | Received 03 Sep 2020, Accepted 03 Feb 2021, Published online: 22 Feb 2021
 

Abstract

Background: Existing research using machine learning to investigate alcohol use among adolescents has largely neglected peer influences and has tended to rely on models whose predictors were selected on the basis of data availability rather than a unifying theoretical framework. In addition, previous models of peer influence were typically estimated using traditional regression techniques, which are known to fit less well than models estimated using machine-learning methods.

Methods: Addressing these limitations, we use three machine-learning algorithms to fit a theoretical model of social interactions in alcohol consumption. The model is fit to a large, nationally representative sample of U.S. school-aged adolescents and accounts for various channels of peer influence.

Results: We find that extreme gradient boosting is the best-performing algorithm in predicting alcohol consumption. When the algorithm ranks the explanatory variables by their importance in classification, previous-year drinking status, misperception about friends’ drinking, and average actual drinking among friends emerge as the most important predictors of adolescent drinking.

Conclusions: Our findings suggest that an effective intervention should focus on school peers and adolescents’ perceptions about drinking norms, in addition to the history of alcohol use. Our study may also increase interest in theory-driven selection of covariates for machine-learning models.
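
The idea of ranking explanatory variables by their importance in classification can be illustrated with a small sketch. The example below is not the authors' pipeline: it swaps in a plain logistic classifier and permutation importance (the drop in held-out accuracy when one predictor's values are shuffled) in place of extreme gradient boosting, and it runs on synthetic data. The variable names, coefficients, and sample sizes are all hypothetical assumptions and are not taken from Add Health.

```python
import math
import random

# Hypothetical predictor names; these are illustrative, not Add Health variables.
FEATURES = ["prior_year_drinking", "misperception", "unrelated_noise"]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def make_data(n, rng):
    """Generate synthetic records whose outcome depends on the first two predictors only."""
    rows, labels = [], []
    for _ in range(n):
        prior = 1.0 if rng.random() < 0.4 else 0.0
        misperception = rng.gauss(0.0, 1.0)
        noise = rng.gauss(0.0, 1.0)
        # Assumed data-generating process, chosen only for illustration.
        p = sigmoid(-1.0 + 3.0 * prior + 1.5 * misperception)
        rows.append([prior, misperception, noise])
        labels.append(1 if rng.random() < p else 0)
    return rows, labels


def fit_logistic(rows, labels, epochs=200, lr=0.5):
    """Fit a logistic classifier by batch gradient descent on the log loss."""
    n, k = len(rows), len(rows[0])
    w, b = [0.0] * k, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * k, 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, x))) - y
            grad_b += err
            for j in range(k):
                grad_w[j] += err * x[j]
        b -= lr * grad_b / n
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
    return w, b


def accuracy(w, b, rows, labels):
    """Share of records classified correctly at a 0.5 probability threshold."""
    hits = 0
    for x, y in zip(rows, labels):
        pred = 1 if b + sum(wj * xj for wj, xj in zip(w, x)) >= 0 else 0
        hits += pred == y
    return hits / len(rows)


def permutation_importance(w, b, rows, labels, rng, repeats=5):
    """Average accuracy drop when each predictor's column is shuffled in turn."""
    base = accuracy(w, b, rows, labels)
    scores = {}
    for j, name in enumerate(FEATURES):
        drops = []
        for _ in range(repeats):
            column = [x[j] for x in rows]
            rng.shuffle(column)
            shuffled = [x[:j] + [v] + x[j + 1:] for x, v in zip(rows, column)]
            drops.append(base - accuracy(w, b, shuffled, labels))
        scores[name] = sum(drops) / repeats
    return scores


rng = random.Random(0)
train_x, train_y = make_data(1500, rng)
test_x, test_y = make_data(500, rng)
w, b = fit_logistic(train_x, train_y)
importances = permutation_importance(w, b, test_x, test_y, rng)
ranking = sorted(importances, key=importances.get, reverse=True)
for name in ranking:
    print(f"{name}: {importances[name]:.3f}")
```

Permutation importance is used here because it is model-agnostic and needs only the standard library; tree-boosting libraries typically report their own importance measures, which need not agree with this ranking.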

Acknowledgments

This research uses data from Add Health, a program project directed by Kathleen Mullan Harris and designed by J. Richard Udry, Peter S. Bearman, and Kathleen Mullan Harris at the University of North Carolina at Chapel Hill, and funded by grant P01-HD31921 from the Eunice Kennedy Shriver National Institute of Child Health and Human Development, with cooperative funding from 23 other federal agencies and foundations. Special acknowledgment is due Ronald R. Rindfuss and Barbara Entwisle for assistance in the original design. Information on how to obtain the Add Health data files is available on the Add Health website (http://www.cpc.unc.edu/addhealth). No direct support was received from grant P01-HD31921 for this analysis.

Ethical approval

This article does not contain any studies with human participants performed by any of the authors.

Informed consent statement

This study is a secondary analysis of existing data and involved no new research with human subjects; therefore, no informed consent to participate in the study was obtained.

Disclosure statement

The authors declare that they have no conflict of interest.

Data availability statement

The data that support the findings of this study are available from The Carolina Population Center at The University of North Carolina – Chapel Hill (UNC). Restrictions apply to the availability of these data, which were used under license for the current study and so are not publicly available. Data are, however, available from the authors upon reasonable request and with permission of The Carolina Population Center at UNC.

Notes

1 Add Health used computer-assisted personal interviewing to ensure confidentiality of responses and reduce reporting bias.

2 The paper by Amialchuk et al. (2019) introduced this normative misperception score and focused on estimating the effect of normative misperception on the use of three substances (alcohol, marijuana, and smoking) using a linear regression. The present paper focuses on using machine-learning methods to fit a social interaction model which incorporates several mechanisms of peer influence.

3 See Kuhn and Johnson (2013) and Hastie et al. (2016) for an overview of these methods and for why random forest and extreme gradient boosting are among the more accurate machine-learning algorithms.

Additional information

Funding

The authors received no specific funding for this work.
