Abstract
Dietary factors can have different effects in genetically diverse populations. Gene-environment interaction models are used to study how dietary factors and genetic variation jointly affect lung cancer risk. However, previous study designs have not investigated the degree of type I error inflation and, in some instances, have not corrected for multiple testing. Using a motivating investigation of diet-gene interaction and lung cancer risk, we propose a training and testing strategy and perform real-world simulations to select statistical methods that reduce false-positive discoveries. The simulation results show that the unconstrained maximum likelihood (UML) method controls the type I error better than the constrained maximum likelihood (CML) method. The empirical Bayesian (EB) method is competitive with the UML method in both statistical power and type I error control. We observed a significant interaction between SNP rs7175421 and dietary whole grain in lung cancer prevention, with an effect size (standard error) of −0.312 (0.112) for the EB estimate. SNP rs7175421 may interact with dietary whole grains in modulating lung cancer risk. Evaluating statistical methods for gene-diet interaction analysis can help balance statistical power and type I error.
Acknowledgments
The research data are from dbGaP accession phs001286.v1.p1. We thank the National Cancer Institute (NCI) for access to NCI's data collected by the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial, funded in whole or in part with federal funds from the NCI, US National Institutes of Health (NIH). The datasets were accessed through the NIH database of Genotypes and Phenotypes (dbGaP). The statements contained herein are solely those of the authors and do not represent or imply concurrence or endorsement by NCI.
Authors’ Contributions
JT and HP designed the study, applied for and analyzed the data, and wrote the article. WL helped apply for data and supervised the progress of the study. AEH, JH, and WL were involved in data interpretation. All authors critically revised the article and approved the final manuscript.
Data Availability Statement
The data can be accessed via an application on the dbGaP website: https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs001286.v1.p1
Ethical Approval
The study was approved by the University of Hong Kong Institutional Review Board (HKU/HA HKW IRB No. UW 18-577).
Disclosure Statement
HP reports personal fees from Genentech outside the submitted work. The other authors report no competing interests.