Original Articles

A Mixture Model Approach in Gene–Gene and Gene–Environmental Interactions for Binary Phenotypes

Pages 1150-1177 | Received 12 Sep 2007, Accepted 06 Feb 2008, Published online: 20 Nov 2008
 

Abstract

In translational research, a genetic association study of a binary outcome has a twofold aim: to test whether genetic/environmental variables or their combinations are associated with a clinical phenotype, and to determine how those combinations are grouped to predict the phenotype (i.e., which combinations have similarly distributed phenotypes and which have differently distributed phenotypes). The second part of this aim has high clinical appeal because it can directly facilitate clinical decisions. Although traditional logistic regression can detect gene-gene or gene-environmental interaction effects on binary phenotypes, it cannot decisively determine how genotype combinations are grouped to predict the phenotype. Our proposed mixture model approach is valuable in this context. It concurrently detects main and interaction effects of genetic and environmental variables through a likelihood ratio test (LRT) and conducts phenotype cluster analysis based on genetic and environmental variable combinations. The theoretical distribution of the proposed mixture model's LRT is robust not only to small sample sizes but also to unequal sample sizes across genotype and environmental subgroups. Hypothesis testing through the LRT yields a fast algorithm for p-value calculations. Extensive simulation studies demonstrate that the mixture model, the overall test in logistic regression, and Monte Carlo-based logic regression consistently possess the best power to detect multi-way gene/environmental combinations. The mixture model approach has the highest probability of recovering the true partition in the simulation studies. Its applications are exemplified in interim data analyses for two cancer studies.
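To make the idea of a likelihood ratio test on a binary phenotype concrete, the following is a minimal sketch of a standard LRT for homogeneity of case proportions across genotype/environment cells. It is not the mixture model proposed in the article: the chi-square reference distribution is an asymptotic approximation assumed here for illustration, the function names are hypothetical, and the cell counts in the usage lines are invented.

```python
import math


def binom_loglik(cases, totals, probs):
    """Binomial log-likelihood (up to an additive constant) over cells."""
    ll = 0.0
    for y, n, p in zip(cases, totals, probs):
        if y > 0:
            ll += y * math.log(p)
        if n - y > 0:
            ll += (n - y) * math.log(1.0 - p)
    return ll


def chi2_sf(x, df):
    """Upper tail of a chi-square distribution via the incomplete-gamma series."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-15 * total:
        term *= z / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def lrt_homogeneity(cases, totals):
    """LRT of 'all cells share one case probability' vs. 'one probability per cell'.

    Assumption of this sketch: the statistic is referred to a chi-square
    distribution with (number of cells - 1) degrees of freedom.
    """
    pooled = sum(cases) / sum(totals)
    ll_null = binom_loglik(cases, totals, [pooled] * len(cases))
    ll_alt = binom_loglik(cases, totals, [y / n for y, n in zip(cases, totals)])
    stat = 2.0 * (ll_alt - ll_null)
    df = len(cases) - 1
    return stat, df, chi2_sf(stat, df)


# Hypothetical counts: cases and sample sizes in three genotype cells.
stat, df, p_value = lrt_homogeneity([2, 10, 18], [20, 20, 20])
print(f"LRT statistic = {stat:.2f}, df = {df}, p = {p_value:.2g}")
```

A test of this kind only answers whether the cells differ at all; it does not say which cells group together, which is the gap the mixture model's cluster analysis is described as filling.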

ACKNOWLEDGMENTS

The research is sponsored by NIH grants R01 GM74217 (LL), U-01 GM61373 (DF), and R-01 GM56898 (DF).

Notes

Per-comparison∗ type I error refers to the p-value for testing whether there is any difference in phenotype among combination cells defined by A pair of variables. Family-wise∗∗ type I error refers to the p-value for testing whether there is any difference in phenotype among combination cells defined by ANY pair of variables.

Note: g = (1,…,9) corresponds to genotype cells AA/BB, AA/Bb, AA/bb, Aa/BB, Aa/Bb, Aa/bb, aa/BB, aa/Bb, and aa/bb, respectively; g₁ = (1, 2, 3) corresponds to genotypes AA, Aa, and aa, respectively; and g₂ = (1, 2, 3) corresponds to genotypes BB, Bb, and bb, respectively.
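The cell numbering in the note above follows a row-major layout over the two loci, so the combined index can be derived from the two single-locus indices as g = 3 × (first-locus index − 1) + second-locus index. A minimal sketch of that mapping, with hypothetical function names:

```python
# Genotype labels in the order given by the note (index 1, 2, 3).
G1_LABELS = ("AA", "Aa", "aa")
G2_LABELS = ("BB", "Bb", "bb")


def cell_index(g1, g2):
    """Combine single-locus indices (each 1..3) into the cell index g (1..9)."""
    if g1 not in (1, 2, 3) or g2 not in (1, 2, 3):
        raise ValueError("g1 and g2 must each be 1, 2, or 3")
    return 3 * (g1 - 1) + g2


def cell_label(g):
    """Return the two-locus genotype label for cell index g (1..9)."""
    if not 1 <= g <= 9:
        raise ValueError("g must be between 1 and 9")
    i1, i2 = divmod(g - 1, 3)  # zero-based indices for locus 1 and locus 2
    return f"{G1_LABELS[i1]}/{G2_LABELS[i2]}"


for g in range(1, 10):
    print(g, cell_label(g))
```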

Note: (per-comparison p-value, family-wise p-value).

Note: mixture model per-comparison p-value: 1 vs. 2 (0.02); 2 vs. 3 (0.27).

Note: MM∗, mixture model; MDR∗∗, multi-dimension reduction. The checkerboard, diagonal, and three-way interaction models use 50% phenocopy (PC).

∗P1 is the overall power under 5% family-wise type I error.

+P2 is the power to detect the two causal SNPs' epistasis model under 5% family-wise type I error.

#P3 is the matched power of selecting the two causal SNPs with cross-validation in logic regression.

$FP is the number of false-positive SNPs selected when all the methods share the matched power P3.

Note: Family-wise and comparison-wise type I errors are both controlled at the 5% level, as only one three-way interaction is tested.

Note: ∗Power is the power to detect the main or interaction effect under each model when the family-wise type I error is controlled at the 5% level.

#Matched power refers to the cross-validation logic regression's power in selecting main or interaction effects.

+FP is the number of false positives at the matched power.

Note: LD interaction means the interacting SNPs are in the simulated LD region of SNPs 3 and 9. All the methods share the matched power derived from the cross-validation logic regression.

Note: per-comparison type I error is controlled at 5%.
