Alternatives to logistic regression models in experimental studies: The Journal of Experimental Education: Vol 90, No 1

Abstract

Experiments in psychology or education often use logistic regression models (LRMs) when analyzing binary outcomes. However, a challenge with LRMs is that results are generally difficult to understand. We present alternatives to LRMs in the analysis of experiments and discuss the linear probability model, the log-binomial model, and the modified Poisson regression model. A Monte Carlo simulation assessed bias in point estimates and standard errors as well as power and Type I error rates of the different methods. Findings show that the linear probability and the modified Poisson regression models are valid, unbiased, and in some cases, better alternatives to the LRM when the predictor of interest is a binary variable. An applied example is provided as well.
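As a minimal sketch of why the alternative models are easier to read when the predictor of interest is binary: with a single binary treatment indicator and no covariates, the point estimates from each model have closed forms in the 2x2 table. The linear probability model slope is the risk difference, the exponentiated slope of a log-link model (log-binomial or Poisson) is the risk ratio, and the exponentiated logistic slope is the odds ratio. The function name and the counts below are hypothetical and are not taken from the article; standard errors (including the robust ones that define the modified Poisson approach) are not computed here.

```python
def two_by_two_effects(events_treat, n_treat, events_ctrl, n_ctrl):
    """Return (risk difference, risk ratio, odds ratio) for a 2x2 table.

    Assumes one binary predictor and no covariates, so each model
    simply reproduces the two observed group proportions.
    """
    p1 = events_treat / n_treat  # observed proportion, treatment group
    p0 = events_ctrl / n_ctrl    # observed proportion, control group

    risk_difference = p1 - p0    # slope of a linear probability model
    risk_ratio = p1 / p0         # exp(slope) of a log-link model
    odds_ratio = (p1 / (1 - p1)) / (p0 / (1 - p0))  # exp(slope), logistic
    return risk_difference, risk_ratio, odds_ratio


# Hypothetical counts: 60 of 100 treated and 40 of 100 controls
# experience the outcome.
rd, rr, or_ = two_by_two_effects(60, 100, 40, 100)
print(f"{rd:.2f} {rr:.2f} {or_:.2f}")  # prints "0.20 1.50 2.25"
```

In this illustration the same data yield a 20 percentage point difference, a risk ratio of 1.5, and an odds ratio of 2.25; the odds ratio is the furthest from how the effect would usually be described in words.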

Acknowledgements

Survey data for the applied section were collected in cooperation with the Virginia Department of Criminal Justice Services. This project was supported in part by Grant 2012-JF-FX-0062 awarded by the U.S. Department of Justice, Office of Justice Programs, Office of Juvenile Justice and Delinquency Prevention. The opinions, findings, and conclusions or recommendations expressed in this report are those of the authors and do not necessarily reflect those of the U.S. Department of Justice or the Virginia Department of Criminal Justice Services.

Notes

1 The availability of modern computing power may make this seem like a trivial point, but fitting a multilevel logistic regression model in R using either nlme (Pinheiro, Bates, DebRoy, Sarkar, & R Core Team, 2014) or lme4 (Bates, Mächler, Bolker, & Walker, 2015) may take hours, compared to seconds when fitting a multilevel linear model with the same packages.

2 See Horrace and Oaxaca (2006) for a competing point of view.

3 We note that other methods may involve binary outcomes, such as discriminant function analysis (Lei & Koehly, 2003) and classification and regression trees (Finch & Schneider, 2007), though our particular interest is in evaluating and quantifying treatment effects, not in classification or prediction.

4 We use this example with our students as a practice case after discussing the differences among logits, odds, odds ratios, and probabilities.
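The relationships among the quantities named in this note can be sketched in a few lines. The helper names and the numbers are illustrative assumptions; they are not the classroom example the authors refer to.

```python
from math import exp, log


def prob_to_odds(p):
    """Odds = p / (1 - p), for 0 < p < 1."""
    return p / (1 - p)


def odds_to_prob(odds):
    """Inverse of prob_to_odds."""
    return odds / (1 + odds)


def prob_to_logit(p):
    """Logit = natural log of the odds."""
    return log(prob_to_odds(p))


def logit_to_prob(z):
    """Inverse logit (logistic function)."""
    return 1 / (1 + exp(-z))


def apply_odds_ratio(p_baseline, odds_ratio):
    """Probability implied by multiplying the baseline odds by an odds ratio."""
    return odds_to_prob(prob_to_odds(p_baseline) * odds_ratio)


# The same odds ratio implies different changes in probability
# depending on the (hypothetical) baseline probability.
print(f"{apply_odds_ratio(0.5, 4):.3f}")  # prints "0.800"
print(f"{apply_odds_ratio(0.1, 4):.3f}")  # prints "0.308"
```

An odds ratio of 4 moves a baseline probability of .50 to .80 but a baseline of .10 only to about .31, which is one reason odds ratios are hard to translate into statements about probabilities.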

5 Constraining values to a minimum or maximum is not new. For example, the omega squared effect size used in ANOVA can take negative values. However, a negative proportion of variance is not possible, and in such cases it is often suggested that negative values simply be set to zero (Troncoso Skidmore & Thompson, 2013). Another example is the presence of Heywood cases in structural equation modeling, where communality estimates equal or exceed one. A common ‘solution’ is fixing a negative error variance to zero (Gerbing & Anderson, 1987).

6 This is also a suggested remedy used in logistic regression where the assumption of linearity in the logits is violated (Hosmer & Lemeshow, 2004).

7 For example, to illustrate the effect of using random lottery assignments (when demand is greater than the number of spots available in the school) to make an offer to individuals to enroll in charter schools, Angrist and Pischke (2014) estimate how making an offer (1 = offered, 0 = not offered) predicts treatment take-up (1 = enrolls in school, 0 = does not enroll in school), which is then associated with improved mathematics outcomes. Logistic regression may also be used with IV estimation (Foster, 1997), though the use of dichotomous outcomes has received limited attention (Vansteelandt, Bowden, Babanezhad, & Goetghebeur, 2011).

8 As a proposed remedy to the replication crisis in psychology, researchers are encouraged to preregister the proposed analysis prior to actual data collection (Spellman, 2015). However, if researchers are unsure if the log-binomial model will work, committing to the method is difficult.

9 Because log-binomial models produce similar results and at times require workarounds for convergence (Donoghoe & Marschner, 2018; Williamson et al., 2013), we focused instead on using the modified Poisson regression model for the current study. One approach to fitting a log-binomial model takes its start values from a Poisson regression model, but this requires a two-step process when one step would be sufficient. Another option would be to use a quasi-Poisson regression instead.

10 Chen et al. (2018) used a Medline search of articles from 2005 to 2014 and tracked the growth of articles using log-binomial and modified Poisson regression models that dealt specifically with binary outcomes. The use of log-binomial and Poisson regression models had a compound annual growth rate (CAGR) of approximately 25% and 46%, respectively. In 2014, 75 articles used log-binomial models and 85 articles used modified Poisson regression models, all focused specifically on binary outcomes. In contrast, our search of the American Psychological Association PsycNET database covering the past five years found only one match for the modified Poisson regression (Robertson, King-Kallimanis, & Kenny, 2016) and none for the log-binomial model, showing the relative underuse of these methods. Note: studies using negative binomial models, zero-inflated models, or Poisson models for counts or rates were excluded, as the focus of the current manuscript is on dichotomous outcomes.
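For readers unfamiliar with the CAGR reported in this note, the standard definition is (end value / start value)^(1 / number of periods) - 1. The sketch below applies that definition; the function name and the article counts are hypothetical, and the starting counts and exact number of periods used by Chen et al. are not given in this note.

```python
def cagr(start_value, end_value, periods):
    """Compound annual growth rate over `periods` yearly intervals."""
    if start_value <= 0 or end_value <= 0 or periods <= 0:
        raise ValueError("start_value, end_value, and periods must be positive")
    return (end_value / start_value) ** (1 / periods) - 1


# Hypothetical counts: 100 articles growing to 121 over 2 years
# corresponds to growth of about 10% per year.
growth = cagr(100, 121, 2)
print(f"{growth:.1%}")  # prints "10.0%"
```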

11 We also attempted to fit a log-binomial model using the logbin package (Donoghoe & Marschner, 2018), which greatly aids with convergence issues. However, for the full model, convergence still failed after more than 15 minutes of computing time (using a combinatorial expectation-maximization algorithm), while all the other methods took less than a second. In the original manuscript, a general bullying measure was also used as a predictor, which we omitted here to avoid issues of separation (Allison, 2004).
