Instrumental Variables Two-Stage Least Squares (2SLS) vs. Maximum Likelihood Structural Equation Modeling of Causal Effects in Linear Regression Models

Pages 876-892 | Published online: 12 Jun 2019
 

Abstract

In the presence of omitted variables or similar validity threats, regression estimates are biased. Unbiased estimates (the causal effects) can be obtained in large samples by fitting the Instrumental Variables Regression (IVR) model instead. The IVR model can be estimated using structural equation modeling (SEM) software or using econometric estimators such as two-stage least squares (2SLS). We describe 2SLS using SEM terminology and report a simulation study in which we generated data according to a regression model in the presence of omitted variables and fitted (a) a regression model using ordinary least squares, (b) an IVR model using maximum likelihood (ML) as implemented in SEM software, and (c) an IVR model using 2SLS. Coverage rates of the causal effect using regression methods are always unacceptably low (often 0). When using the IVR model, accurate coverage is obtained across all conditions when N = 500. Even when the IVR model is misspecified, better coverage than regression is generally obtained. Differences between 2SLS and ML are small and favor 2SLS in small samples (N ≤ 100).
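
The contrast between ordinary least squares and 2SLS described in the abstract can be illustrated with a minimal sketch. It is not the authors' simulation design: the data-generating model, the coefficient values (all set to 0.5), the sample size, and the function names `simulate`, `ols_slope`, and `tsls_slope` are assumptions chosen only for illustration. An omitted variable v affects both x and y, and z is an instrument that affects x but is independent of v.

```python
import numpy as np


def simulate(n, beta_yx=0.5, beta_yv=0.5, beta_xv=0.5, beta_xz=0.5, seed=0):
    """Generate (y, x, z) with an omitted variable v (illustrative values only)."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(n)  # omitted variable, never seen by the analyst
    z = rng.standard_normal(n)  # instrument: related to x, independent of v
    x = beta_xz * z + beta_xv * v + rng.standard_normal(n)
    y = beta_yx * x + beta_yv * v + rng.standard_normal(n)
    return y, x, z


def ols_slope(y, x):
    """Slope from regressing y on x by ordinary least squares."""
    xc, yc = x - x.mean(), y - y.mean()
    return (xc @ yc) / (xc @ xc)


def tsls_slope(y, x, z):
    """Two-stage least squares with one predictor and one instrument."""
    zc, xc, yc = z - z.mean(), x - x.mean(), y - y.mean()
    # Stage 1: regress x on z and keep the fitted values.
    x_hat = zc * ((zc @ xc) / (zc @ zc))
    # Stage 2: regress y on the stage-1 fitted values.
    return (x_hat @ yc) / (x_hat @ x_hat)


y, x, z = simulate(n=200_000)
b_ols = ols_slope(y, x)
b_2sls = tsls_slope(y, x, z)
print(f"true effect = 0.5, OLS = {b_ols:.3f}, 2SLS = {b_2sls:.3f}")
```

Under these assumed values the OLS slope converges to roughly 0.67 rather than the causal effect of 0.5, because the omitted variable induces a covariance between x and the disturbance of y, whereas the 2SLS slope converges to 0.5. With a single instrument the 2SLS slope reduces to cov(z, y) / cov(z, x).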

Notes

1 In this article we consider the joint modeling of y, x, and z as commonly described in the SEM literature. We could have considered instead estimation of y and x conditional on z, in which case z may include binary variables, which are commonly used in applied research.

2 This is (N–1)/N times the sample covariance matrix, where N denotes sample size.

3 It would be correctly specified if $\beta_{yv} = 0$ or $\beta_{xv} = 0$ or, equivalently, $\Psi_{yx} = 0$.

4 Since $\psi_{yx} = \tilde{\beta}_{yv}\tilde{\beta}_{zv}\tilde{\psi}_{vv}$ and we use $\tilde{\psi}_{vv} = 1$, these values lead to population values in the equivalent IVR model of $\psi_{yx} = .1$ and $.2$ (i.e., these are the population covariance values between the disturbances of the predictor and outcome).

5 There is a single degree of freedom available for testing. The usual recommended cutoffs for the RMSEA (Browne & Cudeck, 1993) should not be used to gauge the magnitude of model misfit when there are so few degrees of freedom (Kenny, Kaniskan, & McCoach, 2015). For instance, population RMSEAs ranged from .125 to .532 with an average of .24, suggesting extraordinarily poor fit. In contrast, SRMR values suggest that the models fit closely.

6 When there is a single outcome y, as in the simulation studies reported here, the ML estimator of the IVR model has a closed-form solution (Anderson & Rubin, 1949, 1950), referred to in the econometrics literature as limited information ML (the LIML estimator); see Davidson and MacKinnon (2004) for technical details. However, in this article, we obtained the ML solution iteratively, as implemented in SEM software. This is referred to in the econometrics literature as full information ML.

7 Provided it yields consistent estimates.
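
As a quick numerical check of note 2, the sketch below compares the two covariance estimators on simulated data. The data, the sample size N = 50, and the variable names are hypothetical and are not taken from the article; the only claim illustrated is that the covariance matrix with divisor N equals (N–1)/N times the usual sample covariance matrix with divisor N–1.

```python
import numpy as np

rng = np.random.default_rng(1)
data = rng.standard_normal((50, 3))  # hypothetical N = 50 observations on (y, x, z)
n = data.shape[0]

s_sample = np.cov(data, rowvar=False)        # divisor N - 1 (usual sample covariance)
s_ml = np.cov(data, rowvar=False, bias=True)  # divisor N (the matrix in note 2)

# The two matrices differ only by the scalar factor (N - 1) / N.
same = np.allclose(s_ml, (n - 1) / n * s_sample)
print(same)
```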

Additional information

Funding

This research was supported by the National Science Foundation under Grant No. SES-1659936.
