Applications and Case Studies

Disentangling Bias and Variance in Election Polls

Pages 607-614 | Received 01 Jun 2016, Published online: 25 Jul 2018
 

ABSTRACT

It is well known among researchers and practitioners that election polls suffer from a variety of sampling and nonsampling errors, often collectively referred to as total survey error. Reported margins of error typically only capture sampling variability, and in particular, generally ignore nonsampling errors in defining the target population (e.g., errors due to uncertainty in who will vote). Here, we empirically analyze 4221 polls for 608 state-level presidential, senatorial, and gubernatorial elections between 1998 and 2014, all of which were conducted during the final three weeks of the campaigns. Comparing to the actual election outcomes, we find that average survey error as measured by root mean square error is approximately 3.5 percentage points, about twice as large as that implied by most reported margins of error. We decompose survey error into election-level bias and variance terms. We find that average absolute election-level bias is about 2 percentage points, indicating that polls for a given election often share a common component of error. This shared error may stem from the fact that polling organizations often face similar difficulties in reaching various subgroups of the population, and that they rely on similar screening rules when estimating who will vote. We also find that average election-level variance is higher than implied by simple random sampling, in part because polling organizations often use complex sampling designs and adjustment procedures. We conclude by discussing how these results help explain polling failures in the 2016 U.S. presidential election, and offer recommendations to improve polling practice.
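The decomposition described in the abstract can be illustrated with a simple empirical calculation. The sketch below is a naive, illustrative version, not the hierarchical model actually fit in the article (see Note 5): it treats the mean poll error within an election as the election-level bias and the spread of poll estimates within the election as the election-level variance. The data frame `polls` and its column names are hypothetical.

```python
import pandas as pd

def decompose_errors(polls: pd.DataFrame) -> pd.DataFrame:
    """Naive election-level decomposition of poll errors.

    polls: one row per poll with (hypothetical) columns
      - 'election':      identifier of the race
      - 'poll_estimate': two-party support for the Republican candidate
      - 'outcome':       final two-party Republican vote share in that race
    """
    polls = polls.copy()
    polls["error"] = polls["poll_estimate"] - polls["outcome"]
    grouped = polls.groupby("election")
    return pd.DataFrame({
        "bias": grouped["error"].mean(),             # shared, election-level error
        "variance": grouped["poll_estimate"].var(),  # spread of polls within the race
        "n_polls": grouped.size(),
    })

# Hypothetical usage:
# summary = decompose_errors(polls)
# avg_abs_bias = summary["bias"].abs().mean()   # cf. ~2 percentage points
# avg_variance = summary["variance"].mean()
```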

Acknowledgments

The survey weights discussed in Footnote 2 are based on polls obtained from the iPOLL Databank provided by the Roper Center for Public Opinion Research at Cornell University. The data and code to replicate our results are available online at https://github.com/5harad/polling-errors.

Notes

1 One common technique for setting survey weights is raking, in which weights are defined so that the weighted distributions of various demographic features (e.g., age, sex, and race) of respondents in the sample agree with the marginal distributions in the target population (Voss, Gelman, and King Citation1995).
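Raking is typically implemented by iterative proportional fitting: weights are repeatedly rescaled, one demographic variable at a time, until each weighted sample margin matches the corresponding population margin. The sketch below is a minimal illustration with hypothetical variable names; production survey-weighting code usually adds weight trimming and convergence diagnostics.

```python
import numpy as np
import pandas as pd

def rake(sample: pd.DataFrame, margins: dict, n_iter: int = 50, tol: float = 1e-8) -> np.ndarray:
    """Iterative proportional fitting (raking) of survey weights.

    sample:  one row per respondent, one column per demographic variable.
    margins: {variable: {category: population proportion}}, e.g.
             {"sex": {"male": 0.49, "female": 0.51}}.
    Returns weights scaled to sum to the sample size.
    """
    w = np.ones(len(sample))
    for _ in range(n_iter):
        max_change = 0.0
        for var, target in margins.items():
            total = w.sum()
            factors = np.ones(len(sample))
            for category, pop_share in target.items():
                mask = (sample[var] == category).to_numpy()
                current = w[mask].sum() / total
                if current > 0:
                    factors[mask] = pop_share / current  # rescale toward population margin
            w = w * factors
            max_change = max(max_change, float(np.abs(factors - 1.0).max()))
        if max_change < tol:  # all margins matched to tolerance
            break
    return w * len(sample) / w.sum()

# Hypothetical usage:
# pop_margins = {"sex": {"male": 0.49, "female": 0.51},
#                "age": {"18-44": 0.45, "45+": 0.55}}
# weights = rake(respondents, pop_margins)
```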

2 For a sampling of 96 polls for 2012 senate elections, only 19 reported margins of error higher than what one would compute using the SRS formula, and 14 of these exceptions were accounted for by YouGov, a polling organization that explicitly notes that it inflates variance to adjust for the survey weights. Similarly, for a sampling of 36 state-level polls for the 2012 presidential election, only 9 reported higher-than-SRS margins of error. Complete survey weights are available for 21 ABC, CBS, and Gallup surveys conducted during the 2012 election and deposited into Roper Center’s iPOLL. To account for the weights in these surveys, standard errors should on average be multiplied by 1.3 (with an interquartile range of 1.2–1.4 across the surveys), compared to the standard errors assuming SRS.
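One common way to quantify how much unequal survey weights inflate sampling variability relative to SRS is the Kish approximation to the design effect; the 1.3 factor quoted above corresponds to the square root of such a design effect. The note does not specify the exact variance estimator used, so the Kish approximation in the sketch below is an assumption.

```python
import numpy as np

def kish_se_inflation(weights: np.ndarray) -> float:
    """Kish approximation: factor by which SRS standard errors should be
    multiplied to account for unequal survey weights.

    deff = n * sum(w^2) / (sum(w))^2; the SE inflation factor is sqrt(deff).
    """
    w = np.asarray(weights, dtype=float)
    n = len(w)
    deff = n * np.sum(w ** 2) / np.sum(w) ** 2
    return float(np.sqrt(deff))

# Hypothetical usage with the weights of a single survey:
# w = np.array([...])
# print(kish_se_inflation(w))  # e.g., ~1.3 for moderately variable weights
```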

3 Most reported margins of error assume estimates are unbiased, and report 95% confidence intervals of approximately ± 3.5 percentage points for a sample of 800 respondents. This in turn implies the RMSE for such a sample is approximately 1.8 percentage points, about half of our empirical estimate of RMSE. As discussed in Footnote 2, many polling organizations do not adjust for survey weights when computing uncertainty estimates, which in part explains this gap.
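To make the arithmetic explicit, assume a two-party share near p = 0.5 and n = 800 respondents under SRS:
\[
\mathrm{MOE}_{95\%} \approx 1.96\sqrt{\frac{0.5 \times 0.5}{800}} \approx 0.035,
\qquad
\mathrm{RMSE}_{\text{unbiased}} = \sqrt{\frac{p(1-p)}{n}} \approx 0.018,
\]
so an unbiased poll of 800 respondents would have an RMSE of roughly 1.8 percentage points, about half of the empirical 3.5 points reported above.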

4 Assuming N to be the number of polls, for each poll i ∈ {1, …, N}, let $y_i$ denote the two-party support for the Republican candidate, and let $v_{r[i]}$ denote the final two-party vote share of the Republican candidate in the corresponding election r[i]. Then, RMSE is $\sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - v_{r[i]}\right)^2}$.
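A direct implementation of this definition, using the same hypothetical data layout as above (one row per poll, with the poll's two-party Republican support and the final vote share of the corresponding election):

```python
import numpy as np

def rmse(poll_estimates: np.ndarray, outcomes: np.ndarray) -> float:
    """Root mean square error across polls: sqrt(mean((y_i - v_{r[i]})^2)).

    poll_estimates: two-party Republican support y_i for each poll i.
    outcomes:       final two-party Republican vote share v_{r[i]} of the
                    election the poll was conducted for.
    """
    y = np.asarray(poll_estimates, dtype=float)
    v = np.asarray(outcomes, dtype=float)
    return float(np.sqrt(np.mean((y - v) ** 2)))
```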

5 To clarify our notation, we note that for each poll i, r[i] denotes the election for which the poll was conducted, and $\alpha_{r[i]}$, $\beta_{r[i]}$, and $\tau_{r[i]}$ denote the corresponding coefficients for that election. Thus, for each election j, there is one $(\alpha_j, \beta_j, \tau_j)$ triple. Our model allows for a linear time trend ($\beta_j$), but we note that our empirical results are qualitatively similar even without this term.
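One plausible poll-level parameterization consistent with this note is sketched below; it is an illustration, not necessarily the exact likelihood or priors used in the article. Here $t_i$ denotes the time of poll i relative to election day and $n_i$ its sample size, both labels introduced here for illustration:
\[
y_i \sim \mathcal{N}\!\left(v_{r[i]} + \alpha_{r[i]} + \beta_{r[i]}\, t_i,\;\; \frac{y_i(1-y_i)}{n_i} + \tau_{r[i]}^2\right),
\]
where $\alpha_{r[i]}$ captures election-level bias, $\beta_{r[i]}$ the linear time trend, and $\tau_{r[i]}^2$ excess variance beyond simple random sampling.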

6 To calculate these numbers, we removed an extreme outlier, not shown in the corresponding figure, which corresponds to the polls conducted in Utah in 2004; there are only two polls in the dataset for each race in Utah in 2004.

Additional information

Funding

This work was supported in part by DARPA under agreement number D17AC00001 and by ONR grant N00014-15-1-2541.
