Critical Review
A Journal of Politics and Society
Volume 28, 2016 - Issue 3-4
Original Articles

Not a New Gold Standard: Even Big Data Cannot Predict the Future

 

ABSTRACT

Many scholars believe that the proliferation of large-scale datasets will spur scientific advancement and help us to predict the future using sophisticated statistical techniques. Indeed, a team of researchers achieved astonishing success using the world’s largest event dataset, produced by the ICEWS project, to predict complex social outcomes such as civil wars and irregular government turnovers. However, the secret of their success lay in transforming epistemically difficult questions into easy ones. Forecasting the onset of civil wars becomes an easy task if one relies on explanatory variables that measure how often newspapers report on tensions, fights, or killings. But news reports on prewar conflicts are just variations of the variable that researchers want to predict; the finding that more conflicts are likely to occur when journalists report on conflicts carries little scientific value. A similar success rate in “predicting” interstate wars can be achieved by a simple Google News search for country names and conflict-related news shortly before a conflict is coded as a war. Big data can help researchers to make predictions in simple situations, but there is no evidence that predictions will also succeed in uncertain environments with complex outcomes, such as those characteristic of politics.

Notes

1. Facebook Entry, Nassim Nicholas Taleb, 2 February, 2015. https://www.facebook.com/permalink.php?story_fbid=10152794640733375&id=13012333374

2. The ICEWS dataset was recently made available to the public (Boschee et al. 2015).

3. Official estimates count 85 deaths and 1,813 injured, while unofficial sources claim an even higher toll (Nidhi 2012, 14).

4. Nate Silver's team also failed in their prediction for the UK election. None of their prediction intervals included the true number of seats for the four biggest parties (Lauderdale 2015).

5. These studies are ignored by Metternich et al. (2013) in their discussion of Thailand's conflicts.

6. The UCDP definitions can be found at http://www.pcr.uu.se/research/ucdp/definitions.

7. Moreover, the dataset of Ward et al. (2013b) is based on monthly data for forecasts over a six-month period. This reduces the difficulty of making predictions even further, as the escalation phase of a conflict usually lasts longer than a month before it is coded as a civil war. It is thus imaginable that the UCDP conflict count is already close to the 25-death threshold for a civil war in a given month, allowing for a relatively easy prediction, based on current trends, as to whether the threshold will be surpassed in the next six months.
