
In-Sample or Out-of-Sample Tests of Predictability: Which One Should We Use?

Pages 371-402 | Published online: 06 Feb 2007

Abstract

It is widely known that significant in-sample evidence of predictability does not guarantee significant out-of-sample predictability. This is often interpreted as an indication that the in-sample evidence is likely to be spurious and should be discounted. In this paper, we question this interpretation. Our analysis shows that neither data mining, nor dynamic misspecification of the model under the null, nor unmodelled structural change under the null is a plausible explanation of the observed tendency of in-sample tests to reject the no-predictability null more often than out-of-sample tests. We provide an alternative explanation based on the higher power of in-sample tests of predictability in many situations. We conclude that the results of in-sample tests of predictability will typically be more credible than the results of out-of-sample tests.

Acknowledgment

We have benefited from comments at the 2002 European Econometric Society Meeting, the 2002 NBER Summer Institute and the 2002 EC2 Conference. We also thank seminar participants at Bocconi, Bonn, CORE, the European Central Bank, Exeter, Helsinki, INSEAD, Leuven, Montreal, Pittsburgh, Pompeu Fabra, Southampton, Tokyo Metropolitan, Tokyo, Warwick, Waseda, Yokohama National and York. We especially thank two anonymous referees, Valentina Corradi, Todd Clark, Frank Diebold, Robert Engle, Scott Gilbert, Clive Granger, Alastair Hall, Kirstin Hubrich, Michael McCracken, Peter Reinhard Hansen, Barbara Rossi, Norman Swanson, and Ken West for helpful discussions. Part of this research was conducted while the second author served as an adviser at the European Central Bank (ECB). The views expressed in this paper do not necessarily reflect the opinion of the ECB or its staff.

Notes

a. This paper does not deal with forecast accuracy tests for nonnested models (see, e.g., West, 1996).

b. We focus on asymptotic results because finite-sample size distortions in practice can be effectively eliminated by the use of bootstrap methods (see, e.g., Clark and McCracken, 2004; Kilian, 1999; Kilian and Taylor, 2003; Mark, 1995; Rapach and Wohar, 2003).

c. McCracken (2001) studies out-of-sample inference involving forecasting models that were in turn selected by an inconsistent model selection procedure. His methodology, however, presumes that no respecification of the forecast model occurs after the out-of-sample test is conducted. Thus, he rules out data mining of the form described here.

d. Hansen (2001) discusses some possible drawbacks of White's proposal. Note that these possible drawbacks do not apply in our context because our model is nested and the null hypothesis holds with equality.

e. Our analysis is a natural extension of work in classical statistics on the testing of multiple hypotheses (see, e.g., Anderson, 1994; Dasgupta and Spurrier, 1997; Royen, 1984). A similar framework has also been used by Hansen (2000), who proposed bootstrap inference for the distribution of R² in the presence of data mining.

f. There is one counterexample to this tendency, in which out-of-sample tests will tend to have higher power than in-sample tests. Suppose that the break in β occurs at exactly [λT] with λ = 0.5, and further suppose that β = −c in the first half of the sample and β = c in the second half, where c is some constant. In that case, the in-sample test has zero power asymptotically, whereas the out-of-sample test retains some power. This counterexample, however, is more of an intellectual curiosity because it requires three unrealistic conditions. First, a switch in sign seems unlikely in situations that would suggest the use of a one-sided t-test, as is typically the case in applied work. Second, it is unlikely that the deviations from β = 0 exactly offset one another. Third, it is unlikely that the break occurs exactly at [0.5T]. Even for small deviations from these assumptions, the counterexample breaks down.
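The intuition behind note f can be illustrated with a small simulation (a hypothetical sketch of our own, not code from the paper; the data-generating process and variable names are assumptions made for illustration): when the slope flips from −c to +c at exactly the midpoint, a full-sample ("in-sample") regression averages the two halves to a slope near zero, while each half-sample regression on its own recovers the nonzero slope that an out-of-sample scheme could exploit.

```python
# Hypothetical sketch: sign-switching break at [0.5T] kills the
# full-sample slope estimate but not the half-sample estimates.
import numpy as np

rng = np.random.default_rng(0)
T, c = 1000, 0.5
x = rng.standard_normal(T)                       # predictor
beta = np.where(np.arange(T) < T // 2, -c, c)    # beta = -c, then +c
y = beta * x + rng.standard_normal(T)            # predictand

def ols_slope(x, y):
    """Slope of a no-intercept OLS regression of y on x."""
    return float(x @ y / (x @ x))

full = ols_slope(x, y)                           # offsetting halves
first = ols_slope(x[: T // 2], y[: T // 2])
second = ols_slope(x[T // 2 :], y[T // 2 :])

print(f"full-sample slope:  {full:+.3f}")   # near 0: in-sample test has ~no power
print(f"first-half slope:   {first:+.3f}")  # near -c
print(f"second-half slope:  {second:+.3f}") # near +c
```

The same logic explains why the counterexample is fragile: shifting the break date away from [0.5T], or making the two deviations unequal in magnitude, leaves a nonzero full-sample slope and restores the power of the in-sample test.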

g. A related test that is robust against dynamic misspecification has also been proposed by Chao et al. (2001). Yet another test of predictability that allows for dynamic misspecification under the null is presented in Corradi and Swanson (2002), but that paper focuses on testing the equal predictive accuracy of two nested models against the alternative of possibly nonlinear predictability.

