
Forecasting house prices for the four census regions and the aggregate US economy in a data-rich environment

Pages 4677-4697 | Published online: 25 Jul 2013
 

Abstract

This article considers the ability of large-scale time-series models (involving 145 fundamental variables), estimated by dynamic factor analysis and Bayesian shrinkage, to forecast real house price growth rates of the four US census regions and the aggregate US economy. Besides the standard Minnesota prior, we also use additional priors that constrain the sum of coefficients of the VAR models. We compare 1- to 24-months-ahead forecasts of the large-scale models over an out-of-sample horizon of 1995:01–2009:03, based on an in-sample of 1968:02–1994:12, relative to a random walk model, a small-scale VAR model comprising just the five real house price growth rates and a medium-scale VAR model containing 36 of the 145 fundamental variables besides the five real house price growth rates. In addition to the forecast comparison exercise across small-, medium- and large-scale models, we also look at the ability of the ‘optimal’ model for a specific region (i.e. the model that produces the minimum average mean squared forecast error) to predict ex ante real house prices (in levels) over the period 2009:04 to 2012:02. Factor-based models (classical or Bayesian) perform best for the North East, Mid-West and West census regions and the aggregate US economy, and perform as well as a small-scale VAR for the South region. The ‘optimal’ factor models also tend to predict the downward trend in the data when we conduct an ex ante forecasting exercise. Our results highlight the importance of the information content of a large number of fundamentals in predicting house prices accurately.

JEL Classification:

Acknowledgements

The paper was improved markedly by the comments from an anonymous referee. The usual disclaimer applies.

Notes

1 Refer to Fig. 1 for details regarding different states that are included within a specific census region.

2 Factor models require the use of stationary data; hence, we forecast real house price growth rates, rather than real house prices in levels. BVAR models do not necessarily require stationary data in their estimation, as the stationarity or nonstationarity of a variable can be handled through appropriate prior specification. However, for the sake of appropriate comparison with the factor models and the classical small-scale VAR and random walk models, we also use real house price growth rates in the medium- and large-scale BVAR models. In addition, given that the house price for the aggregate US economy is obtained from a weighted combination of the house prices of the four regions, one is likely to detect a high degree of multicollinearity if one uses all five house prices together in levels in a classical VAR. This problem of high correlation between the house prices of the four census regions and that of the US economy is reduced (by at least 25%) by using the real house price growth rates instead of their corresponding levels, thus allowing us to use the growth rates of the five prices together in a small-scale VAR. Note that, besides the fact that factor models need the data to be stationary, these models are also difficult to interpret, since the factors are latent. Specifically, the factors are some combination of the different variables in the data set. However, the advantage of using factor models to forecast house prices is that many high-frequency data series potentially useful for predicting house prices are volatile. This is particularly the case for variables linked to construction, which are affected inter alia by climate conditions. Factor analysis thus allows one to reduce the noise in the individual variables.

3 Similar results were also obtained by Das et al. (2009, 2011) when forecasting house prices in South Africa.

4 For further details, refer to Section III, which summarizes the data and the Appendix of the paper, which details the data used for the medium- and large-scale models.

5 Note that causality runs in both directions, in the sense that construction and development in a specific area are also likely to raise house prices owing to higher demand for housing.

6 These two studies used the monthly data on house prices, starting in January 1991, obtained from the Federal Housing Finance Agency (previously, the Office of Federal Housing Enterprise Oversight) for the aggregate US economy and the nine census divisions.

7 The reader is referred to Gupta and Das (2008), Das et al. (2011), Gupta et al. (2011a), and Gupta and Miller (2012) for studies that use variations of the Minnesota prior to account for spatial interdependence between house prices of neighbouring and nonneighbouring regions in forecasting these prices.

8 This section relies heavily on the discussion available in Banbura et al. (2010). We also retain their mathematical symbols in representing the equations.

9 This section relies heavily on the discussion available in Gupta et al. (2011a, b).

10 Details of these results are available upon request from the authors.

11 We also used an even smaller BVAR model comprising personal income, industrial production, CPI, total private employees on nonfarm payrolls, M2, and short- and long-term interest rates. The results were qualitatively similar, in the sense that there always existed at least one type of large-scale model that outperformed this BVAR model for each of the five real house price growth rates. The details of these results are available upon request from the authors.

12 LeSage (1990), using a system of four variables for 50 industries comprising three labour market variables and industrial output, highlighted that Bayesian vector error correction models (BVECMs) tended to perform better than classical VARs, VECMs and BVARs, since they not only allowed short-run interactions amongst the variables, thus capturing the dynamics of the system, but also modelled the deviation from the long-run equilibrium relationship. The sum of coefficients prior tends to do the same for larger-scale models, and hence, the improvement in forecasting performance over the BVARs based on the Minnesota prior is understandable.

13 We ensure that the forecasted values and the actual values in all the figures start from the same point by using the actual values for 2009:03 as the first observation of the forecasted series.
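
The point made in note 2, that a factor extracted from many volatile series is less noisy than any individual series, can be illustrated with a minimal sketch. It is not the estimation procedure used in the article: it assumes a simulated panel with a single latent factor, and it estimates the factor by static principal components. The function name `extract_factors`, the panel dimensions and the loading range are all hypothetical choices made for illustration.

```python
import numpy as np

def extract_factors(X, n_factors):
    """Principal-components estimate of common factors from a T x N panel.

    Hypothetical helper for illustration; assumes the columns of X are
    already stationary (e.g. growth rates).
    """
    # Standardize each series so that no variable dominates through its scale.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # The leading singular vectors of the standardized panel span the factor space.
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    factors = U[:, :n_factors] * s[:n_factors]   # T x r estimated factors
    loadings = Vt[:n_factors].T                  # N x r estimated loadings
    common = factors @ loadings.T                # T x N common component
    return factors, loadings, common

# Simulated data (an assumption, not the article's data set):
# one latent factor drives 50 noisy series over 300 periods.
rng = np.random.default_rng(0)
T, N = 300, 50
f = rng.standard_normal(T)                 # latent factor
lam = rng.uniform(0.5, 1.5, size=N)        # factor loadings
X = np.outer(f, lam) + rng.standard_normal((T, N))

factors, loadings, common = extract_factors(X, 1)

# Compare how closely the estimated factor and one raw series track the
# latent factor; the sign of a principal component is arbitrary, hence abs().
corr_factor = abs(np.corrcoef(factors[:, 0], f)[0, 1])
corr_single = abs(np.corrcoef(X[:, 0], f)[0, 1])
print(f"estimated factor vs latent factor: {corr_factor:.3f}")
print(f"single raw series vs latent factor: {corr_single:.3f}")
```

In this simulated setting the estimated factor tracks the latent factor more closely than any single raw series does, because the idiosyncratic noise averages out across the 50 series. The estimated factor remains a combination of all the variables, which is why, as note 2 observes, it is difficult to interpret.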
