Abstract
The most commonly used econometric models for time-on-market in housing studies force researchers into a tradeoff between nonlinearity, which may be addressed with a hazard model, and endogeneity, which may be addressed with a two-stage least squares model. This study introduces two modeling approaches, two-stage predictor substitution and two-stage residual inclusion, into the real estate literature. Each approach addresses both nonlinearity and endogeneity in a single specification. Fit statistics consistently prefer the new methodologies to either two-stage least squares or hazard models of time-on-market. In the current sample, the newer, more appropriate models reverse some commonly accepted results and reinforce others. In addition to being produced by a more econometrically sound technique, the new results are almost universally more intuitively appealing than previous results.
Notes
1 A total of 17,685 observations in the dataset have sufficient information for use in the hedonic pricing model. However, in the interest of consistency, only observations available for all models are used in the estimations reported here.
2 It is worth noting, however, that the estimation approaches used below can address endogeneity in any nonlinear model, not just the Weibull model.
3 Zij includes ATYP, DOP, NOMKT, and POOL, which were excluded from Xij, and excludes NC and the five parking-related variables that Xij includes. These latter variables serve as instruments for the 2SLS, 2SPS, and 2SRI estimators discussed below.
4 The approaches discussed below have been used extensively in the labor and health economics literature, including work by Norton et al. (1998), DeSimone (2002), Norton and Van Houtven (2006), Shin and Moon (2007), and Lindrooth and Weisbrod (2007).
5 In practice, any consistent estimator of LnSP would suffice as a first stage for generating the predicted values. Since the 2SLS estimates from the pricing model are consistent estimators, they are used for this purpose in the current study.
6 In this case, 2SLS would also fail to produce consistent estimates of the parameters of the nonlinear true model.
7 The degree of difference between the two methods will depend on the particular application. When the two methods significantly differ, 2SRI likely provides the more reliable estimates. This is not an issue with the data used for this study.
8 Since it is not appropriate to interpret the coefficient on the included residual, this coefficient is not included in Table 4.
9 The Stata code for implementing the two new methodologies with the bootstrapped standard error correction is available from the authors upon request.
10 Other criteria, such as the stability of coefficient estimates when a small random subsample of the data is used for estimation, were considered by the authors. These criteria are less broadly used than those presented here but were consistent in preferring the 2SPS and 2SRI estimators over the 2SLS or hazard estimators.
11 Alternative model specifications were also considered, dropping subsets of the variables included in Table 4 and excluded from Table 9. These results are consistent with the observation that 2SPS and 2SRI are more robust to model misspecification than 2SLS and hazard models, and are available from the authors on request.
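
Illustrative sketch (not the authors' code). Notes 5 through 7 describe the two-stage estimators in words; the Python sketch below shows the mechanics on simulated data under several simplifying assumptions. The second stage is an exponential duration model (a Weibull model with the shape parameter fixed at 1) with no censoring, the first stage is a simple OLS regression on a single instrument, and every variable name and parameter value (z, u, x, true_slope, and so on) is hypothetical. It does not reproduce the paper's Weibull specification, its instruments, or the bootstrapped standard errors mentioned in note 9.

```python
import math
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(rows, y):
    """Least-squares coefficients via the normal equations."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def fit_exponential(rows, t, iters=50):
    """MLE of an exponential duration model with E[t] = exp(row . beta), by Fisher scoring."""
    beta = ols(rows, [math.log(ti) for ti in t])  # starting values from a log-linear fit
    for _ in range(iters):
        # Score residual is t/mu - 1 and the expected information is X'X,
        # so each scoring step is an OLS regression of that residual on X.
        resid = [ti / math.exp(sum(b * v for b, v in zip(beta, r))) - 1.0
                 for r, ti in zip(rows, t)]
        step = ols(rows, resid)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < 1e-10:
            break
    return beta


# Simulated data (all values hypothetical): u is an unobserved factor that
# drives both the endogenous regressor x and the duration t; z is an instrument.
rng = random.Random(0)
n = 4000
true_slope = -0.5
z = [rng.gauss(0, 1) for _ in range(n)]
u = [rng.gauss(0, 1) for _ in range(n)]
x = [1.0 + zi + ui for zi, ui in zip(z, u)]
t = [max(rng.expovariate(1.0 / math.exp(2.0 + true_slope * xi + ui)), 1e-12)
     for xi, ui in zip(x, u)]

# First stage: regress the endogenous variable on the instrument.
a0, a1 = ols([[1.0, zi] for zi in z], x)
x_hat = [a0 + a1 * zi for zi in z]
v_hat = [xi - xh for xi, xh in zip(x, x_hat)]

# Second stage, three ways.
b_naive = fit_exponential([[1.0, xi] for xi in x], t)[1]       # ignores endogeneity
b_2sps = fit_exponential([[1.0, xh] for xh in x_hat], t)[1]    # substitute the prediction
b_2sri = fit_exponential([[1.0, xi, vi] for xi, vi in zip(x, v_hat)], t)[1]  # add the residual

print("true slope:", true_slope)
print("naive:", round(b_naive, 3), "2SPS:", round(b_2sps, 3), "2SRI:", round(b_2sri, 3))
```

In this simulated design the two-stage slopes should both land near the true value while the naive slope does not, which parallels note 7's remark that the size of the gap between 2SPS and 2SRI depends on the application. The coefficient on the included residual is discarded here, in keeping with note 8.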