ABSTRACT
The varying coefficient (VC) model introduced by Hastie and Tibshirani [Citation26] is arguably one of the most remarkable recent developments in nonparametric regression theory. The VC model is an extension of the ordinary regression model where the coefficients are allowed to vary as smooth functions of an effect modifier possibly different from the regressors. The VC model reduces the modelling bias with its unique structure while also avoiding the ‘curse of dimensionality’ problem. While the VC model has been applied widely in a variety of disciplines, its application in economics has been minimal. The central goal of this paper is to apply VC modelling to the estimation of a hedonic house price function using data from Hong Kong, one of the world's most buoyant real estate markets. We demonstrate the advantages of the VC approach over traditional parametric and semi-parametric regressions in the face of a large number of regressors. We further combine VC modelling with quantile regression to examine the heterogeneity of the marginal effects of attributes across the distribution of housing prices.
Acknowledgments
We thank Charles Leung, Chris Parmeter, two anonymous referees and participants at the 80th International Atlantic Economic Conference held in Boston, October 2015, where an earlier version of this paper was presented, for helpful comments. Thanks are also due to the Centaline Property Agency Ltd. for supplying the data used in this study. All remaining errors are ours.
Disclosure statement
No potential conflict of interest was reported by the authors.
Notes
1. Within the conventional nonparametric framework, categorical variables are handled by a frequency-based approach that splits the sample into cells. However, when the number of cells is large, each cell may have insufficient observations to estimate nonparametrically the relationship among the remaining continuous variables. For this reason most empirical studies involving discrete regressors use a semi-parametric approach.
2. Refinements to the additive model allowing for discrete regressors have been considered by Fan et al. [Citation17], and Camlong-Viot et al. [Citation8].
3. The VC model considered here must not be confused with a different model bearing the same name considered by Knight et al. [Citation32]. The latter is essentially a parametric linear regression model that allows the regression coefficients to vary.
4. The sub-indices are referred to as ‘adjusted unit prices’ within the CCI system.
5. See http://hk.centadata.com/cci/cci_e.htm, Bao and Wan [Citation2] and Bucchianeri [Citation4] for a description of the index.
6. We choose the semi-log specification based on the results of the Box-Tidwell test [Citation35], with LGAREA and U as transformation variables in the model. The test suggests […].
7. The multifold cross-validation criterion selects the optimal bandwidth h that minimises the average mean-squared (AMS) error: […], where […], and the […]'s are computed based on the sub-sample […]. In our analysis, we set […] and […] to obtain the bandwidth for QR.
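The general idea of choosing a bandwidth by cross-validation in a VC model can be illustrated with a minimal sketch. It is not the criterion or the code used in the paper: it assumes a generic Q-fold split, an Epanechnikov kernel, a local-constant fit of a two-coefficient model y = a0(u) + a1(u)x + e, and simulated data. All function names (`epanechnikov`, `fit_vc_at`, `cv_error`, `select_bandwidth`, `simulate`) are hypothetical, and the sketch covers mean regression only, not QR.

```python
import math
import random


def epanechnikov(t):
    """Epanechnikov kernel, zero outside [-1, 1]."""
    return 0.75 * (1.0 - t * t) if abs(t) <= 1.0 else 0.0


def fit_vc_at(u0, data, h):
    """Local-constant estimate of (a0(u0), a1(u0)) in y = a0(u) + a1(u)*x + e.

    Solves the 2x2 kernel-weighted least-squares normal equations.
    Returns None when the local design is singular (too few nearby points).
    """
    s0 = s1 = s2 = t0 = t1 = 0.0
    for u, x, y in data:
        w = epanechnikov((u - u0) / h)
        s0 += w
        s1 += w * x
        s2 += w * x * x
        t0 += w * y
        t1 += w * x * y
    det = s0 * s2 - s1 * s1
    if det <= 1e-12:
        return None
    a0 = (s2 * t0 - s1 * t1) / det
    a1 = (s0 * t1 - s1 * t0) / det
    return a0, a1


def cv_error(data, h, n_folds=5):
    """Average squared prediction error over held-out folds for bandwidth h."""
    total, count = 0.0, 0
    for k in range(n_folds):
        train = [d for i, d in enumerate(data) if i % n_folds != k]
        test = [d for i, d in enumerate(data) if i % n_folds == k]
        for u, x, y in test:
            coef = fit_vc_at(u, train, h)
            if coef is None:
                return math.inf  # bandwidth too small to fit at every test point
            total += (y - coef[0] - coef[1] * x) ** 2
            count += 1
    return total / count


def select_bandwidth(data, grid, n_folds=5):
    """Return the grid value with the smallest cross-validated error."""
    return min(grid, key=lambda h: cv_error(data, h, n_folds))


def simulate(n, seed=0):
    """Simulated data (an assumption for illustration, not the housing data)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        u = rng.random()
        x = rng.gauss(0.0, 1.0)
        y = math.sin(2 * math.pi * u) + (1.0 + u) * x + rng.gauss(0.0, 0.1)
        data.append((u, x, y))
    return data


data = simulate(200, seed=0)
grid = [0.05, 0.1, 0.2, 0.4, 0.8]
h_star = select_bandwidth(data, grid)
print("selected bandwidth:", h_star)
```

A bandwidth that leaves some test point without enough neighbours is given an infinite error, so it is never selected. In practice a local-linear fit and a finer grid or a numerical optimiser would usually replace the local-constant fit and the coarse grid used here.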
8. We have made the Matlab code for estimating the VC and VC-QR models used in our analysis available to readers. The program may be downloaded from http://personal.cb.cityu.edu.hk/msawan/research.htm