Abstract
Modeling and inference for heterogeneous data have attracted considerable interest recently due to rapid developments in personalized marketing. Most existing regression approaches are based on the conditional mean and may require additional cluster information to accommodate data heterogeneity. In this article, we propose a novel nonparametric resolution-wise regression procedure that provides an estimated distribution of the response instead of a single value. We achieve this by decomposing the information in the response and the predictors into resolutions and patterns, respectively, based on marginal binary expansions. The relationships between resolutions and patterns are modeled by penalized logistic regressions. Combining the resolution-wise predictions, we deliver a histogram of the conditional response that approximates its distribution. Moreover, we establish a sure independence screening property and the consistency of the proposed method for growing dimensions. Simulations and an application to a real estate valuation dataset further illustrate the effectiveness of the proposed method.
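To make the idea of resolutions concrete, the following is a minimal Python sketch of a marginal binary expansion of a response and of how resolution-wise probabilities could be combined into a histogram. It is not the authors' implementation (their code is in R, in the supplementary materials). All function names are hypothetical, the rank-based transform to (0, 1) is an assumption, and the chain-rule combination of per-resolution probabilities is one plausible reading of "combining the resolution-wise predictions". In the proposed method those probabilities would come from penalized logistic regressions on the predictor patterns; here they are supplied by a placeholder function.

```python
def empirical_cdf(values):
    """Map each value to rank / (n + 1), which lies strictly inside (0, 1).

    Assumption: a rank-based marginal transform; the paper may use another.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    u = [0.0] * n
    for rank, i in enumerate(order, start=1):
        u[i] = rank / (n + 1)
    return u


def binary_expansion(u, depth):
    """Return the first `depth` binary digits of u in (0, 1).

    Digit d is the d-th "resolution": u is approximated by
    sum(bits[d - 1] / 2**d for d in 1..depth).
    """
    bits = []
    for _ in range(depth):
        u *= 2
        bit = int(u >= 1.0)
        bits.append(bit)
        u -= bit
    return bits


def bin_index(bits):
    """Index of the dyadic bin (out of 2**len(bits)) encoded by the bits."""
    idx = 0
    for b in bits:
        idx = 2 * idx + b
    return idx


def histogram_from_bit_probs(cond_prob_one, depth):
    """Combine per-resolution probabilities into 2**depth bin probabilities.

    cond_prob_one(prefix) is a hypothetical stand-in for a fitted model:
    it returns P(next bit = 1 | earlier bits = prefix) for one observation.
    The bin probability is the product along the path (chain rule).
    """
    probs = []
    for k in range(2 ** depth):
        bits = [(k >> (depth - 1 - d)) & 1 for d in range(depth)]
        p = 1.0
        for d, b in enumerate(bits):
            p1 = cond_prob_one(tuple(bits[:d]))
            p *= p1 if b else 1.0 - p1
        probs.append(p)
    return probs
```

With `depth` resolutions the response scale is split into `2**depth` dyadic bins, so each additional resolution doubles the granularity of the resulting histogram. The bin probabilities returned by `histogram_from_bit_probs` sum to one by construction.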
Supplementary Materials
The supplementary materials for this article include additional simulation studies, additional proofs, and R code.
Disclosure Statement
The authors declare that they have no competing financial interests or personal relationships that could have appeared to influence the work reported in this article.
Funding
Acknowledgments
The authors would like to thank the editor, the associate editor, and two anonymous referees for their valuable comments and suggestions.