Abstract
In many countries, real estate appraisal is based on conventional methods that rely on appraisers’ ability to collect data, interpret it and model the price of a property. The growing use of online real estate platforms, and the large amount of information they contain, makes it possible to overcome many drawbacks of conventional pricing models, such as subjectivity, cost and unfairness. In this paper we propose a data-driven real estate pricing model based on machine learning methods that estimates prices while reducing human bias. We test the model on 178,865 listings of flats in Bogotá, collected from 2016 to 2020. Results show that the proposed state-of-the-art model is robust and accurate in estimating real estate prices. This case study serves as an incentive for local governments in developing countries to discuss and build real estate pricing models based on large data sets that increase fairness for all real estate market stakeholders and reduce price speculation.
Disclosure statement
No potential conflict of interest was reported by the authors.
Notes
1 Although this is not needed for decision-tree-based models such as the one used in this work, it helps to compare different variables on a common scale, especially for the price error estimation method described later in this section.
2 A weighted version can also be used: the distance along each individual feature can be multiplied by a weight for that feature, for instance the feature importance computed with XGBoost, or SHAP feature importances (see end of Section 3).
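As an illustration of the weighted version described in note 2, the following is a minimal sketch rather than the implementation used in this work. It assumes a Euclidean-style aggregation of the per-feature distances, which the note does not specify. The function name `weighted_distance`, the example vectors and the example weights are hypothetical; in practice the weights could be feature importances obtained from XGBoost or SHAP.

```python
import math
from typing import Sequence


def weighted_distance(
    a: Sequence[float], b: Sequence[float], weights: Sequence[float]
) -> float:
    """Distance between two feature vectors with per-feature weights.

    Assumption: per-feature differences are scaled by their weight and
    then combined with a Euclidean norm. Other aggregations are possible.
    """
    if not (len(a) == len(b) == len(weights)):
        raise ValueError("a, b and weights must have the same length")
    # Multiply each feature's difference by its weight, then aggregate.
    return math.sqrt(sum((w * (x - y)) ** 2 for x, y, w in zip(a, b, weights)))


# Hypothetical normalised feature vectors for two listings.
listing_a = [0.2, 0.5, 0.9]
listing_b = [0.4, 0.5, 0.1]
# Hypothetical weights, standing in for feature importances.
importances = [0.5, 0.3, 0.2]

d = weighted_distance(listing_a, listing_b, importances)
```

With all weights equal to 1 the function reduces to the ordinary Euclidean distance, so the unweighted case is recovered as a special case.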