ABSTRACT
The academic community has discussed the use of Automated Valuation Models (AVMs) in traditional real estate valuation, and their performance, for several decades. Most studies focus on finding the best method for estimating property values. One aspect that has yet to be studied scientifically is the appropriate choice of the spatial training level. Published research on AVMs usually deals with a manually defined region and does not test the methods used at different spatial levels. Our research investigates how training AVM algorithms at different spatial levels affects valuation accuracy. We use a dataset of 1.2 million residential properties from Germany and test four methods: Ordinary Least Squares, Generalised Additive Models, eXtreme Gradient Boosting and Deep Neural Networks. Our results show that the choice of spatial training level can significantly affect model performance, and that this effect varies across the methods.
Disclosure statement
No potential conflict of interest was reported by the authors.
Data availability statement
Due to the nature of this research, participants of this study did not agree to their data being shared publicly, so supporting data are not available.
Notes
1. To avoid a structural break within the dataset, the data should ideally come from one source or have been collected according to the same criteria.
2. The models could not be analysed at an even smaller spatial level because of data availability.
3. The Top-7 are the most important cities in Germany, namely Berlin, Munich, Hamburg, Frankfurt, Cologne, Dusseldorf and Stuttgart. Their importance is based on their market size and market activity. They can be seen as the most liquid and dynamic real estate markets in Germany.
4. Table A1 in Appendix I explains the individual variables.
5. The correlation matrix is available on request.
6. Applies if the property is both partly owner-occupied and partly non-owner-occupied (e.g. single-family home with an attached rental unit).
7. The assessment of the two variables, ‘condition’ and ‘quality grade’, was performed by professional appraisers during the property inspection process.
8. Acxiom is an American provider of international macroeconomic and microeconomic data. Further information can be found at: https://www.acxiom.com/.
9. Further information about the NUTS nomenclature can be found at https://ec.europa.eu/eurostat/web/nuts/background.
10. The train/test split was selected to include as much data as possible in the training set, as some NUTS-3 regions have limited data. An alternative split would produce inconsistent results for these regions. Our study focuses on a nationwide comparison rather than on individual metropolitan regions, which typically have a large and dense amount of data. This approach is consistent with other studies that compare algorithms on a national scale, such as Stang et al. (2022).
11. However, unlike in our study, these authors only work at a county level and only vary the data available within the county. In our case, the amount of data is varied by adding observations from other spatial levels.
Additional information
Notes on contributors
Bastian Krämer
Bastian Krämer is a doctoral student and project team member in the Real Estate Management group at the International Real Estate Business School (IRE|BS) of the University of Regensburg, Germany. Bastian holds a master’s degree in mathematics from the Technical University of Munich, Germany. His research focuses on applications of machine learning in the field of real estate.
Moritz Stang
Moritz Stang is a doctoral student and project team member in the Real Estate Management group at the International Real Estate Business School (IRE|BS) of the University of Regensburg, Germany. Moritz holds a master’s degree in Real Estate Management from the Real Estate Business School (IRE|BS) of the University of Regensburg. His research focuses on Automated Valuation Models (AVMs) and the use of Big Data solutions in the real estate industry.
Vanja Doskoč
Vanja Doskoč is a doctoral student and project team member in the Algorithm Engineering group of the Hasso Plattner Institute (HPI). Vanja obtained a master’s degree in technical mathematics from the TU Wien (Vienna University of Technology), Austria. His research focuses on understanding behaviourally correct language learning under various restrictions, as well as applications of deep learning and evolutionary algorithms to different real-world problems.
Wolfgang Schäfers
Wolfgang Schäfers is professor and Chair of the Department of Real Estate Management at the International Real Estate Business School (IRE|BS) of the University of Regensburg. His research focuses on real estate valuation, applications of machine learning and sentiment analysis.
Tobias Friedrich
Tobias Friedrich is a professor at the University of Potsdam and the head of the Algorithm Engineering group of the Hasso Plattner Institute (HPI). His research interests lie in algorithm engineering, probabilistic methods, artificial intelligence, data science, network science, distributed algorithms and graph algorithms.