Modern approaches to building interpretable models of the property market using machine learning on the base of mass cadastral valuation
This work addresses the challenge of creating interpretable property valuation models for practical applications like legal matters, but it is incremental as it applies existing methods to a specific domain.
The paper tackles the problem of building interpretable models for property markets using machine learning on noisy real-world data from the Primorye region, Russia, showing that combining linear regression with geostatistics for land parcels and RuleFit for flats yields effective models despite interpretability constraints.
In this article, we review modern approaches to building interpretable models of property markets using machine learning, based on the mass valuation of property in the Primorye region, Russia. A researcher lacking expertise in this topic encounters numerous difficulties when trying to build a good model. The main source of these difficulties is the large gap between noisy real market data and the idealized data common in machine learning tutorials. This paper covers all stages of modeling: the collection of initial data, the identification of outliers, the search for and analysis of patterns in the data, the formation and final choice of price factors, the building of the model, and the evaluation of its efficiency. For each stage, we highlight potential issues and describe sound methods for overcoming them, using actual examples. We show that combining classical linear regression with the interpolation methods of geostatistics makes it possible to build an effective model for land parcels. For flats, where many objects are attributed to one spatial point, applying geostatistical methods is difficult. We therefore suggest linear regression with automatic generation and selection of additional rules based on decision trees, the so-called RuleFit method. Thus we show that, despite such a strong restriction as the requirement of interpretability, which is important in practice (for example, in legal matters), it is still possible to build effective models of real property markets.
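As a rough illustration of the land-parcel approach described above, the following Python sketch fits an ordinary least-squares model and then corrects its predictions by spatially interpolating the residuals. Everything in it is an assumption made for illustration: the data are synthetic, the function names (`fit_ols`, `predict_ols`, `idw_interpolate`) are hypothetical, and nearest-neighbour inverse-distance weighting stands in for the geostatistical interpolation (for example, kriging) that the abstract refers to. It is not the authors' implementation.

```python
import numpy as np


def fit_ols(X, y):
    """Least-squares fit with an intercept; returns [intercept, slopes...]."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta


def predict_ols(beta, X):
    """Apply the fitted linear model to a feature matrix."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ beta


def idw_interpolate(coords_known, values_known, coords_query,
                    k=8, power=2.0, eps=1e-12):
    """Inverse-distance-weighted average over the k nearest known points.

    Used here as a simple stand-in for geostatistical interpolation.
    """
    # Pairwise distances, shape (n_query, n_known).
    d = np.linalg.norm(
        coords_query[:, None, :] - coords_known[None, :, :], axis=2
    )
    # Indices of the k nearest known points for each query point.
    idx = np.argpartition(d, k - 1, axis=1)[:, :k]
    d_k = np.take_along_axis(d, idx, axis=1)
    w = 1.0 / (d_k ** power + eps)
    return (w * values_known[idx]).sum(axis=1) / w.sum(axis=1)


def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))


# Synthetic parcels (assumed data): log-price depends on log-area plus a
# smooth location effect that the regression features do not capture.
rng = np.random.default_rng(0)
n = 300
coords = rng.uniform(0.0, 10.0, size=(n, 2))
area = rng.uniform(500.0, 2000.0, size=n)
location_effect = 2.0 * np.sin(coords[:, 0] / 2.0) + np.cos(coords[:, 1] / 3.0)
log_price = (3.0 + 0.8 * np.log(area) + location_effect
             + rng.normal(0.0, 0.1, size=n))

X = np.log(area).reshape(-1, 1)
train = np.arange(n) < 240   # first 240 parcels for fitting
test = ~train                # remaining 60 parcels for evaluation

# Step 1: interpretable linear model on the price factors.
beta = fit_ols(X[train], log_price[train])
residuals = log_price[train] - predict_ols(beta, X[train])

# Step 2: interpolate the training residuals at the test locations.
pred_ols = predict_ols(beta, X[test])
pred_combined = pred_ols + idw_interpolate(coords[train], residuals, coords[test])

rmse_ols = rmse(log_price[test], pred_ols)
rmse_combined = rmse(log_price[test], pred_combined)
print(f"RMSE, regression only:          {rmse_ols:.3f}")
print(f"RMSE, regression + interpolation: {rmse_combined:.3f}")
```

In this sketch the regression coefficients stay directly readable, while the interpolated residual surface acts as a location correction that can be inspected separately, which is one way the two components can remain interpretable.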