BALZARINI MONICA GRACIELA
Conferences and scientific meetings
Title:
Mass appraisal of urban and rural land values using random forest with spatial restriction
Author(s):
CÓRDOBA, M.; MONZANI, F.; CARRANZA, J.P.; PIUMETTO, M.; BALZARINI, M.
Meeting:
Conference; 30th International Biometric Conference; 2020
Abstract:
Advances in computational software and machine learning practice have facilitated wider uptake of mass appraisal methodologies for modelling and predicting land values. Because property characteristics are geographically distributed, accounting for spatial autocorrelation could improve models that explain property prices. Different types of Random Forest (RF) models, the classical one and quantile RF (QRF), have been recognized as machine learning techniques for real estate mass appraisal. However, a major drawback of these methods is that they ignore the influence of neighboring observations when predicting property prices. To overcome this disadvantage, the random forest plus kriging of residuals (RFKO) method can be used. First, an RF of land values on predictive ancillary variables is fitted to model the trend component. Second, ordinary kriging is applied to the RF residuals to produce a spatial prediction of the residuals. The final prediction is the sum of both steps. The aim of this study was to compare the performance of RF and QRF, both with and without spatial restriction, in the prediction of rural and urban land values. We used two datasets of 3718 and 264 market observations, released between 2017 and 2018. The first contains rural land values for the whole Province of Córdoba, Argentina, and the second contains data from a village (Villa María) in the Province of Córdoba. A 10-fold cross-validation was used to estimate prediction errors for each model. The root mean square prediction error (RMSE) was expressed as a percentage of the mean yield. Additionally, we fitted an empirical and a theoretical semivariogram to characterize the Relative Structured Variability (RSV, ratio of nugget and sill variance) of the residuals from the compared methods.
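The two-step RFKO procedure described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes scikit-learn and NumPy, synthetic data, an exponential semivariogram whose parameters (`nugget`, `psill`, `range_`) are supplied rather than fitted, and out-of-bag RF predictions as the source of the residuals; the abstract does not specify any of these choices, and all function names are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def exp_semivariogram(h, nugget, psill, range_):
    """Exponential semivariogram (assumed model); gamma(0) = 0 by convention."""
    h = np.asarray(h, dtype=float)
    gamma = nugget + psill * (1.0 - np.exp(-3.0 * h / range_))
    return np.where(h > 0, gamma, 0.0)


def ordinary_kriging(xy_obs, z_obs, xy_new, nugget, psill, range_):
    """Ordinary kriging using all observations (no search neighbourhood)."""
    n = len(z_obs)
    d_obs = np.linalg.norm(xy_obs[:, None, :] - xy_obs[None, :, :], axis=2)
    # Kriging system; the extra row/column is the Lagrange multiplier
    # that forces the weights to sum to one.
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = exp_semivariogram(d_obs, nugget, psill, range_)
    A[n, n] = 0.0
    d_new = np.linalg.norm(xy_new[:, None, :] - xy_obs[None, :, :], axis=2)
    B = np.ones((n + 1, len(xy_new)))
    B[:n, :] = exp_semivariogram(d_new, nugget, psill, range_).T
    weights = np.linalg.solve(A, B)[:n, :]  # one column of weights per new site
    return weights.T @ z_obs


def rfko_predict(X, xy, y, X_new, xy_new, nugget, psill, range_, seed=0):
    """Step 1: RF trend. Step 2: kriged RF residuals. Result: their sum."""
    rf = RandomForestRegressor(n_estimators=200, oob_score=True, random_state=seed)
    rf.fit(X, y)
    # Assumption: out-of-bag residuals, since in-sample RF residuals
    # tend to be optimistically small.
    resid = y - rf.oob_prediction_
    trend = rf.predict(X_new)
    return trend + ordinary_kriging(xy, resid, xy_new, nugget, psill, range_)


# Synthetic illustration: 3 ancillary covariates plus a smooth spatial signal.
gen = np.random.default_rng(0)
n = 120
xy = gen.uniform(0, 10, size=(n, 2))
X = gen.normal(size=(n, 3))
y = 2.0 * X[:, 0] + np.sin(xy[:, 0]) + np.cos(xy[:, 1]) + gen.normal(scale=0.1, size=n)

pred = rfko_predict(X[:100], xy[:100], y[:100], X[100:], xy[100:],
                    nugget=0.05, psill=0.5, range_=3.0)
```

In practice the semivariogram parameters would be estimated from the empirical semivariogram of the residuals, and a local search neighbourhood would usually replace the global kriging system for datasets of several thousand points.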
The results showed that only for urban land did the methods that incorporate spatial information perform better: RMSE of 30% vs. 34% for RF and 33% vs. 34% for QRF, with and without kriging of the residuals, respectively.
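As a rough illustration of how a relative RMSE such as those reported above can be obtained from 10-fold cross-validation, the following sketch pools the held-out predictions over all folds and divides the RMSE by the mean of the observed response. Whether the authors pooled predictions or averaged per-fold errors is not stated, so this is an assumption; the helper names and the mean-only baseline model are hypothetical.

```python
import numpy as np


def relative_rmse(y_true, y_pred):
    """RMSE expressed as a percentage of the mean observed value."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    return 100.0 * rmse / np.mean(y_true)


def cv_relative_rmse(fit_predict, X, y, k=10, seed=0):
    """k-fold CV: every observation is predicted once while held out."""
    y = np.asarray(y, dtype=float)
    order = np.random.default_rng(seed).permutation(len(y))
    pred = np.empty_like(y)
    for test in np.array_split(order, k):
        train = np.setdiff1d(order, test)
        pred[test] = fit_predict(X[train], y[train], X[test])
    return relative_rmse(y, pred)


def mean_baseline(X_train, y_train, X_test):
    """Hypothetical baseline: predict the training mean everywhere."""
    return np.full(len(X_test), np.mean(y_train))


# Synthetic illustration with positive "land values".
gen = np.random.default_rng(1)
X_demo = gen.normal(size=(200, 2))
y_demo = 100.0 + 10.0 * X_demo[:, 0] + gen.normal(scale=5.0, size=200)
baseline_error = cv_relative_rmse(mean_baseline, X_demo, y_demo)
```

Any model that exposes the same `fit_predict(X_train, y_train, X_test)` shape, with or without a kriging step, could be passed in place of `mean_baseline` to compare methods on equal folds.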