A big data-based geographically weighted regression model for public housing prices: A case study in Singapore
Abstract
This research develops three hedonic pricing models to investigate the spatial variation of Housing and Development Board (HDB) public housing resale prices in Singapore: an ordinary least squares (OLS) model, a Euclidean distance–based (ED-based) geographically weighted regression (GWR) model, and a travel time–based GWR model supported by a big data set of millions of smartcard transactions. The results help identify factors that could significantly affect public housing resale prices, including the age and floor area of the housing units and the distances to the nearest park, the central business district (CBD), and the nearest Mass Rapid Transit (MRT) station. A comparison of the three models shows that the two GWR models perform much better than the traditional linear hedonic regression model when calibrated with identical variables and data. Furthermore, the travel time–based GWR model fits the data better than the ED-based GWR model in the case study. This study demonstrates the potential value of the big data–based GWR model in housing research; the model could also be applied to other research fields such as public health and criminal justice.
Key Words: big data, GWR, Housing and Development Board (HDB), hedonic pricing model, Singapore.
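To make the distinction between the two GWR variants concrete, the following is a minimal sketch, not the authors' implementation. It assumes a Gaussian kernel with a fixed bandwidth and synthetic data; the function name `gwr_fit` and all variable names are hypothetical. The key point is that the weights depend only on a precomputed distance matrix, so a Euclidean distance matrix and a travel-time matrix (for example, one derived from smartcard records) can be swapped without changing the estimator.

```python
import numpy as np

def gwr_fit(X, y, dist, bandwidth):
    """Fit one weighted least-squares regression per observation.

    X: (n, p) predictors without an intercept column.
    y: (n,) response, e.g. resale price.
    dist: (n, n) matrix of distances between observations; any metric
          works (Euclidean distance, travel time, ...).
    bandwidth: Gaussian kernel bandwidth, in the same units as dist.
    Returns an (n, p + 1) array of local coefficients, intercept first.
    """
    n = X.shape[0]
    Xd = np.column_stack([np.ones(n), X])  # add intercept column
    betas = np.empty((n, Xd.shape[1]))
    for i in range(n):
        # Gaussian kernel: nearby observations get weights close to 1.
        w = np.exp(-0.5 * (dist[i] / bandwidth) ** 2)
        sw = np.sqrt(w)
        # Weighted least squares via ordinary least squares on scaled rows.
        coef = np.linalg.lstsq(Xd * sw[:, None], y * sw, rcond=None)[0]
        betas[i] = coef
    return betas

# Synthetic illustration (assumed data, not the HDB data set).
rng = np.random.default_rng(0)
n = 50
coords = rng.uniform(0, 10, size=(n, 2))   # hypothetical unit locations
X = rng.normal(size=(n, 2))                # e.g. standardized age, floor area
y = 1.0 + X @ np.array([2.0, -1.0]) + 0.1 * rng.normal(size=n)

# ED-based variant: pairwise Euclidean distances between locations.
euclid = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
betas = gwr_fit(X, y, euclid, bandwidth=3.0)

# A travel time-based variant would pass an (n, n) travel-time matrix
# in place of `euclid`, with the bandwidth expressed in time units.
```

As the bandwidth grows, all weights approach 1 and every local fit converges to the global OLS estimate, which is one way to see the OLS model as a limiting case of GWR.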