Real Estate Dictionaries Across Space and Time
Abstract
Using high-dimensional variable selection methods, we show that the textual information in real estate agents' remarks about a property can be used to address spatial and temporal heterogeneity in housing markets. Including the textual information in the pricing model decreases in-sample prediction errors by as much as 18.7% at the metropolitan statistical area (MSA) level and 39.1% at the ZIP code level. These results are robust to transforming the raw text using a real estate-specific word list, the choice of n-grams, word stemming, and heteroscedasticity in the hedonic and repeat-sales models. These findings suggest that the raw text in the remarks can be included directly in predictive pricing models.
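As a rough illustration of the idea, the sketch below turns agent remarks into unigram indicator variables and applies an L1-penalized (lasso) regression of log price on those indicators. The abstract does not name the estimator, so the lasso is only one possible high-dimensional variable selection method, assumed here for concreteness. The remarks, the log prices, the penalty value, and the function names (`tokenize`, `build_design`, `soft_threshold`, `lasso`) are all hypothetical and are not taken from the paper.

```python
import re


def tokenize(remark):
    """Lowercase a remark and split it into unigram tokens (letters only)."""
    return re.findall(r"[a-z]+", remark.lower())


def build_design(remarks):
    """Return (vocab, X), where X[i][j] is 1.0 if token j appears in remark i."""
    vocab = sorted({tok for r in remarks for tok in tokenize(r)})
    index = {tok: j for j, tok in enumerate(vocab)}
    X = [[0.0] * len(vocab) for _ in remarks]
    for i, r in enumerate(remarks):
        for tok in set(tokenize(r)):
            X[i][index[tok]] = 1.0
    return vocab, X


def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso(X, y, lam, n_iter=200):
    """Coordinate descent for (1/2n) * ||y - b0 - X b||^2 + lam * ||b||_1.

    The intercept b0 is not penalized. Returns (b0, b).
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    intercept = sum(y) / n
    resid = [y[i] - intercept for i in range(n)]
    for _ in range(n_iter):
        # Re-centre the residuals so the intercept absorbs their mean.
        shift = sum(resid) / n
        intercept += shift
        resid = [r - shift for r in resid]
        for j in range(p):
            col_sq = sum(X[i][j] ** 2 for i in range(n)) / n
            if col_sq == 0.0:
                continue
            # Correlation of column j with the partial residual (beta_j removed).
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new_beta = soft_threshold(rho, lam) / col_sq
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new_beta
    return intercept, beta


# Hypothetical toy data: invented remarks and log sale prices, for illustration only.
remarks = [
    "granite counters and hardwood floors",
    "granite kitchen with pool",
    "fixer upper sold as is",
    "investor special fixer",
    "hardwood floors throughout",
    "sold as is needs work",
]
log_prices = [12.9, 13.0, 11.8, 11.7, 12.5, 11.9]

vocab, X = build_design(remarks)
intercept, beta = lasso(X, log_prices, lam=0.05)  # penalty chosen arbitrarily

# Tokens whose coefficients survive the penalty form the selected "dictionary".
selected = {tok: round(b, 3) for tok, b in zip(vocab, beta) if b != 0.0}
print(selected)
```

In practice the penalty would be chosen by cross-validation or an information criterion, and the token indicators would enter alongside the usual hedonic controls (for example location and time fixed effects) rather than on their own.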