利用機器學習和隨機過程預測房價:以台灣房市為例
Predicting Housing Prices via Machine Learning and Stochastic Processes: A Case Study of the Taiwan Property Market
本文介紹兩個預測房價的方式:長短期記憶模型,以及隨機森林結合隨機微分方程。使用隨機微分方程的主要原因,是為了捕捉近期的房市交易人情緒是否過熱或過冷,進而影響近期未來房市,並依此設計相關的指標。長短期記憶模型的預測方式,主要利用具有時間序列性質的資料,直接進行預測;而隨機森林則結合隨機微分方程,以上述指標進行分類。利用台灣房市的實證資料,本文發現兩個預測方式,皆有不錯的表現;而隨機森林結合隨機微分方程的方式,在本文測試方式下優於長短期記憶模型,此代表房市近期情緒對於預測房價可能具有一定效果。
關鍵詞:房價、長短期記憶、隨機森林、情緒、隨機微分方程
This study introduces two methods for predicting housing prices: a long short-term memory (LSTM) model, and a random forest combined with stochastic differential equations. Stochastic differential equations are used mainly to capture whether the recent sentiment of property market participants is overheated or overcooled, which in turn affects the market in the near future, and to design indicators based on this information. The LSTM model makes direct predictions from time-series data, while the random forest is combined with stochastic differential equations and performs classification using the aforementioned indicators. Using empirical data from the Taiwan property market, we find that both methods perform well. Under the testing scheme used in this article, however, the random forest combined with stochastic differential equations outperforms the LSTM model, which suggests that recent market sentiment may have some effect on predicting housing prices.
Keywords: Housing Price, Long Short-Term Memory, Random Forest, Sentiment, Stochastic Differential Equation
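
To make the idea of an SDE-based sentiment indicator more concrete, the following is a minimal sketch and not the indicator used in this study. It assumes, purely for illustration, that a housing price index follows geometric Brownian motion, dS = μS dt + σS dW, so that log growth over a horizon h is normally distributed with mean (μ − σ²/2)h and variance σ²h. The function names `gbm_estimates`, `sentiment_indicator`, and `label`, the window length, and the threshold are all hypothetical choices.

```python
import math
import statistics


def gbm_estimates(prices, dt=1.0):
    """Estimate GBM drift (mu) and volatility (sigma) from a price series."""
    if len(prices) < 3:
        raise ValueError("need at least three prices")
    log_returns = [math.log(b / a) for a, b in zip(prices, prices[1:])]
    mean_r = statistics.fmean(log_returns)
    var_r = statistics.variance(log_returns)  # sample variance
    sigma = math.sqrt(var_r / dt)
    mu = mean_r / dt + 0.5 * sigma ** 2
    return mu, sigma


def sentiment_indicator(prices, window=6, dt=1.0):
    """Hypothetical indicator: standardized gap between the log growth
    realized over the last `window` steps and the growth that the fitted
    GBM would expect over the same horizon."""
    if window + 1 > len(prices):
        raise ValueError("window is longer than the price history")
    mu, sigma = gbm_estimates(prices, dt)
    if sigma == 0.0:
        return 0.0
    horizon = window * dt
    realized = math.log(prices[-1] / prices[-(window + 1)])
    expected = (mu - 0.5 * sigma ** 2) * horizon
    return (realized - expected) / (sigma * math.sqrt(horizon))


def label(z, threshold=1.0):
    """Map the indicator to a coarse sentiment class (assumed threshold)."""
    if z > threshold:
        return "overheated"
    if z < -threshold:
        return "overcooled"
    return "neutral"
```

Under these assumptions, a large positive value would flag a market that has recently grown faster than its fitted dynamics predict, and a large negative value the opposite. Values or labels of this kind could then serve as input features for a classifier such as a random forest.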