Development of an Entropy-Based Feature Selection Method and Analysis of Online Reviews on Real Estate
This work addresses the challenge of processing large-scale online reviews for real estate analysis, but it is incremental as it applies a known entropy method to a specific domain.
The study tackled the problem of analyzing user needs in real estate by developing an entropy-based feature selection method to extract keywords from 6 million posts on a Japanese BBS, achieving a 0.69 F-measure and identifying key concerns such as apartment facilities, access, and price.
In recent years, the amount of data posted about real estate on the Internet has been increasing. In this study, to analyze user needs for real estate, we focus on "Mansion Community", a Japanese bulletin board system (hereinafter referred to as BBS) about Japanese real estate. We extract keywords based on the entropy value calculated for each word and use them as features in a machine learning classifier to analyze 6 million posts on "Mansion Community". As a result, we achieved an F-measure of 0.69 and found that customers are particularly concerned about apartment facilities, access, and price.
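The abstract does not state the exact entropy formula, so the following is only a minimal sketch of one common way to do entropy-based keyword selection. It assumes that each post carries a category label and that a word's entropy is the Shannon entropy of its distribution over those labels, so that low-entropy words (concentrated in one category) are kept as features. The function names, the thresholds `max_entropy` and `min_count`, and the toy data are all hypothetical and are not taken from the study.

```python
import math
from collections import Counter, defaultdict


def word_entropy(docs):
    """Return {word: entropy in bits} over the labels of the posts containing it.

    docs is an iterable of (tokens, label) pairs. Assumption: entropy is
    computed on document frequency per label, not on raw term frequency.
    """
    counts = defaultdict(Counter)
    for tokens, label in docs:
        for word in set(tokens):  # count each word once per post
            counts[word][label] += 1

    entropy = {}
    for word, by_label in counts.items():
        total = sum(by_label.values())
        h = 0.0
        for c in by_label.values():
            p = c / total
            h -= p * math.log2(p)
        entropy[word] = h
    return entropy


def select_keywords(docs, max_entropy=0.5, min_count=2):
    """Keep words that occur in at least min_count posts and whose
    label entropy is at most max_entropy (both thresholds are illustrative)."""
    docs = list(docs)
    doc_freq = Counter(w for tokens, _ in docs for w in set(tokens))
    entropy = word_entropy(docs)
    return sorted(
        w for w, h in entropy.items()
        if doc_freq[w] >= min_count and h <= max_entropy
    )


# Toy, hand-made posts (already tokenized); not data from the study.
sample_docs = [
    (["station", "walk", "price"], "access"),
    (["station", "bus"], "access"),
    (["price", "loan"], "price"),
    (["loan", "rate"], "price"),
]
keywords = select_keywords(sample_docs)
```

In this toy data, "price" appears equally in both categories and so has the maximum entropy of 1 bit, while "station" and "loan" each appear in a single category and are retained. The selected words could then be used as a vocabulary for a bag-of-words classifier; which classifier the study used is not stated here.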