Predicting health inspection results from online restaurant reviews
This work addresses public health monitoring by automating prediction from user-generated data, but its contribution is incremental: it applies existing methods to a new domain.
The paper predicts official restaurant health inspection results from Yelp reviews using linguistic features, reporting over 90% accuracy with support vector machines.
Informatics around public health is increasingly shifting from the professional to the public sphere. In this work, we apply linguistic analytics to restaurant reviews from Yelp in order to automatically predict official health inspection reports. We consider two types of feature sets, keyword detection and topic model features, and use these in several classification methods. Our empirical analysis shows that these extracted features can predict public health inspection reports with over 90% accuracy using simple support vector machines.
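As a rough illustration of what keyword detection features might look like, the sketch below turns each review into a vector of keyword counts. The keyword list, the function names, and the sample reviews are all hypothetical; the abstract does not specify which keywords or preprocessing the authors used. It is a minimal sketch under those assumptions, not the paper's pipeline.

```python
import re
from collections import Counter

# Hypothetical keyword list for illustration only; the paper's actual
# keywords are not stated in the abstract.
HYGIENE_KEYWORDS = ("dirty", "sick", "roach", "stale", "clean")


def tokenize(review):
    """Lowercase a review and split it into word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", review.lower())


def keyword_features(review, keywords=HYGIENE_KEYWORDS):
    """Return one count per keyword, in the order the keywords are listed."""
    counts = Counter(tokenize(review))
    return [counts[k] for k in keywords]


# Made-up sample reviews, not data from the paper.
reviews = [
    "The table was dirty and I felt sick afterwards. Dirty floors too.",
    "Spotless, clean kitchen and friendly staff.",
]
features = [keyword_features(r) for r in reviews]
print(features)
```

Vectors like these could then be passed, alongside topic model features, to a classifier such as a support vector machine. Fixing the keyword order keeps every review's vector aligned to the same feature columns.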