CLAIIR · Jan 12, 2015

Autodetection and Classification of Hidden Cultural City Districts from Yelp Reviews

arXiv:1501.02527v1 · 3 citations
Originality: Synthesis-oriented
AI Analysis

This work identifies and visualizes cultural areas in cities for urban planners and tourists. Its contribution is incremental: it applies existing methods (LDA topic modeling, K-means clustering, and a probabilistic mixture model) to a new dataset.

The study used topic modeling and clustering methods on Yelp reviews to classify restaurants and uncover hidden cultural districts in cities, resulting in a map display and a topic similarity heatmap for new restaurants.

Topic models are a way to discover underlying themes in an otherwise unstructured collection of documents. In this study, we use the Latent Dirichlet Allocation (LDA) topic model on a dataset of Yelp reviews to classify restaurants based on their reviews. Furthermore, we hypothesize that within a city, restaurants can be grouped into similar "clusters" based on both location and topic similarity. We use several clustering methods, including K-means clustering and a probabilistic mixture model, to uncover and classify districts within a city, both well-known and hidden (i.e., cultural areas like Chinatown or hearsay like "the best street for Italian restaurants"). We use these models to display and label the clusters on a map. We also introduce a topic similarity heatmap that displays how similar each part of a city is to a new restaurant.
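The clustering idea in the abstract can be illustrated with a minimal sketch. It assumes each restaurant already has planar coordinates and a vector of topic proportions (for example, produced by an LDA model), and it runs plain K-means on the concatenated features. The field names `x`, `y`, and `topics`, the weights `loc_weight` and `topic_weight`, the farthest-first initialisation, and the toy data are all assumptions made for illustration; the paper's actual feature construction and settings are not given here.

```python
def joint_features(restaurants, loc_weight=1.0, topic_weight=1.0):
    """Concatenate weighted coordinates with weighted topic proportions."""
    return [
        [loc_weight * r["x"], loc_weight * r["y"]]
        + [topic_weight * t for t in r["topics"]]
        for r in restaurants
    ]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def kmeans(points, k, max_iters=100):
    """Lloyd's algorithm; returns (labels, centroids)."""
    # Farthest-first initialisation: deterministic, chosen for this sketch only.
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(far))

    labels = None
    for _ in range(max_iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the centroids are stable
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical toy data: two spatial groups with different dominant topics.
restaurants = [
    {"name": "A", "x": 0.0, "y": 0.0, "topics": [0.90, 0.10]},
    {"name": "B", "x": 0.2, "y": 0.1, "topics": [0.80, 0.20]},
    {"name": "C", "x": 0.1, "y": 0.3, "topics": [0.85, 0.15]},
    {"name": "D", "x": 5.0, "y": 5.0, "topics": [0.10, 0.90]},
    {"name": "E", "x": 5.2, "y": 4.9, "topics": [0.20, 0.80]},
    {"name": "F", "x": 4.9, "y": 5.1, "topics": [0.15, 0.85]},
]

labels, centroids = kmeans(joint_features(restaurants), k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Changing `loc_weight` relative to `topic_weight` shifts the result between purely geographic districts and purely thematic groupings. A probabilistic mixture model would replace the hard assignments above with per-cluster membership probabilities.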

