The Shape of Consumer Behavior: A Symbolic and Topological Analysis of Time Series
This work addresses the challenge of clustering noisy time-series data for real-time marketing and trend forecasting and offers practical guidance for consumer analytics. Its contribution is incremental, however, because it compares existing methods on a single, specific dataset.
This study tackled the problem of clustering high-dimensional, noisy Google Trends time-series data for consumer behavior analysis by evaluating three unsupervised methods: SAX, eSAX, and TDA. The results showed that TDA, using persistent homology, achieved more balanced and meaningful groupings than SAX and eSAX, which struggled with volatility and complexity.
Understanding temporal patterns in online search behavior is crucial for real-time marketing and trend forecasting. Google Trends offers a rich proxy for public interest, yet the high dimensionality and noise of its time-series data present challenges for effective clustering. This study evaluates three unsupervised clustering approaches, Symbolic Aggregate approXimation (SAX), enhanced SAX (eSAX), and Topological Data Analysis (TDA), applied to 20 Google Trends keywords representing major consumer categories. Our results show that while SAX and eSAX offer fast and interpretable clustering for stable time series, they struggle with volatility and complexity, often producing ambiguous ``catch-all'' clusters. TDA, by contrast, captures global structural features through persistent homology and achieves more balanced and meaningful groupings. We conclude with practical guidance for using symbolic and topological methods in consumer analytics and suggest that hybrid approaches combining both perspectives hold strong potential for future applications.
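To make the symbolic side of the comparison concrete, the following is a minimal sketch of the standard SAX transform (z-normalization, Piecewise Aggregate Approximation, then discretization against equiprobable Gaussian breakpoints). It is an illustration only, not the study's pipeline: the function name `sax_word`, the segment count, and the four-letter alphabet are assumptions chosen for the example, and eSAX and the TDA step are not shown.

```python
from statistics import NormalDist, mean, pstdev


def sax_word(series, n_segments=4, alphabet="abcd"):
    """Convert a numeric series to a SAX word (illustrative sketch).

    Assumes len(series) >= n_segments so that every segment is non-empty.
    """
    # Step 1: z-normalize; a constant series is mapped to all zeros.
    mu, sigma = mean(series), pstdev(series)
    z = [(x - mu) / sigma if sigma > 0 else 0.0 for x in series]

    # Step 2: Piecewise Aggregate Approximation, i.e. the mean of each
    # of n_segments contiguous, roughly equal-length chunks.
    n = len(z)
    paa = []
    for i in range(n_segments):
        start = i * n // n_segments
        end = (i + 1) * n // n_segments
        paa.append(mean(z[start:end]))

    # Step 3: breakpoints that split the standard normal distribution
    # into len(alphabet) equiprobable regions.
    k = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(j / k) for j in range(1, k)]

    # Step 4: each PAA value gets the symbol of the region it falls in
    # (the number of breakpoints it exceeds is the symbol index).
    return "".join(alphabet[sum(v > b for b in breakpoints)] for v in paa)


# A steadily rising series walks through the alphabet in order.
print(sax_word([1, 2, 3, 4, 5, 6, 7, 8]))
```

Series that map to the same or similar words can then be grouped. Because every window is reduced to a single mean, sharp spikes inside a window are averaged away, which is consistent with the difficulty on volatile series noted above.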