SI, AI, CL | Dec 26, 2020

Social media data reveals signal for public consumer perceptions

arXiv:2012.13675v1 | 2 citations
AI Analysis

This work provides a more robust method for economists and market researchers to estimate consumer confidence using social media, potentially reducing the need for frequent and costly surveys.

This paper tackles the problem of reliably estimating the Consumer Confidence Index (CCI) using social media data, which previous methods struggled with when tested on newer data. The authors propose a non-parametric Bayesian modeling framework and demonstrate that their model can reliably estimate both monthly and daily CCI several months in advance using decadal Reddit data (2008-2019), outperforming existing methods.

Researchers have used social media data to estimate various macroeconomic indicators of public behavior, mostly as a way to reduce surveying costs. One of the most widely cited economic indicators is the consumer confidence index (CCI). Numerous past studies have focused on using social media, especially Twitter data, to predict CCI. However, according to a recent comprehensive survey, the strong correlations disappeared when those models were tested with newer data. In this work, we revisit the problem of assessing the true potential of social media data for measuring CCI by proposing a robust non-parametric Bayesian modeling framework grounded in Gaussian Process Regression (which provides both an estimate and an associated uncertainty). Integral to our framework is a principled experimentation methodology that demonstrates how digital data can be employed to reduce the frequency of surveys, so that periodic polling would be needed only to calibrate our model. Via extensive experimentation, we show how micro-decisions such as the smoothing interval and various types of lags have an important bearing on the results. Using decadal data (2008-2019) from Reddit, we show that both monthly and daily CCI can indeed be reliably estimated at least several months in advance, and that our model's estimates are far superior to those generated by existing methods.
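To make the idea of "an estimate and an associated uncertainty" concrete, here is a minimal sketch of Gaussian Process Regression applied to a smoothed, lagged feature. It is not the authors' implementation. The RBF kernel, the hyperparameter values, the helper names (`moving_average`, `lagged_pairs`, `gp_predict`) and the toy numbers are all assumptions made for illustration; the abstract does not specify any of them. The sketch uses only the Python standard library so that it runs as-is; in practice one would more likely reach for a library such as scikit-learn's `GaussianProcessRegressor`.

```python
import math

def moving_average(values, window):
    """Trailing moving average; returns len(values) - window + 1 entries."""
    return [sum(values[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(values))]

def lagged_pairs(features, targets, lag):
    """Pair the feature at step t with the target at step t + lag."""
    return features[:len(features) - lag], targets[lag:]

def rbf(a, b, length_scale, signal_std):
    """Squared-exponential (RBF) kernel between two scalar inputs."""
    return signal_std ** 2 * math.exp(-0.5 * ((a - b) / length_scale) ** 2)

def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def solve_lower(L, b):
    """Solve L x = b by forward substitution."""
    x = []
    for i in range(len(b)):
        x.append((b[i] - sum(L[i][k] * x[k] for k in range(i))) / L[i][i])
    return x

def solve_lower_transposed(L, b):
    """Solve L^T x = b by back substitution."""
    n = len(b)
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(L[k][i] * x[k] for k in range(i + 1, n))
        x[i] = (b[i] - s) / L[i][i]
    return x

def gp_predict(x_train, y_train, x_test,
               length_scale=0.5, signal_std=5.0, noise_std=0.5):
    """Return (means, stds) of the GP posterior over the latent function."""
    y_mean = sum(y_train) / len(y_train)          # constant prior mean
    y_centered = [y - y_mean for y in y_train]
    n = len(x_train)
    K = [[rbf(x_train[i], x_train[j], length_scale, signal_std)
          + (noise_std ** 2 if i == j else 0.0)   # observation noise on the diagonal
          for j in range(n)] for i in range(n)]
    L = cholesky(K)
    alpha = solve_lower_transposed(L, solve_lower(L, y_centered))
    means, stds = [], []
    for xs in x_test:
        k_star = [rbf(xs, xt, length_scale, signal_std) for xt in x_train]
        v = solve_lower(L, k_star)
        var = rbf(xs, xs, length_scale, signal_std) - sum(vi * vi for vi in v)
        means.append(y_mean + sum(k * a for k, a in zip(k_star, alpha)))
        stds.append(math.sqrt(max(var, 0.0)))
    return means, stds

# Synthetic toy data (NOT from the paper): a monthly sentiment score and a CCI-like index.
sentiment = [0.1, 0.4, 0.3, 0.6, 0.8, 0.7, 0.9, 1.2, 1.1, 1.4, 1.3, 1.5]
cci = [90.0, 91.5, 91.0, 92.5, 94.0, 93.0, 95.5, 97.0, 96.0, 98.5, 98.0, 99.5]

window, lag = 3, 1                                # hypothetical micro-decisions
smoothed = moving_average(sentiment, window)
x_train, y_train = lagged_pairs(smoothed, cci[window - 1:], lag)

means, stds = gp_predict(x_train, y_train, [x_train[0], 100.0])
for m, s in zip(means, stds):
    print(f"estimate = {m:.2f}, uncertainty (1 std) = {s:.2f}")
```

The second test input (100.0) lies far from every training point, so the posterior there falls back to the prior: the mean reverts to the training average and the uncertainty grows to the full signal standard deviation. Changing `window` or `lag` changes which feature-target pairs the model sees, which is one way to picture why such micro-decisions can affect the results.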

