Choosing News Topics to Explain Stock Market Returns
This work aims to improve topic modeling for financial analysis. Its contribution is incremental: it focuses mainly on comparing existing methods rather than introducing a fundamentally new approach.
The study examines how to select news topics that explain stock market returns. It finds that supervised Latent Dirichlet Allocation (sLDA) often overfits returns, whereas a random search over plain LDA models and a branching procedure that reinforces effective topic assignments both generalize better. The branching procedure often achieves the best out-of-sample performance. The methods are tested on an archive of over 90,000 news articles about S&P 500 firms.
We analyze methods for selecting topics in news articles to explain stock returns. We find, through empirical and theoretical results, that supervised Latent Dirichlet Allocation (sLDA) implemented through Gibbs sampling in a stochastic EM algorithm will often overfit returns to the detriment of the topic model. We obtain better out-of-sample performance through a random search of plain LDA models. A branching procedure that reinforces effective topic assignments often performs best. We test methods on an archive of over 90,000 news articles about S&P 500 firms.
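The random-search idea can be illustrated with a minimal sketch. It is not the paper's implementation: the toy corpus, the hyperparameter grid, the collapsed Gibbs sampler, the no-intercept least-squares regression of returns on topic proportions, and the choice to select by held-out R-squared are all assumptions made for illustration. Every function name (`gibbs_lda`, `fit_ols`, `random_search`, `make_toy_corpus`) is hypothetical. The topic model is fit on text only, so returns never influence the topic assignments, in contrast to sLDA.

```python
import random

def gibbs_lda(docs, n_topics, vocab_size, alpha, beta, n_iter, rng):
    """Collapsed Gibbs sampler for plain LDA; returns per-document topic proportions."""
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(n_topics)]
    topic_total = [0] * n_topics
    assign = [[rng.randrange(n_topics) for _ in doc] for doc in docs]
    for d, doc in enumerate(docs):
        for w, z in zip(doc, assign[d]):
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                z = assign[d][i]  # remove the current assignment from the counts
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                weights = [(doc_topic[d][k] + alpha) * (topic_word[k][w] + beta)
                           / (topic_total[k] + vocab_size * beta)
                           for k in range(n_topics)]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = z  # record the resampled topic
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return [[(c + alpha) / (len(doc) + n_topics * alpha) for c in row]
            for row, doc in zip(doc_topic, docs)]

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for j in range(c, n + 1):
                m[r][j] -= f * m[c][j]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][j] * x[j] for j in range(r + 1, n))) / m[r][r]
    return x

def fit_ols(x, y, ridge=1e-8):
    """Least squares without intercept (proportions sum to 1); tiny ridge for stability."""
    k = len(x[0])
    xtx = [[sum(row[i] * row[j] for row in x) + (ridge if i == j else 0.0)
            for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(x, y)) for i in range(k)]
    return solve(xtx, xty)

def r_squared(x, y, coef):
    mean = sum(y) / len(y)
    ss_res = sum((yi - sum(c * v for c, v in zip(coef, row))) ** 2
                 for row, yi in zip(x, y))
    ss_tot = sum((yi - mean) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def random_search(docs, returns, n_train, vocab_size, n_candidates, seed=0):
    """Fit several plain LDA models with random settings; keep the best held-out R^2."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_candidates):
        params = {"n_topics": rng.choice([2, 3, 4]),   # assumed grid
                  "alpha": rng.choice([0.1, 0.5, 1.0]),
                  "beta": rng.choice([0.01, 0.1])}
        theta = gibbs_lda(docs, vocab_size=vocab_size, n_iter=30, rng=rng, **params)
        coef = fit_ols(theta[:n_train], returns[:n_train])
        score = r_squared(theta[n_train:], returns[n_train:], coef)
        if best is None or score > best[0]:
            best = (score, params)
    return best

def make_toy_corpus(n_docs, seed=1):
    """Synthetic data: word ids 0-2 are 'upbeat', 3-5 'downbeat'; returns track the mix."""
    rng = random.Random(seed)
    docs, returns = [], []
    for _ in range(n_docs):
        share = rng.random()
        docs.append([rng.randrange(0, 3) if rng.random() < share else rng.randrange(3, 6)
                     for _ in range(20)])
        returns.append(0.02 * (share - 0.5) + rng.gauss(0.0, 0.002))
    return docs, returns

docs, returns = make_toy_corpus(60)
best_score, best_params = random_search(docs, returns, n_train=40,
                                        vocab_size=6, n_candidates=5)
print(best_score, best_params)
```

Because each candidate is fit without looking at returns, the regression step is the only place returns enter. That is one plausible reading of why a search over plain LDA models may be less prone to overfitting than sLDA. A real study would use a proper LDA library and a separate test set for the final evaluation.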