The R package sentometrics to compute, aggregate and predict with textual sentiment
The package gives researchers and analysts in finance and the social sciences a practical tool for using textual data in predictive modeling, although the contribution is incremental because it builds on existing sentiment analysis methods.
The authors introduce the R package sentometrics for optimized textual sentiment indexation. The package efficiently computes sentiment scores from texts and aggregates them into time series that can be used to predict other variables, as demonstrated by forecasting the CBOE Volatility Index with a built-in news corpus.
We provide a hands-on introduction to optimized textual sentiment indexation using the R package sentometrics. Textual sentiment analysis is increasingly used to unlock the potential information value of textual data. The sentometrics package implements an intuitive framework to efficiently compute sentiment scores of numerous texts, to aggregate the scores into multiple time series, and to use these time series to predict other variables. The workflow of the package is illustrated with a built-in corpus of news articles from two major U.S. journals to forecast the CBOE Volatility Index.
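The three steps named in the abstract (compute, aggregate, predict) can be illustrated with a minimal, language-agnostic sketch. The Python code below is not the sentometrics API: the lexicon, the corpus, the target values, and all function names are hypothetical, and a one-regressor least-squares fit is assumed as a stand-in for the prediction step.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy lexicon: word -> polarity score (not a sentometrics lexicon).
LEXICON = {"gain": 1.0, "growth": 1.0, "strong": 1.0,
           "loss": -1.0, "crisis": -1.0, "weak": -1.0}

def document_sentiment(text):
    """Step 1: score one text as summed word polarities divided by word count."""
    words = text.lower().split()
    if not words:
        return 0.0
    return sum(LEXICON.get(w, 0.0) for w in words) / len(words)

def aggregate_by_period(docs):
    """Step 2: average document scores within each period to form a time series."""
    by_period = defaultdict(list)
    for period, text in docs:
        by_period[period].append(document_sentiment(text))
    return {p: mean(scores) for p, scores in sorted(by_period.items())}

def fit_ols(x, y):
    """Step 3: fit y = intercept + slope * x by ordinary least squares."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

# Made-up corpus of (period, text) pairs and a made-up target series.
docs = [
    ("2020-01", "strong growth and gain"),
    ("2020-01", "weak demand"),
    ("2020-02", "crisis deepens with loss"),
    ("2020-03", "growth returns"),
]
target = {"2020-01": 15.0, "2020-02": 30.0, "2020-03": 12.0}

index = aggregate_by_period(docs)
periods = sorted(index)
intercept, slope = fit_ols([index[p] for p in periods],
                           [target[p] for p in periods])

# Predicted target value for a hypothetical neutral sentiment reading of 0.0.
prediction = intercept + slope * 0.0
```

In this toy data the fitted slope is negative, meaning lower sentiment goes with a higher target value. The package itself offers richer choices at each step (multiple lexicons, several aggregation schemes, and regularized regression), none of which this sketch attempts to reproduce.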