A Multi-Source Entity-Level Sentiment Corpus for the Financial Domain: The FinLin Corpus
This work provides a new dataset for researchers in financial sentiment analysis and behavioural science; the contribution is incremental, since it adds to existing resources.
The authors address the lack of diverse financial sentiment data by introducing FinLin, a novel corpus of investor reports, company reports, news articles, and microblogs annotated for sentiment and relevance. The resulting publicly available dataset covers multiple entities in the automobile industry over a 3-month period.
We introduce FinLin, a novel corpus containing investor reports, company reports, news articles, and microblogs from StockTwits, targeting multiple entities from the automobile industry and covering a 3-month period. FinLin was annotated with a sentiment score in the range [-1.0, 1.0] and a relevance score in the range [0.0, 1.0]. The annotations also include the text spans selected for the sentiment, thus providing additional insight into the annotators' reasoning. Overall, FinLin aims to complement current knowledge by providing a novel and publicly available financial sentiment corpus, and to foster research on financial sentiment analysis and its potential applications in behavioural science.
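To make the annotation scheme concrete, the following is a minimal sketch of how a single entity-level annotation could be represented and range-checked. It is not the corpus's actual file format or schema: the class name `Annotation`, all field names, the source-type labels, and the use of character offsets for spans are assumptions made for illustration. Only the score ranges and the four source types come from the description above.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# The four source types come from the abstract; these string labels are assumptions.
SOURCE_TYPES = {"investor_report", "company_report", "news", "microblog"}


@dataclass
class Annotation:
    """Hypothetical record for one entity-level annotation (not the real schema)."""

    entity: str
    source_type: str
    text: str
    sentiment: float  # must lie in [-1.0, 1.0]
    relevance: float  # must lie in [0.0, 1.0]
    # Assumed span encoding: (start, end) character offsets into `text`, end exclusive.
    spans: List[Tuple[int, int]] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.source_type not in SOURCE_TYPES:
            raise ValueError(f"unknown source type: {self.source_type!r}")
        if not -1.0 <= self.sentiment <= 1.0:
            raise ValueError(f"sentiment out of range: {self.sentiment}")
        if not 0.0 <= self.relevance <= 1.0:
            raise ValueError(f"relevance out of range: {self.relevance}")
        for start, end in self.spans:
            if not 0 <= start < end <= len(self.text):
                raise ValueError(f"invalid span: ({start}, {end})")

    def span_texts(self) -> List[str]:
        """Return the substrings that the annotator marked as carrying the sentiment."""
        return [self.text[start:end] for start, end in self.spans]


# Usage with an invented sentence and invented scores (not taken from the corpus).
text = "ExampleCar sales rose sharply this quarter."
phrase = "rose sharply"
start = text.index(phrase)
example = Annotation(
    entity="ExampleCar",
    source_type="microblog",
    text=text,
    sentiment=0.6,
    relevance=0.9,
    spans=[(start, start + len(phrase))],
)
print(example.span_texts())
```

Validating the ranges at construction time means that a malformed record fails immediately instead of silently skewing any later aggregation of scores.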