CL Jun 6, 2023

FinRED: A Dataset for Relation Extraction in Financial Domain

Soumya Sharma, Tapas Nayak, Arusarka Bose, Ajay Kumar Meena, Koustuv Dasgupta, Niloy Ganguly, Pawan Goyal

arXiv:2306.03736v12.915 citations h-index: 43 Has Code

Originality Synthesis-oriented

AI Analysis

This paper addresses domain mismatch in financial NLP, a problem for both researchers and practitioners, though its contribution is incremental: it focuses on dataset creation rather than novel modeling.

The authors tackled the lack of a domain-specific dataset for relation extraction in finance by releasing FinRED, a dataset curated from financial news and earnings call transcripts, and found that state-of-the-art models experienced a significant performance drop on it compared to general datasets.

Relation extraction models trained on a source domain cannot be applied to a different target domain because of the mismatch between relation sets. In the current literature, there is no extensive open-source relation extraction dataset specific to the finance domain. In this paper, we release FinRED, a relation extraction dataset curated from financial news and earnings call transcripts, containing relations from the finance domain. FinRED has been created by mapping Wikidata triplets to text using the distant supervision method. We manually annotate the test data to ensure proper evaluation. We also experiment with various state-of-the-art relation extraction models on this dataset to create a benchmark. We see a significant drop in their performance on FinRED compared to general relation extraction datasets, which indicates that better models are needed for financial relation extraction.
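The distant supervision step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the `Triplet` type, the `distant_label` function, the toy knowledge base, and the plain substring matching are all assumptions made for illustration, and the entities and relation names are invented rather than taken from FinRED or Wikidata.

```python
from typing import NamedTuple


class Triplet(NamedTuple):
    head: str
    relation: str
    tail: str


# Hypothetical knowledge-base triplets (illustrative only, not FinRED data).
KB = [
    Triplet("Acme Corp", "chief_executive_officer", "Jane Doe"),
    Triplet("Acme Corp", "headquarters_location", "Springfield"),
]


def distant_label(sentence: str, kb: list[Triplet]) -> list[Triplet]:
    """Return every KB triplet whose head and tail both occur in the sentence.

    Assumption: case-insensitive substring matching stands in for real
    entity linking, which a production pipeline would need.
    """
    text = sentence.lower()
    return [t for t in kb if t.head.lower() in text and t.tail.lower() in text]


sentence = "Jane Doe, who leads Acme Corp, spoke on the earnings call."
labels = distant_label(sentence, KB)
for t in labels:
    print(t.head, t.relation, t.tail)
```

Labels produced this way are noisy: a sentence such as "Jane Doe left Acme Corp last year" would receive the same relation label even though it does not express it. That noise is one plausible reason for the manually annotated test set the abstract describes.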
