CL | Jun 6, 2023

FinRED: A Dataset for Relation Extraction in Financial Domain

arXiv:2306.03736v1 | 15 citations | h-index: 43 | Has Code
Originality: Synthesis-oriented
AI Analysis

This paper addresses domain mismatch for researchers and practitioners in financial NLP, though its contribution is incremental because it focuses on dataset creation rather than novel modeling.

The authors tackled the lack of a domain-specific dataset for relation extraction in finance by releasing FinRED, a dataset curated from financial news and earnings call transcripts, and found that state-of-the-art models experienced a significant performance drop on it compared to general datasets.

Relation extraction models trained on a source domain cannot be applied to a different target domain because of the mismatch between relation sets. In the current literature, there is no extensive open-source relation extraction dataset specific to the finance domain. In this paper, we release FinRED, a relation extraction dataset curated from financial news and earnings call transcripts, containing relations from the finance domain. FinRED was created by mapping Wikidata triplets using the distant supervision method. We manually annotate the test data to ensure proper evaluation. We also experiment with various state-of-the-art relation extraction models on this dataset to create a benchmark. We see a significant drop in their performance on FinRED compared to general relation extraction datasets, which indicates that better models are needed for financial relation extraction.
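
Distant supervision, the labeling approach the abstract refers to, can be illustrated with a minimal sketch: a sentence is tagged with a knowledge-base relation whenever both entities of a known triplet appear in it. The triplets, relation names, sentences, and the `distant_label` helper below are hypothetical toy examples made up for illustration. This is not the authors' pipeline, which may use entity linking and filtering steps not described here.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)

# Hypothetical toy triplets in the style of Wikidata facts (illustration only).
KB_TRIPLES: List[Triple] = [
    ("Apple", "chief_executive_officer", "Tim Cook"),
    ("Microsoft", "owner_of", "LinkedIn"),
]


def distant_label(sentence: str, triples: List[Triple]) -> List[Triple]:
    """Return every triplet whose head and tail both occur in the sentence.

    Matching is naive, case-insensitive substring search; this is an
    assumption made to keep the sketch short.
    """
    text = sentence.lower()
    return [
        (head, rel, tail)
        for head, rel, tail in triples
        if head.lower() in text and tail.lower() in text
    ]


sentences = [
    "Tim Cook said Apple expects strong services revenue next quarter.",
    "Microsoft reported growth in LinkedIn advertising.",
    # Both entities co-occur, but the sentence does not state the relation:
    "Apple shares fell after Tim Cook sold stock.",
    "The Federal Reserve held rates steady.",
]

# Map each sentence to the triplets it was automatically labeled with.
labeled = {s: distant_label(s, KB_TRIPLES) for s in sentences}

for sentence, found in labeled.items():
    print(sentence, "->", found)
```

The third sentence receives the CEO label even though it never states that relation. Labels produced this way are noisy, which is consistent with the abstract's choice to annotate the test data manually.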

Code Implementations: 1 repo
