CL · Oct 22, 2022

ECTSum: A New Benchmark Dataset For Bullet Point Summarization of Long Earnings Call Transcripts

arXiv:2210.12467v2 · 305 citations · h-index: 43
Originality: Synthesis-oriented
AI Analysis

This paper addresses the problem of summarizing financial documents for researchers and practitioners. Its contribution is incremental: it introduces a new dataset and a straightforward method rather than a major algorithmic breakthrough.

The authors address the lack of datasets for summarizing long, unstructured financial documents by introducing ECTSum, a benchmark dataset of earnings call transcripts paired with expert-written bullet point summaries. They also present a simple yet effective method, ECT-BPS, that generates bullet points capturing the important facts from the calls.

Despite tremendous progress in automatic summarization, state-of-the-art methods are predominantly trained to excel in summarizing short newswire articles, or documents with strong layout biases such as scientific articles or government reports. Efficient techniques to summarize financial documents, including facts and figures, remain largely unexplored, mainly because suitable datasets are unavailable. In this work, we present ECTSum, a new dataset with transcripts of earnings calls (ECTs), hosted by publicly traded companies, as documents, and short expert-written telegram-style bullet point summaries derived from corresponding Reuters articles. ECTs are long unstructured documents without any prescribed length limit or format. We benchmark our dataset with state-of-the-art summarizers across various metrics evaluating the content quality and factual consistency of the generated summaries. Finally, we present a simple-yet-effective approach, ECT-BPS, to generate a set of bullet points that precisely capture the important facts discussed in the calls.

Code Implementations: 1 repo
