CL · Jan 5, 2023

Unsupervised Broadcast News Summarization; a comparative study on Maximal Marginal Relevance (MMR) and Latent Semantic Analysis (LSA)

arXiv:2301.02284v1 · 6 citations · h-index: 25
Originality: Synthesis-oriented
AI Analysis

This work addresses the problem of automatic summarization for Persian broadcast news, but it is incremental as it applies existing methods to a new dataset without introducing novel techniques.

The study compared two unsupervised methods, Latent Semantic Analysis (LSA) and Maximal Marginal Relevance (MMR), for summarizing Persian broadcast news transcripts, finding that LSA performed better in generic summarization while MMR was superior in query-based summarization.

Methods for automatic speech summarization fall into two groups: supervised and unsupervised. Supervised methods are based on a set of features, while unsupervised methods perform summarization based on a set of rules. Latent Semantic Analysis (LSA) and Maximal Marginal Relevance (MMR) are considered the most important and best-known unsupervised methods in automatic speech summarization. This study investigates how these two unsupervised methods perform when summarizing transcriptions of Persian broadcast news. The results show that LSA outperforms MMR in generic summarization, whereas MMR outperforms LSA in query-based summarization.
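For readers unfamiliar with MMR, the following is a minimal sketch of the general technique, not the paper's implementation. It greedily picks sentences that are relevant to a query while penalizing similarity to sentences already chosen. The function names, the whitespace tokenizer, the bag-of-words cosine similarity, the default `lam=0.7`, and the use of the whole document as the query in the generic case are all illustrative assumptions.

```python
import math
from collections import Counter


def bow(text):
    # Assumption: naive lowercase whitespace tokenization into word counts.
    return Counter(text.lower().split())


def cosine(a, b):
    # Cosine similarity between two bag-of-words vectors.
    dot = sum(a[w] * b[w] for w in a if w in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def mmr_summarize(sentences, query=None, k=2, lam=0.7):
    # Hypothetical helper: returns up to k sentences in document order.
    vecs = [bow(s) for s in sentences]
    # Generic summarization (no query): assume the whole document is the query.
    q = bow(query) if query is not None else bow(" ".join(sentences))
    selected = []
    candidates = list(range(len(sentences)))
    while candidates and len(selected) < k:
        def score(i):
            # Redundancy: highest similarity to any already selected sentence.
            redundancy = max(
                (cosine(vecs[i], vecs[j]) for j in selected), default=0.0
            )
            # MMR: trade relevance to the query against redundancy.
            return lam * cosine(vecs[i], q) - (1 - lam) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return [sentences[i] for i in sorted(selected)]
```

Calling `mmr_summarize(sentences, query="...")` corresponds to the query-based setting, and omitting `query` to the generic one. A smaller `lam` weights the redundancy penalty more heavily, so near-duplicate sentences are less likely to be selected together.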
