CL, Sep 22, 2016

Generating Abstractive Summaries from Meeting Transcripts

Siddhartha Banerjee, Prasenjit Mitra, Kazunari Sugiyama

arXiv:1609.07033v1, 18 citations

Originality: Incremental advance

AI Analysis

This work addresses the need for concise, readable meeting summaries for professionals and researchers, though it is incremental as it builds on existing abstractive summarization techniques.

The paper tackles the problem of generating readable summaries from meeting transcripts by proposing an abstractive method that fuses important content from multiple utterances. The method combines topic segmentation, supervised learning of utterance importance, and integer linear programming to produce one well-formed sentence per topic segment. Experimental results indicate that the method produces more informative summaries than the baselines, and both human judges and parser log-likelihood estimates support its readability and well-formedness.

Summaries of meetings are very important because they convey the essential content of discussions in a concise form. Reading and understanding entire documents is generally time-consuming, so summaries play an important role when readers are interested only in the important content of the discussions. In this work, we address the task of meeting document summarization. Automatic summarization systems for meeting conversations developed so far have been primarily extractive, resulting in unacceptable summaries that are hard to read. The extracted utterances contain disfluencies that degrade the quality of the extractive summaries. To make summaries much more readable, we propose an approach that generates abstractive summaries by fusing important content from several utterances. We first separate meeting transcripts into topic segments, and then identify the important utterances in each segment using a supervised learning approach. The important utterances are then combined to generate a one-sentence summary. In the text generation step, the dependency parses of the utterances in each segment are merged into a directed graph. The most informative and well-formed sub-graph, obtained by integer linear programming (ILP), is selected to generate a one-sentence summary for each topic segment. The ILP formulation reduces disfluencies by leveraging grammatical relations that are more prominent in non-conversational text, and therefore generates summaries that are comparable to human-written abstractive summaries. Experimental results show that our method can generate more informative summaries than the baselines. In addition, readability assessments by human judges as well as log-likelihood estimates obtained from the dependency parser show that our generated summaries are significantly readable and well-formed.
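The graph-merging and sub-graph selection step described in the abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration and not the authors' implementation: the dependency triples are hand-written, edge frequency stands in for the paper's informativeness score, and an exhaustive search over small edge subsets replaces the ILP solver. The tree constraints (a single root edge, one head per word, every word reachable from ROOT) are the kind of well-formedness conditions an ILP would encode as linear constraints.

```python
from collections import Counter
from itertools import combinations

# Hypothetical hand-written dependency parses (head, relation, dependent)
# for three utterances in one topic segment; not data from the paper.
PARSES = [
    [("ROOT", "root", "use"), ("use", "nsubj", "we"), ("use", "dobj", "battery")],
    [("ROOT", "root", "use"), ("use", "nsubj", "we"), ("use", "dobj", "battery"),
     ("battery", "amod", "rechargeable")],
    [("ROOT", "root", "think"), ("think", "nsubj", "i"), ("use", "dobj", "battery")],
]


def merge_parses(parses):
    """Merge all dependency edges into one directed graph.

    The weight of an edge is the number of utterances it appears in
    (an assumed stand-in for an informativeness score).
    """
    return Counter(edge for parse in parses for edge in parse)


def is_rooted_tree(edges):
    """True if each dependent has exactly one head and is reachable from ROOT."""
    dependents = [dep for _, _, dep in edges]
    if len(dependents) != len(set(dependents)):
        return False  # some word has two heads
    reached, frontier = {"ROOT"}, ["ROOT"]
    while frontier:
        head = frontier.pop()
        for h, _, d in edges:
            if h == head and d not in reached:
                reached.add(d)
                frontier.append(d)
    return all(d in reached for d in dependents)


def best_subgraph(graph, max_edges):
    """Exhaustively pick the highest-weight rooted tree with at most max_edges edges.

    Brute force is only feasible for toy graphs; it replaces the ILP here.
    """
    best, best_score = (), 0
    edges = sorted(graph)
    for k in range(1, max_edges + 1):
        for subset in combinations(edges, k):
            # Require exactly one edge leaving ROOT (a single main verb).
            if sum(h == "ROOT" for h, _, _ in subset) != 1:
                continue
            if not is_rooted_tree(subset):
                continue
            score = sum(graph[e] for e in subset)
            if score > best_score:
                best, best_score = subset, score
    return set(best), best_score


graph = merge_parses(PARSES)
selected, score = best_subgraph(graph, max_edges=3)
print(sorted(selected), score)
```

On this toy input the selected tree keeps the edges shared by most utterances (the verb "use" with its subject and object) and drops the conversational "i think" branch. Turning the selected tree back into a word sequence is a separate linearization step that this sketch omits.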
