CL · May 22, 2020

A Generative Approach to Titling and Clustering Wikipedia Sections

Anjalie Field, Sascha Rothe, Simon Baumgartner, Cong Yu, Abe Ittycheriah

arXiv:2005.11216v131.1996 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses information organization in Wikipedia, but it is incremental as it builds on existing transformer methods for a specific task.

The paper tackles the task of generating section headings for Wikipedia articles using transformer encoders paired with various decoders. It finds that decoders with attention mechanisms produce high-scoring extractive text, while decoders without attention enable semantic encoding that can serve as section embeddings. It also introduces a new loss function to improve embedding quality.

We evaluate the performance of transformer encoders with various decoders for information organization through a new task: generation of section headings for Wikipedia articles. Our analysis shows that decoders containing attention mechanisms over the encoder output achieve high-scoring results by generating extractive text. In contrast, a decoder without attention better facilitates semantic encoding and can be used to generate section embeddings. We additionally introduce a new loss function, which further encourages the decoder to generate high-quality embeddings.
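The contrast between the two decoder families can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the dot-product scoring, and the use of mean pooling as the bottleneck are all assumptions made for illustration. A decoder with attention can re-read every encoder token state at each step, which makes copying input text easy, whereas a decoder without attention sees only one fixed vector, so that vector has to summarize the whole section and can double as a section embedding.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attention_context(query, encoder_outputs):
    """Decoder WITH attention (hypothetical helper): the decoder state
    `query` scores every encoder token state and reads a weighted mix."""
    weights = softmax([dot(query, h) for h in encoder_outputs])
    dim = len(encoder_outputs[0])
    context = [
        sum(w * h[i] for w, h in zip(weights, encoder_outputs))
        for i in range(dim)
    ]
    return context, weights


def pooled_embedding(encoder_outputs):
    """Decoder WITHOUT attention (hypothetical helper): the decoder is
    conditioned on a single vector. Mean pooling is an assumption here."""
    n = len(encoder_outputs)
    dim = len(encoder_outputs[0])
    return [sum(h[i] for h in encoder_outputs) / n for i in range(dim)]


# Toy encoder output: three token states of dimension 2.
encoder_outputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [2.0, 0.0]  # toy decoder state

context, weights = attention_context(query, encoder_outputs)
section_embedding = pooled_embedding(encoder_outputs)
```

In this toy setup, `weights` varies with the decoder state and points back at individual tokens, while `section_embedding` is the same vector for every decoding step, which is the property that makes it usable for clustering sections.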
