CL · AI · Jan 15, 2025

MAGNET: Augmenting Generative Decoders with Representation Learning and Infilling Capabilities

arXiv:2501.08648v2 · 5 citations · h-index: 16 · ACL
AI Analysis

The paper addresses the need for more versatile language models that combine generation and representation learning in a single model, offering incremental improvements for natural language processing applications.

The paper tackles the problem of adapting decoder-only large language models (LLMs) for bidirectional modeling. It proposes MAGNET, which enables robust representation learning and text infilling. The adapted models surpass strong text encoders on token-level and sentence-level tasks and generate contextually appropriate infills.

While originally designed for unidirectional generative modeling, decoder-only large language models (LLMs) are increasingly being adapted for bidirectional modeling. However, unidirectional and bidirectional models are typically trained separately with distinct objectives (generation and representation learning). This separation overlooks the opportunity for developing a more versatile language model and for these objectives to complement each other. In this work, we propose MAGNET, a method for adapting decoder-only LLMs to generate robust representations and infill missing text spans. MAGNET employs three self-supervised training objectives and introduces an attention mechanism that combines bidirectional and causal attention, enabling unified training across all objectives. Our results demonstrate that LLMs adapted with MAGNET (1) surpass strong text encoders on token-level and sentence-level representation learning tasks, (2) generate contextually appropriate text infills by leveraging past and future contexts, (3) perform open-ended text generation without excessive repetition of words or phrases, and (4) preserve the knowledge and reasoning capability gained by the LLM during pretraining.
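The abstract states that MAGNET's attention mechanism combines bidirectional and causal attention, but it does not give the exact mask layout. The sketch below is one possible way to build such a hybrid mask, under an assumption of our own: context tokens attend to each other bidirectionally, while tokens of a span being infilled attend to all context tokens (past and future) and causally to earlier span tokens. The function name `hybrid_attention_mask` and the `is_span` flag list are hypothetical and not taken from the paper.

```python
def hybrid_attention_mask(is_span):
    """Return an n x n boolean mask; mask[q][k] is True when query q may attend to key k.

    Assumed layout (illustrative only, NOT the paper's specification):
      - context positions (is_span[i] is False) attend to every context position;
      - span positions (is_span[i] is True) attend to every context position
        and to span positions at or before themselves.
    """
    n = len(is_span)
    mask = [[False] * n for _ in range(n)]
    for q in range(n):
        for k in range(n):
            if not is_span[k]:
                # Context keys are visible to every query (bidirectional part).
                mask[q][k] = True
            elif is_span[q] and k <= q:
                # Span queries see themselves and earlier span keys (causal part).
                mask[q][k] = True
    return mask


# Toy sequence: context, span, span, context.
demo_mask = hybrid_attention_mask([False, True, True, False])
for row in demo_mask:
    print("".join("1" if allowed else "0" for allowed in row))
```

In this toy layout the span tokens can see the context token that follows them, which is what would let an infill draw on both past and future context, while context tokens never see the span being generated.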

