CL · LG · Jul 2, 2021

R2D2: Recursive Transformer based on Differentiable Tree for Interpretable Hierarchical Language Modeling

arXiv:2107.00967v2714 citations
AI Analysis

This paper addresses the need for interpretable hierarchical language models in NLP, offering a novel approach to capturing abstraction at multiple levels of granularity.

The paper argues that existing deep models lack explicit hierarchical modeling of language. It proposes a recursive Transformer based on differentiable binary trees to emulate the composition process, and reports effective results on language modeling and unsupervised parsing.

Human language understanding operates at multiple levels of granularity (e.g., words, phrases, and sentences) with increasing levels of abstraction that can be hierarchically combined. However, existing deep models with stacked layers do not explicitly model any sort of hierarchical process. This paper proposes a recursive Transformer model based on differentiable CKY style binary trees to emulate the composition process. We extend the bidirectional language model pre-training objective to this architecture, attempting to predict each word given its left and right abstraction nodes. To scale up our approach, we also introduce an efficient pruned tree induction algorithm to enable encoding in just a linear number of composition steps. Experimental results on language modeling and unsupervised parsing show the effectiveness of our approach.
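To make the CKY-style composition concrete, here is a minimal sketch of a soft chart encoder. It is an illustration under stated assumptions, not the authors' implementation: `compose` and `score` are hypothetical toy stand-ins for the paper's Transformer-based composition and its learned split scoring, and the softmax weighting over split points is a simplification chosen for this example, since the abstract does not specify the selection mechanism.

```python
import math


def compose(left, right):
    # Hypothetical toy composition: stands in for the Transformer-based
    # composition function, which is not reproduced here.
    return [math.tanh(0.5 * (a + b)) for a, b in zip(left, right)]


def score(left, right):
    # Hypothetical toy split score: dot product of the two child vectors.
    return sum(a * b for a, b in zip(left, right))


def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def soft_cky_encode(token_vectors):
    """Fill a CKY-style chart bottom-up over binary splits.

    chart[(i, j)] holds the vector for the span of tokens i..j (inclusive).
    Each span is a softmax-weighted mix of the compositions obtained from
    every possible split point, so the whole chart stays differentiable.
    Returns the chart and the number of composition steps performed.
    """
    n = len(token_vectors)
    chart = {(i, i): list(v) for i, v in enumerate(token_vectors)}
    steps = 0
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            candidates, scores = [], []
            for k in range(i, j):  # split into (i..k) and (k+1..j)
                left, right = chart[(i, k)], chart[(k + 1, j)]
                candidates.append(compose(left, right))
                scores.append(score(left, right))
                steps += 1
            weights = softmax(scores)
            dim = len(candidates[0])
            chart[(i, j)] = [
                sum(w * c[d] for w, c in zip(weights, candidates))
                for d in range(dim)
            ]
    return chart, steps


if __name__ == "__main__":
    tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
    chart, steps = soft_cky_encode(tokens)
    print("root vector:", chart[(0, len(tokens) - 1)])
    print("composition steps:", steps)
```

Filling the full chart this way takes a number of composition steps that grows cubically with sentence length. The pruned tree induction algorithm mentioned in the abstract is what reduces this to a linear number of steps; the sketch does not implement that pruning.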

Code Implementations: 1 repo
