Learning to Compose Words into Sentences with Reinforcement Learning
This work aims to improve natural language processing models by automatically discovering effective composition structures. The contribution is incremental in that it builds on prior tree-structured models.
The paper tackles the problem of learning task-specific tree structures for sentence representation without relying on explicit syntactic annotations, using reinforcement learning to optimize composition orders for downstream tasks. In the reported experiments, the method outperforms both sequential encoders and recursive encoders that rely on treebank annotations.
We use reinforcement learning to learn tree-structured neural networks for computing representations of natural language sentences. In contrast with prior work on tree-structured models in which the trees are either provided as input or predicted using supervision from explicit treebank annotations, the tree structures in this work are optimized to improve performance on a downstream task. Experiments demonstrate the benefit of learning task-specific composition orders, outperforming both sequential encoders and recursive encoders based on treebank annotations. We analyze the induced trees and show that while they discover some linguistically intuitive structures (e.g., noun phrases, simple verb phrases), they are different from conventional English syntactic structures.
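To make the notion of a "composition order" concrete, the following is a minimal sketch of how a sequence of discrete actions can determine a binary tree over a sentence. It assumes a shift-reduce formulation, which the text above does not spell out. The names `SHIFT`, `REDUCE`, `valid_actions`, `build_tree`, and `sample_actions` are hypothetical and chosen only for illustration. The random sampler stands in for a learned policy; in a reinforcement learning setup, a policy would choose the actions and be rewarded according to downstream task performance.

```python
import random

# Hypothetical action encoding, assumed for this sketch.
SHIFT, REDUCE = 0, 1


def valid_actions(buffer_len, stack_len):
    """Return the actions that keep the parse well-formed."""
    actions = []
    if buffer_len > 0:
        actions.append(SHIFT)   # a word is still waiting in the buffer
    if stack_len >= 2:
        actions.append(REDUCE)  # two items are available to compose
    return actions


def build_tree(tokens, actions):
    """Apply SHIFT/REDUCE actions to tokens and return a nested-tuple binary tree."""
    stack, buffer = [], list(tokens)
    for action in actions:
        if action == SHIFT:
            if not buffer:
                raise ValueError("SHIFT with an empty buffer")
            stack.append(buffer.pop(0))
        else:
            if len(stack) < 2:
                raise ValueError("REDUCE needs two items on the stack")
            right = stack.pop()
            left = stack.pop()
            # A real model would apply a composition function here
            # (e.g. a neural network); the sketch only records the bracketing.
            stack.append((left, right))
    if buffer or len(stack) != 1:
        raise ValueError("action sequence does not yield a single tree")
    return stack[0]


def sample_actions(n_tokens, rng):
    """Sample a uniformly random valid action sequence (stand-in for a learned policy)."""
    actions = []
    buffer_len, stack_len = n_tokens, 0
    while buffer_len > 0 or stack_len > 1:
        action = rng.choice(valid_actions(buffer_len, stack_len))
        actions.append(action)
        if action == SHIFT:
            buffer_len -= 1
            stack_len += 1
        else:
            stack_len -= 1
    return actions


if __name__ == "__main__":
    words = ["the", "cat", "sat"]
    print(build_tree(words, [SHIFT, SHIFT, REDUCE, SHIFT, REDUCE]))
    print(build_tree(words, sample_actions(len(words), random.Random(0))))
```

Under these assumptions, a sentence of n words always takes 2n - 1 actions (n shifts and n - 1 reduces), and different action sequences over the same words give different bracketings, which is what allows the structure to be tuned to a task rather than fixed by a treebank.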