CL, LG, NE | Feb 15, 2017

Training Language Models Using Target-Propagation

arXiv:1702.04770v1 | 9 citations
Originality: Synthesis-oriented
AI Analysis

The paper addresses inefficiencies in RNN training that matter to machine learning researchers, but its contribution is incremental: it evaluates an existing method and does not improve on the baseline.

The paper tackles two weaknesses of training RNNs with Truncated Back-Propagation through Time (BPTT), namely its sequential nature and its truncated gradients, by exploring Target Propagation as an alternative. In the reported experiments, Target Propagation generally underperforms BPTT.

While Truncated Back-Propagation through Time (BPTT) is the most popular approach to training Recurrent Neural Networks (RNNs), it suffers from being inherently sequential (making parallelization difficult) and from truncating gradient flow between distant time-steps. We investigate whether Target Propagation (TPROP) style approaches can address these shortcomings. Unfortunately, extensive experiments suggest that TPROP generally underperforms BPTT, and we end with an analysis of this phenomenon, and suggestions for future work.
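
To make the truncation issue in the abstract concrete, here is a minimal sketch of truncated BPTT on a toy scalar RNN. It is not taken from the paper: the model `h_t = tanh(w * h_{t-1} + x_t)`, the choice of the final state as the loss, and the names `forward`, `grad_w`, and `window` are all assumptions made for illustration. It does not implement Target Propagation.

```python
import math

def forward(w, xs, h0=0.0):
    """Run a scalar tanh RNN, h_t = tanh(w * h_{t-1} + x_t), and return all states."""
    hs = [h0]
    for x in xs:
        # Each state depends on the previous one, so this loop is inherently sequential.
        hs.append(math.tanh(w * hs[-1] + x))
    return hs

def grad_w(w, xs, window):
    """Gradient of the final state w.r.t. w, back-propagated at most `window` steps."""
    hs = forward(w, xs)
    T = len(xs)
    delta = 1.0  # d h_T / d h_T
    grad = 0.0
    # Walk backwards from step T; stop early when the window is exhausted.
    for t in range(T, max(T - window, 0), -1):
        local = delta * (1.0 - hs[t] ** 2)  # derivative through tanh at step t
        grad += local * hs[t - 1]           # contribution of w at step t
        delta = local * w                   # pass the gradient on to h_{t-1}
    return grad

# Hypothetical inputs, chosen only for illustration.
xs = [0.5, -0.3, 0.8, 0.1, -0.6, 0.4]
w = 0.9
full = grad_w(w, xs, window=len(xs))  # untruncated BPTT
short = grad_w(w, xs, window=2)       # gradient flow cut off after 2 steps
print(full, short)
```

With `window` equal to the sequence length the result matches the exact gradient; with a shorter window, the contributions of earlier time-steps are dropped, which is the truncation the abstract refers to.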

Code Implementations: 1 repo