CL · May 15, 2019

BERT Rediscovers the Classical NLP Pipeline

arXiv:1905.05950v2 · 1886 citations
Originality: Synthesis-oriented
AI Analysis

This paper gives NLP researchers insight into BERT's internal workings, but it is incremental because it analyzes an existing model.

The study quantifies where linguistic information is captured in BERT. It finds that the model represents the steps of the traditional NLP pipeline in an interpretable order, and that the model can dynamically revise lower-level decisions using higher-level information.

Pre-trained text encoders have rapidly advanced the state of the art on many NLP tasks. We focus on one such model, BERT, and aim to quantify where linguistic information is captured within the network. We find that the model represents the steps of the traditional NLP pipeline in an interpretable and localizable way, and that the regions responsible for each step appear in the expected sequence: POS tagging, parsing, NER, semantic roles, then coreference. Qualitative analysis reveals that the model can and often does adjust this pipeline dynamically, revising lower-level decisions on the basis of disambiguating information from higher-level representations.
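One way to make "where information is captured" concrete is to summarize a per-layer weighting as an expected layer index, then order tasks by that index. The sketch below is illustrative only and not necessarily the authors' exact procedure. It assumes each task comes with one scalar score per layer (for example, learned mixing weights from a probing classifier). The function names and the toy scores for `task_a` and `task_b` are hypothetical and are not results from the paper.

```python
import math


def softmax(scores):
    """Normalize raw per-layer scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def center_of_gravity(layer_scores):
    """Expected layer index under the softmax-normalized layer weights.

    A lower value means the weighting concentrates on earlier layers.
    """
    weights = softmax(layer_scores)
    return sum(layer * w for layer, w in enumerate(weights))


def order_tasks(scores_by_task):
    """Sort task names from earliest to latest center of gravity."""
    return sorted(scores_by_task, key=lambda t: center_of_gravity(scores_by_task[t]))


# Made-up scores for two hypothetical tasks over four layers.
toy_scores = {
    "task_b": [0.0, 0.0, 1.0, 2.0],  # weight concentrated on later layers
    "task_a": [2.0, 1.0, 0.0, 0.0],  # weight concentrated on earlier layers
}

ordering = order_tasks(toy_scores)
print(ordering)
```

With real probing results in place of the toy scores, the same ordering step could be compared against the sequence the abstract reports: POS tagging, parsing, NER, semantic roles, then coreference.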

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
