PreCog: Exploring the Relation between Memorization and Performance in Pre-trained Language Models
This is an incremental contribution, aimed at NLP researchers, that analyzes memorization effects in pre-trained language models.
The paper examines the relationship between memorization and performance in BERT. It proposes PreCog, a measure for evaluating memorization from pre-training, and finds that highly memorized examples are better classified, suggesting that memorization is key to BERT's success.
Pre-trained Language Models such as BERT are impressive machines with the ability to memorize learning examples, possibly in generalized form. We present here a small, focused contribution to the analysis of the interplay between memorization and the performance of BERT in downstream tasks. We propose PreCog, a measure for evaluating memorization from pre-training, and we analyze its correlation with BERT's performance. Our experiments show that highly memorized examples are better classified, suggesting that memorization is essential to BERT's success.
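To make the correlation analysis concrete, the following is a minimal sketch of how a per-example memorization score could be correlated with classification correctness. It does not implement PreCog, whose definition is not given in this section: `memorization_scores` is a hypothetical stand-in for any such measure, `correct` is a hypothetical 0/1 indicator of whether the downstream classifier labeled each example correctly, and the numbers are toy placeholders rather than results from the paper. Pearson correlation is used here only as one plausible choice of statistic.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Toy placeholder values (assumption): NOT data or results from the paper.
# One hypothetical memorization score per downstream example.
memorization_scores = [0.1, 0.4, 0.5, 0.8, 0.9]
# 1 if the classifier got the example right, 0 otherwise (hypothetical).
correct = [0, 0, 1, 1, 1]

r = pearson(memorization_scores, correct)
print(f"correlation between memorization score and correctness: {r:.3f}")
```

With a binary correctness indicator, this Pearson coefficient coincides with the point-biserial correlation. A positive value would be consistent with the claim that more highly memorized examples are classified better, but it would not by itself establish a causal link.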