CL | AI | Jan 18, 2021

Joint Energy-based Model Training for Better Calibrated Natural Language Understanding Models

arXiv:2101.06829v2809 citations
Originality: Incremental advance
AI Analysis

This work addresses calibration issues that affect users of NLU models, but it is incremental: it builds on existing energy-based methods and noise contrastive estimation.

The paper tackles the problem of improving calibration in natural language understanding models by proposing joint energy-based model training during finetuning of pretrained text encoders, resulting in competitive calibration with little or no accuracy loss.

In this work, we explore joint energy-based model (EBM) training during the finetuning of pretrained text encoders (e.g., RoBERTa) for natural language understanding (NLU) tasks. Our experiments show that EBM training can help the model reach better calibration that is competitive with strong baselines, with little or no loss in accuracy. We discuss three variants of energy functions (namely scalar, hidden, and sharp-hidden) that can be defined on top of a text encoder, and compare them in experiments. Due to the discreteness of text data, we adopt noise contrastive estimation (NCE) to train the energy-based model. To make NCE training more effective, we train an auto-regressive noise model with the masked language model (MLM) objective.
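The abstract names the three energy variants and the NCE objective but does not define them. The sketch below shows one plausible reading and should not be taken as the authors' implementation. It assumes that the scalar variant uses a separate linear head on the encoder features, that the hidden variant takes the negative log-sum-exp of the classifier logits, and that the sharp-hidden variant takes the negative maximum logit. The NCE loss is the standard binary form in which a sample's logit is its negative energy minus the noise model's log-probability. All function and parameter names are hypothetical.

```python
import math


def logsumexp(values):
    # Numerically stable log(sum(exp(v))) over a list of floats.
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def energy_scalar(features, weights, bias=0.0):
    # Assumption: a dedicated linear head maps encoder features to one scalar.
    return sum(w * f for w, f in zip(weights, features)) + bias


def energy_hidden(class_logits):
    # Assumption: energy is the negative log-sum-exp of the classifier logits.
    return -logsumexp(class_logits)


def energy_sharp_hidden(class_logits):
    # Assumption: energy is the negative of the largest classifier logit.
    return -max(class_logits)


def log_sigmoid(z):
    # Numerically stable log(sigmoid(z)).
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))


def nce_loss(data_energy, data_noise_logp, noise_energy, noise_noise_logp):
    """Binary NCE loss for an unnormalized model with log-density -E(x).

    data_*  : energies and noise-model log-probs of real samples
    noise_* : energies and noise-model log-probs of noise-model samples
    """
    n_data = len(data_energy)
    log_k = math.log(len(noise_energy) / n_data)  # noise-to-data ratio
    loss = 0.0
    for e, logp in zip(data_energy, data_noise_logp):
        loss -= log_sigmoid(-e - logp - log_k)      # real samples labeled 1
    for e, logp in zip(noise_energy, noise_noise_logp):
        loss -= log_sigmoid(-(-e - logp - log_k))   # noise samples labeled 0
    return loss / n_data
```

Under these assumptions the hidden and sharp-hidden variants reuse the classification head, so they add no parameters, while the scalar variant adds one extra head. The loss falls when real text receives lower energy than noise samples of comparable noise-model probability.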

Code Implementations: 1 repo
