CL · AI · LG · Sep 7, 2021

Beyond Preserved Accuracy: Evaluating Loyalty and Robustness of BERT Compression

arXiv:2109.03228v2675 citations
Originality: Incremental advance
AI Analysis

This work gives NLP practitioners evaluation metrics for model compression that go beyond preserved accuracy. The contribution is incremental, since it builds on existing compression methods.

The paper addresses how to evaluate compressed pretrained language models such as BERT. It proposes two metrics, label loyalty and probability loyalty, that measure how closely a compressed model mimics the original, and it finds that combining multiple compression techniques can improve accuracy, loyalty, and robustness.

Recent studies on compression of pretrained language models (e.g., BERT) usually use preserved accuracy as the metric for evaluation. In this paper, we propose two new metrics, label loyalty and probability loyalty, that measure how closely a compressed model (i.e., student) mimics the original model (i.e., teacher). We also explore the effect of compression with regard to robustness under adversarial attacks. We benchmark quantization, pruning, knowledge distillation and progressive module replacing with loyalty and robustness. By combining multiple compression techniques, we provide a practical strategy to achieve better accuracy, loyalty and robustness.
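The abstract names the two loyalty metrics but does not give their formulas. The sketch below is one plausible reading, not the authors' code. It assumes label loyalty is the fraction of examples on which the student's predicted label matches the teacher's predicted label, and that probability loyalty is 1 minus the square root of the Jensen-Shannon divergence (base-2 logarithm) between the two output distributions, averaged over examples. The function names and the toy probabilities are hypothetical.

```python
import math
from typing import Sequence

Probs = Sequence[Sequence[float]]  # one probability vector per example


def _argmax(p: Sequence[float]) -> int:
    """Index of the largest probability (first one wins on ties)."""
    return max(range(len(p)), key=p.__getitem__)


def _kl(p: Sequence[float], q: Sequence[float]) -> float:
    """KL divergence in bits; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """Jensen-Shannon divergence in bits, which lies in [0, 1]."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * _kl(p, m) + 0.5 * _kl(q, m)


def label_loyalty(teacher: Probs, student: Probs) -> float:
    """Assumed definition: share of examples where the argmax labels agree."""
    if not teacher or len(teacher) != len(student):
        raise ValueError("need two non-empty lists of equal length")
    agree = sum(_argmax(t) == _argmax(s) for t, s in zip(teacher, student))
    return agree / len(teacher)


def probability_loyalty(teacher: Probs, student: Probs) -> float:
    """Assumed definition: mean of 1 - sqrt(JS divergence) over examples."""
    if not teacher or len(teacher) != len(student):
        raise ValueError("need two non-empty lists of equal length")
    scores = []
    for t, s in zip(teacher, student):
        js = min(max(js_divergence(t, s), 0.0), 1.0)  # guard float round-off
        scores.append(1.0 - math.sqrt(js))
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Toy, made-up softmax outputs for three examples and two classes.
    teacher_probs = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
    student_probs = [[0.8, 0.2], [0.3, 0.7], [0.4, 0.6]]
    print("label loyalty:", label_loyalty(teacher_probs, student_probs))
    print("probability loyalty:", probability_loyalty(teacher_probs, student_probs))
```

Under these assumptions both scores lie in [0, 1], and neither uses gold labels: a student can match the teacher's accuracy while disagreeing with it on individual examples, which is the gap that loyalty is meant to expose.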

Code Implementations: 1 repo
