CL · LG · Dec 16, 2021

Masked Measurement Prediction: Learning to Jointly Predict Quantities and Units from Textual Context

arXiv:2112.08616v1628 citations
Originality: Incremental advance
AI Analysis

This work addresses numerical reasoning over text sources such as academic papers and web tables. The contribution is incremental because it builds on existing masked language modeling approaches.

The paper tackles the problem of evaluating and improving the numeracy of language models on physical measurements. It introduces the Masked Measurement Prediction (MMP) task, in which a model reconstructs a number together with its unit from masked text, and proposes the GeMM model, which learns to predict both jointly. Linear probing shows that traditional pretrained transformers such as RoBERTa significantly underperform jointly trained number-unit models on this task.

Physical measurements constitute a large portion of numbers in academic papers, engineering reports, and web tables. Current benchmarks fall short of properly evaluating numeracy of pretrained language models on measurements, hindering research on developing new methods and applying them to numerical tasks. To that end, we introduce a novel task, Masked Measurement Prediction (MMP), where a model learns to reconstruct a number together with its associated unit given masked text. MMP is useful for both training new numerically informed models as well as evaluating numeracy of existing systems. In order to address this task, we introduce a new Generative Masked Measurement (GeMM) model that jointly learns to predict numbers along with their units. We perform fine-grained analyses comparing our model with various ablations and baselines. We use linear probing of traditional pretrained transformer models (RoBERTa) to show that they significantly underperform jointly trained number-unit models, highlighting the difficulty of this new task and the benefits of our proposed pretraining approach. We hope this framework accelerates the progress towards building more robust numerical reasoning systems in the future.
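To make the task format concrete, the following is a minimal sketch of how a single MMP-style example could be built from raw text: a number and its unit are replaced by mask tokens, and the original values are kept as prediction targets. It is an illustration only, not the authors' data pipeline. The unit list, the `[NUM]` and `[UNIT]` token names, and the `mask_first_measurement` helper are all hypothetical assumptions.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical, tiny unit vocabulary (assumption, for illustration only).
UNITS = ("km", "kg", "cm", "mm", "m", "g", "s")

# A number followed by one of the units above, e.g. "12.5 kg".
MEASUREMENT = re.compile(
    r"(?P<number>\d+(?:\.\d+)?)\s*(?P<unit>" + "|".join(UNITS) + r")\b"
)


class MaskedExample(NamedTuple):
    masked_text: str  # input text with the measurement hidden
    number: float     # target quantity to reconstruct
    unit: str         # target unit to reconstruct


def mask_first_measurement(
    text: str, num_token: str = "[NUM]", unit_token: str = "[UNIT]"
) -> Optional[MaskedExample]:
    """Mask the first number-unit pair in `text`; return None if none is found."""
    match = MEASUREMENT.search(text)
    if match is None:
        return None
    masked = (
        text[: match.start()]
        + f"{num_token} {unit_token}"
        + text[match.end():]
    )
    return MaskedExample(masked, float(match.group("number")), match.group("unit"))


example = mask_first_measurement("The sample weighed 12.5 kg after drying.")
print(example)
```

A model trained on such pairs would read `masked_text` and be scored on how well it recovers both `number` and `unit`; how the paper actually tokenizes, masks, and scores measurements is not specified in this summary.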

