How Much is 131 Million Dollars? Putting Numbers in Perspective with Compositional Descriptions
This work addresses the challenge of making numerical information more accessible and interpretable for general readers. The contribution is incremental: it builds on existing methods and adds specific enhancements.
The paper tackles the problem of helping readers understand large numbers by automatically generating contextual descriptions, achieving a 15.2% F1 improvement in formula construction and a 12.5 BLEU point improvement in description generation over baselines.
How much is 131 million US dollars? To help readers put such numbers in context, we propose a new task of automatically generating short descriptions known as perspectives, e.g., "$131 million is about the cost to employ everyone in Texas over a lunch period". First, we collect a dataset of numeric mentions in news articles, where each mention is labeled with a set of rated perspectives. We then propose a system to generate these descriptions consisting of two steps: formula construction and description generation. In construction, we compose formulas from numeric facts in a knowledge base and rank the resulting formulas based on familiarity, numeric proximity and semantic compatibility. In generation, we convert a formula into natural language using a sequence-to-sequence recurrent neural network. Our system obtains a 15.2% F1 improvement over a non-compositional baseline at formula construction and a 12.5 BLEU point improvement over a baseline at description generation.
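The formula construction step can be illustrated with a minimal sketch. It is not the paper's system: the `Fact` class, the `construct_formulas` function, and every fact value below are hypothetical placeholders invented for illustration, not entries from the paper's knowledge base. The sketch multiplies facts together, keeps the combinations whose units cancel to the unit of the mention, and ranks them by numeric proximity only. Familiarity and semantic compatibility, which the abstract also names as ranking criteria, are left out because they depend on rated data and learned models.

```python
from dataclasses import dataclass
from itertools import combinations
from math import log10, prod


@dataclass
class Fact:
    """A numeric fact: a value with a unit given as base-unit exponents."""
    name: str
    value: float
    unit: dict  # e.g. {"USD": 1, "person": -1, "hour": -1}


def combine_units(facts):
    """Multiply the units of several facts by summing exponents."""
    total = {}
    for fact in facts:
        for base, exponent in fact.unit.items():
            total[base] = total.get(base, 0) + exponent
    # Drop base units that cancelled out.
    return {base: e for base, e in total.items() if e != 0}


def construct_formulas(target_value, target_unit, facts, max_size=3):
    """Return (score, multiplier, facts) tuples, best score first.

    The score is -|log10(multiplier)|, so a formula whose value is close
    to the target (multiplier near 1) ranks highest. This covers numeric
    proximity only.
    """
    candidates = []
    for size in range(1, max_size + 1):
        for combo in combinations(facts, size):
            if combine_units(combo) != target_unit:
                continue  # units do not match the mention
            value = prod(fact.value for fact in combo)
            multiplier = target_value / value
            candidates.append((-abs(log10(multiplier)), multiplier, combo))
    candidates.sort(key=lambda c: c[0], reverse=True)
    return candidates


# Toy knowledge base: all values are made-up placeholders.
FACTS = [
    Fact("average hourly wage", 20.0, {"USD": 1, "person": -1, "hour": -1}),
    Fact("people employed in Texas", 13_000_000, {"person": 1}),
    Fact("length of a lunch period", 0.5, {"hour": 1}),
    Fact("cost of a cup of coffee", 3.0, {"USD": 1}),
]

ranked = construct_formulas(131e6, {"USD": 1}, FACTS)
for score, multiplier, combo in ranked:
    names = " x ".join(fact.name for fact in combo)
    print(f"{multiplier:.3g} x ({names})  score={score:.3f}")
```

With these placeholder values, the three-fact composition (wage x employed people x lunch period) ranks above the single coffee fact, because its product is within about 1% of the target. A description generator would then turn the top formula into a sentence such as the Texas lunch-period example in the abstract.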