Neural Status Registers
This work addresses the challenge of enabling neural networks to perform quantitative reasoning tasks, such as comparisons and graph operations. The contribution is incremental in that it builds on existing extrapolation methods.
The paper tackles quantitative reasoning in neural networks, a problem that remains unaddressed despite progress in arithmetic extrapolation. It proposes a novel architectural element, the Neural Status Register (NSR), which can be trained end to end and extrapolates to numbers many orders of magnitude larger than those seen in training.
Standard neural networks can learn mathematical operations, but they do not extrapolate. Extrapolation means that the model can be applied to larger numbers, well beyond those observed during training. Recent architectures tackle arithmetic operations and can extrapolate; however, the equally important problem of quantitative reasoning remains unaddressed. In this work, we propose a novel architectural element, the Neural Status Register (NSR), for quantitative reasoning over numbers. Our NSR relaxes the discrete bit logic of physical status registers to continuous numbers and allows end-to-end learning with gradient descent. Experiments show that the NSR achieves solutions that extrapolate to numbers many orders of magnitude larger than those in the training set. We successfully train the NSR on number comparisons, piecewise discontinuous functions, counting in sequences, recurrently finding minimums, finding shortest paths in graphs, and comparing digits in images.
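To make the idea of relaxing status-register bit logic concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not necessarily the paper's exact formulation: a physical status register sets a sign flag and a zero flag after computing x - y, and the sketch replaces both with smooth functions of that difference so that gradients can flow. The names `soft_flags`, `soft_compare`, and `greater`, the `scale` parameter, and the hand-picked weights are all hypothetical; in a trained model the weights would be learned by gradient descent.

```python
import math

def soft_flags(x, y, scale=1.0):
    """Continuous stand-ins for the sign and zero flags of x - y (assumed form)."""
    d = scale * (x - y)
    sign = math.tanh(d)        # close to -1 if x < y, close to +1 if x > y, 0 if equal
    zero = 1.0 - sign * sign   # exactly 1 if x == y, close to 0 for large |x - y|
    return sign, zero

def soft_compare(x, y, w_sign, w_zero, bias, scale=1.0):
    """Sigmoid over a weighted sum of the two soft flags; output lies in (0, 1)."""
    sign, zero = soft_flags(x, y, scale)
    logit = w_sign * sign + w_zero * zero + bias
    return 1.0 / (1.0 + math.exp(-logit))

def greater(x, y):
    """Approximates 'x > y' with hand-picked (not learned) weights."""
    return soft_compare(x, y, w_sign=20.0, w_zero=-10.0, bias=0.0)

if __name__ == "__main__":
    # The flags depend only on the difference x - y, so the same weights
    # behave identically for small and very large operands.
    for a, b in [(5.0, 3.0), (3.0, 5.0), (4.0, 4.0), (1e12, 1e12 - 1.0)]:
        print(a, b, round(greater(a, b), 4))
```

Because the saturating `tanh` bounds both flags, the output is determined by the sign of the difference rather than by the magnitude of the inputs. That is one plausible reason why a unit of this kind could extrapolate far beyond the training range.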