Advancing Sequential Numerical Prediction in Autoregressive Models
The work addresses a specific bottleneck in sequence generation for tasks involving numerical data, but the contribution is incremental because it builds on existing autoregressive models rather than proposing a new architecture.
The paper tackles the problem that autoregressive models treat digits as independent tokens in numerical sequence prediction. It introduces Numerical Token Integrity Loss (NTIL), which combines a token-level component and a sequence-level component to improve performance.
Autoregressive models have become the de facto choice for sequence generation tasks, but standard approaches treat digits as independent tokens and apply cross-entropy loss, overlooking the coherent structure of numerical sequences. This paper introduces Numerical Token Integrity Loss (NTIL) to address this gap. NTIL operates at two levels: (1) token-level, where it extends the Earth Mover's Distance (EMD) to preserve ordinal relationships between numerical values, and (2) sequence-level, where it penalizes the overall discrepancy between the predicted and actual sequences. This dual approach improves numerical prediction and integrates effectively with LLMs/MLLMs. Extensive experiments show significant performance improvements with NTIL.
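To make the two levels concrete, here is a minimal sketch in plain Python. It is not the paper's implementation. The token-level function computes the standard one-dimensional Earth Mover's Distance between a predicted distribution over the digits 0 to 9 and a one-hot target, assuming a unit distance between adjacent digits; the paper describes an extension of EMD whose exact form is not given here. The sequence-level function is a hypothetical stand-in (relative error of the decoded integers) for the "overall discrepancy" term. All function names are illustrative assumptions.

```python
import math


def softmax(logits):
    """Turn raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, target_digit):
    """Standard cross-entropy: looks only at the probability of the target."""
    return -math.log(probs[target_digit])


def emd_to_digit(probs, target_digit):
    """Plain 1-D EMD between `probs` (over digits 0..9) and a one-hot target.

    Assumption: adjacent digits are one unit apart, so the EMD is the sum
    of absolute differences between the two cumulative distributions.
    """
    cdf_gap = 0.0
    total = 0.0
    for digit, p in enumerate(probs):
        cdf_gap += p - (1.0 if digit == target_digit else 0.0)
        total += abs(cdf_gap)
    return total


def digits_to_value(digits):
    """Decode a list of digits, most significant first, into an integer."""
    value = 0
    for d in digits:
        value = value * 10 + d
    return value


def sequence_penalty(pred_digits, target_digits):
    """Hypothetical sequence-level term: relative error of the decoded numbers."""
    pred = digits_to_value(pred_digits)
    target = digits_to_value(target_digits)
    return abs(pred - target) / max(abs(target), 1)


if __name__ == "__main__":
    # Two predictions for target digit 5 with equal probability on the target:
    # one puts the remaining mass on 6 (a near miss), the other on 0 (a far miss).
    near = [0.0] * 10
    near[5], near[6] = 0.2, 0.8
    far = [0.0] * 10
    far[0], far[5] = 0.8, 0.2

    print("cross-entropy:", cross_entropy(near, 5), cross_entropy(far, 5))
    print("EMD:", emd_to_digit(near, 5), emd_to_digit(far, 5))

    # One wrong digit each, but in different positions of the target 123.
    print("sequence penalty:",
          sequence_penalty([1, 2, 4], [1, 2, 3]),
          sequence_penalty([9, 2, 3], [1, 2, 3]))
```

In this sketch, cross-entropy assigns the near miss and the far miss the same loss because it ignores where the remaining probability mass lands, while the EMD term grows with the distance from the target digit. Likewise, the sequence-level stand-in penalizes an error in the leading digit more heavily than one in the last digit, which a per-token loss cannot distinguish.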