LG · DIS-NN · Aug 28, 2024

Autoregressive model path dependence near Ising criticality

arXiv:2408.15715v1 · 3 citations · h-index: 47
Originality: Incremental advance
AI Analysis

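This paper addresses a practical issue for researchers who use autoregressive models in physics: which 1D ordering to impose on data that is not intrinsically 1D. The advance is incremental, since it studies one specific case (the 2D Ising model near criticality) without broad generalization.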

The paper tackles the problem of choosing an optimal autoregressive sequence ordering for language models applied to non-1D physical systems, specifically when reconstructing critical correlations in the 2D Ising model. It finds that models trained along paths with long 1D segments train more efficiently than models trained along space-filling curves.

Autoregressive models are a class of generative models that probabilistically predict the next output of a sequence based on previous inputs. The autoregressive sequence is by definition one-dimensional (1D), which is natural for language tasks and hence an important component of modern architectures like recurrent neural networks (RNNs) and transformers. However, when language models are used to predict outputs on physical systems that are not intrinsically 1D, the question arises of which choice of autoregressive sequence -- if any -- is optimal. In this paper, we study the reconstruction of critical correlations in the two-dimensional (2D) Ising model, using RNNs and transformers trained on binary spin data obtained near the thermal phase transition. We compare the training performance for a number of different 1D autoregressive sequences imposed on finite-size 2D lattices. We find that paths with long 1D segments train the autoregressive models more efficiently than space-filling curves that better preserve the 2D locality. Our results illustrate the potential importance of choosing the optimal autoregressive sequence ordering when training modern language models for tasks in physics.
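To make the notion of a "1D autoregressive sequence imposed on a 2D lattice" concrete, here is a minimal sketch of three site orderings: a row-by-row raster, a snake (boustrophedon) path, and a Hilbert space-filling curve. It is an illustration only and is not taken from the paper. The function names (`raster_path`, `snake_path`, `hilbert_path`, `flatten`, `longest_straight_run`) are hypothetical, the Hilbert construction is the standard index-to-coordinate algorithm and assumes the lattice side is a power of two, and the paper may well use other paths or conventions.

```python
import numpy as np


def raster_path(L):
    """Row-by-row ordering; jumps back to column 0 at the end of each row."""
    return [(r, c) for r in range(L) for c in range(L)]


def snake_path(L):
    """Boustrophedon ordering: every second row is reversed, so each
    consecutive pair of sites is a nearest-neighbour pair."""
    path = []
    for r in range(L):
        cols = range(L) if r % 2 == 0 else range(L - 1, -1, -1)
        path.extend((r, c) for c in cols)
    return path


def _hilbert_d2xy(n, d):
    """Map index d in [0, n*n) to (x, y) on an n x n Hilbert curve.
    Standard iterative construction; n must be a power of two."""
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:  # rotate/reflect the current sub-square
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def hilbert_path(L):
    """Hilbert-curve ordering (assumes L is a power of two)."""
    if L < 1 or L & (L - 1):
        raise ValueError("L must be a power of two")
    return [_hilbert_d2xy(L, d)[::-1] for d in range(L * L)]


def flatten(config, path):
    """Read a 2D spin configuration out as a 1D sequence along `path`."""
    rows, cols = zip(*path)
    return np.asarray(config)[list(rows), list(cols)]


def longest_straight_run(path):
    """Largest number of consecutive steps taken in the same direction."""
    best = run = 0
    prev = None
    for (r0, c0), (r1, c1) in zip(path, path[1:]):
        step = (r1 - r0, c1 - c0)
        run = run + 1 if step == prev else 1
        best = max(best, run)
        prev = step
    return best


if __name__ == "__main__":
    L = 8
    rng = np.random.default_rng(0)
    spins = rng.choice([-1, 1], size=(L, L))  # random stand-in, not Ising data
    for name, make in [("raster", raster_path),
                       ("snake", snake_path),
                       ("hilbert", hilbert_path)]:
        path = make(L)
        seq = flatten(spins, path)
        print(name, len(seq), longest_straight_run(path))
```

Each flattened sequence could then be fed to an RNN or transformer as an ordinary token sequence. The `longest_straight_run` helper is one crude way to quantify the trade-off that the abstract describes: the snake path keeps long 1D segments, while the Hilbert curve turns frequently in exchange for better 2D locality.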

