Fine-tuning Vision Transformers for the Prediction of State Variables in Ising Models

arXiv:2109.13925v2
4 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses a domain-specific problem in physics simulations, offering incremental improvements by adapting an existing method to new data.

The authors tackled the problem of predicting state variables in 2D Ising models by applying a Vision Transformer (ViT), which outperformed state-of-the-art Convolutional Neural Networks (CNNs) when using a small number of microstate images across various boundary conditions and temperatures.

Transformers are state-of-the-art deep learning models composed of stacked attention and point-wise, fully connected layers designed for handling sequential data. Transformers are not only ubiquitous throughout Natural Language Processing (NLP); they have also recently inspired a new wave of Computer Vision (CV) research. In this work, a Vision Transformer (ViT) is applied to predict the state variables of 2-dimensional Ising model simulations. Our experiments show that the ViT outperforms state-of-the-art Convolutional Neural Networks (CNNs) when using a small number of microstate images from the Ising model corresponding to various boundary conditions and temperatures. This work opens the possibility of applying ViTs to other simulations and raises interesting research questions about how attention maps can learn the underlying physics governing different phenomena.
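The abstract does not say how the microstate images were generated, which state variables were predicted, or how the images were tokenized, so the following is only an illustrative sketch under stated assumptions: single-spin-flip Metropolis sampling with J = 1 and k_B = 1, magnetization and energy per site as example labels, and non-overlapping square patches as the ViT input tokens. All function names (`metropolis_microstate`, `state_variables`, `to_patches`) are hypothetical and not taken from the paper.

```python
import math
import random


def neighbours_sum(spins, i, j, periodic):
    """Sum of the (up to four) nearest-neighbour spins of site (i, j)."""
    n = len(spins)
    total = 0
    for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        a, b = i + di, j + dj
        if periodic:
            total += spins[a % n][b % n]      # wrap around the lattice edges
        elif 0 <= a < n and 0 <= b < n:
            total += spins[a][b]              # free boundary: skip missing sites
    return total


def metropolis_microstate(n, temperature, sweeps, periodic=True, seed=0):
    """Sample one n x n microstate (assumed sampler: Metropolis, J = 1, k_B = 1)."""
    rng = random.Random(seed)
    spins = [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(n)]
    for _ in range(sweeps * n * n):
        i, j = rng.randrange(n), rng.randrange(n)
        # Energy change caused by flipping the spin at (i, j).
        delta_e = 2 * spins[i][j] * neighbours_sum(spins, i, j, periodic)
        if delta_e <= 0 or rng.random() < math.exp(-delta_e / temperature):
            spins[i][j] = -spins[i][j]
    return spins


def state_variables(spins, periodic=True):
    """Return (magnetization per site, energy per site) as example labels."""
    n = len(spins)
    magnetization = sum(sum(row) for row in spins) / (n * n)
    energy = 0
    for i in range(n):
        for j in range(n):
            # Count each bond once via the "down" and "right" neighbours.
            for a, b in ((i + 1, j), (i, j + 1)):
                if periodic:
                    energy -= spins[i][j] * spins[a % n][b % n]
                elif a < n and b < n:
                    energy -= spins[i][j] * spins[a][b]
    return magnetization, energy / (n * n)


def to_patches(spins, patch):
    """Split the lattice into flattened, non-overlapping patch x patch tokens.

    Assumes the lattice size is divisible by the patch size.
    """
    n = len(spins)
    return [
        [spins[i + di][j + dj] for di in range(patch) for dj in range(patch)]
        for i in range(0, n, patch)
        for j in range(0, n, patch)
    ]


if __name__ == "__main__":
    lattice = metropolis_microstate(n=8, temperature=2.27, sweeps=50)
    print(state_variables(lattice))
    print(len(to_patches(lattice, patch=4)), "patch tokens")
```

In a setup like this, each flattened patch would be linearly embedded and passed to the transformer encoder, and the `periodic` flag is one way to produce the different boundary conditions the abstract mentions.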

