CV AI LG · Apr 20, 2021

VT-ADL: A Vision Transformer Network for Image Anomaly Detection and Localization

Pankaj Mishra, Riccardo Verk, Daniele Fornasier, Claudio Piciarelli, Gian Luca Foresti

arXiv:2104.10036v133.1523 citations · Has Code

Originality: Incremental advance

AI Analysis

This work addresses anomaly detection in industrial images. It appears incremental because it builds on existing transformer and reconstruction-based approaches.

The authors address image anomaly detection and localization by proposing VT-ADL, a vision transformer network that combines a reconstruction-based approach with patch embedding. They compare its results against state-of-the-art algorithms on public datasets such as MNIST and MVTec.

We present a transformer-based image anomaly detection and localization network. Our proposed model is a combination of a reconstruction-based approach and patch embedding. The use of transformer networks helps to preserve the spatial information of the embedded patches, which are later processed by a Gaussian mixture density network to localize the anomalous areas. In addition, we also publish BTAD, a real-world industrial anomaly dataset. Our results are compared with other state-of-the-art algorithms using publicly available datasets like MNIST and MVTec.
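The patch-level localization idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a diagonal-covariance Gaussian mixture with fixed, hand-picked parameters, and it scores raw patch pixels as a stand-in for the transformer-encoded patch features. In VT-ADL the mixture would be produced by the Gaussian mixture density network. The names `split_into_patches`, `gmm_nll`, and `anomaly_map` are hypothetical.

```python
import math


def split_into_patches(image, patch):
    """Split an H x W image (list of lists) into flattened, non-overlapping
    patch x patch blocks, in row-major order."""
    h, w = len(image), len(image[0])
    if h % patch or w % patch:
        raise ValueError("image size must be a multiple of the patch size")
    patches = []
    for top in range(0, h, patch):
        for left in range(0, w, patch):
            patches.append([image[r][c]
                            for r in range(top, top + patch)
                            for c in range(left, left + patch)])
    return patches


def gmm_nll(x, weights, means, variances):
    """Negative log-likelihood of vector x under a diagonal-covariance
    Gaussian mixture (assumed parameters, not learned here)."""
    log_terms = []
    for pi_k, mu, var in zip(weights, means, variances):
        log_p = math.log(pi_k)
        for xi, m, v in zip(x, mu, var):
            log_p -= 0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        log_terms.append(log_p)
    # Log-sum-exp over components for numerical stability.
    mx = max(log_terms)
    return -(mx + math.log(sum(math.exp(t - mx) for t in log_terms)))


def anomaly_map(image, patch, weights, means, variances):
    """Return a grid of per-patch scores; higher means less likely
    under the mixture, i.e. more anomalous."""
    scores = [gmm_nll(p, weights, means, variances)
              for p in split_into_patches(image, patch)]
    cols = len(image[0]) // patch
    return [scores[i:i + cols] for i in range(0, len(scores), cols)]
```

Calling `anomaly_map` on an image whose patches mostly match the mixture means yields low scores everywhere except on the deviating patches, and thresholding that grid gives a coarse localization mask at patch resolution.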
