CV · Apr 25, 2022

OCFormer: One-Class Transformer Network for Image Classification

arXiv:2204.11449v1 · 2 citations · h-index: 29
Originality: Incremental advance
AI Analysis

This work addresses one-class classification for image data. The contribution is incremental: it adapts Vision Transformers to an existing task and reports specific gains.

The authors tackle one-class image classification with OCFormer, a Vision Transformer framework that uses zero-centered Gaussian noise as a pseudo-negative class and trains with an optimal loss function. They report significant improvements over CNN-based methods on the CIFAR-10, CIFAR-100, Fashion-MNIST, and CelebA eyeglasses datasets.

We propose a novel deep learning framework based on Vision Transformers (ViT) for one-class classification. The core idea is to use zero-centered Gaussian noise as a pseudo-negative class for latent space representation and then train the network using the optimal loss function. Prior work has devoted considerable effort to learning a good representation with a variety of loss functions that ensure both discriminative and compact properties. The proposed one-class Vision Transformer (OCFormer) is evaluated extensively on the CIFAR-10, CIFAR-100, Fashion-MNIST and CelebA eyeglasses datasets. Our method shows significant improvements over competing CNN-based one-class classifier approaches.
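The pseudo-negative idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the latent vectors are synthetic stand-ins for ViT embeddings, the abstract does not name the "optimal loss function" so binary cross-entropy is assumed here, and every function name (`make_pseudo_negatives`, `train_head`, `score`, `demo`) is hypothetical.

```python
import math
import random


def make_pseudo_negatives(n, dim, sigma=1.0, rng=None):
    """Draw n latent vectors from a zero-centered Gaussian N(0, sigma^2 I)."""
    rng = rng or random.Random(0)
    return [[rng.gauss(0.0, sigma) for _ in range(dim)] for _ in range(n)]


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def score(x, w, b):
    """Probability that latent vector x belongs to the target class."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def train_head(positives, negatives, epochs=200, lr=0.1):
    """Fit a logistic head by full-batch gradient descent.

    Binary cross-entropy is an assumption; the abstract does not
    say which loss the paper uses.
    """
    dim = len(positives[0])
    w, b = [0.0] * dim, 0.0
    data = [(x, 1.0) for x in positives] + [(x, 0.0) for x in negatives]
    n = len(data)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in data:
            err = score(x, w, b) - y  # gradient of BCE w.r.t. the logit
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def demo(dim=8, n=100, seed=0):
    rng = random.Random(seed)
    # Synthetic target-class latents clustered away from the origin
    # (a stand-in for real ViT embeddings of the single known class).
    positives = [[rng.gauss(3.0, 0.5) for _ in range(dim)] for _ in range(n)]
    # Zero-centered Gaussian noise plays the role of the negative class.
    negatives = make_pseudo_negatives(n, dim, sigma=1.0, rng=rng)
    w, b = train_head(positives, negatives)
    return score([3.0] * dim, w, b), score([0.0] * dim, w, b)


if __name__ == "__main__":
    in_class, at_origin = demo()
    print(f"in-class score: {in_class:.3f}, origin score: {at_origin:.3f}")
```

In this toy setup the head only has to separate a cluster from noise around the origin, so a linear classifier is enough. With real embeddings the separation depends on how the encoder is trained, which the sketch does not model.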

