CV | AI | Mar 26, 2023

A Contrastive Learning Scheme with Transformer Innate Patches

arXiv:2303.14806v2 | h-index: 8
Originality: Incremental advance
AI Analysis

This work aims to improve semantic segmentation for aerial image analysis, a task hampered by low-resolution data and class imbalance. The advance is incremental: it adapts existing contrastive learning techniques to a new task.

The paper tackles the challenge of applying contrastive learning to dense prediction tasks such as semantic segmentation. It introduces Contrastive Transformer, a scheme that uses the Transformer's innate patches for supervised patch-level contrastive learning. On the ISPRS Potsdam dataset, the authors report a consistent increase in mean IoU across all classes.

This paper presents Contrastive Transformer, a contrastive learning scheme using the Transformer innate patches. Contrastive Transformer enables existing contrastive learning techniques, often used for image classification, to benefit dense downstream prediction tasks such as semantic segmentation. The scheme performs supervised patch-level contrastive learning, selecting the patches based on the ground truth mask, subsequently used for hard-negative and hard-positive sampling. The scheme applies to all vision-transformer architectures, is easy to implement, and introduces minimal additional memory footprint. Additionally, the scheme removes the need for huge batch sizes, as each patch is treated as an image. We apply and test Contrastive Transformer for the case of aerial image segmentation, known for low-resolution data, large class imbalance, and similar semantic classes. We perform extensive experiments to show the efficacy of the Contrastive Transformer scheme on the ISPRS Potsdam aerial image segmentation dataset. Additionally, we show the generalizability of our scheme by applying it to multiple inherently different Transformer architectures. Ultimately, the results show a consistent increase in mean IoU across all classes.
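To make the idea of supervised patch-level contrastive learning concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each patch receives the majority class of its ground-truth mask pixels and that the loss is a generic supervised contrastive (InfoNCE-style) objective over patch embeddings. The function names `patch_labels` and `supervised_patch_contrastive_loss`, the majority-vote rule, and the temperature value are all illustrative assumptions; the paper's hard-negative and hard-positive sampling is not reproduced here.

```python
import math
from collections import Counter


def patch_labels(mask, patch_size):
    """Assign one class label per non-overlapping patch (row-major order).

    Assumption: a patch takes the majority class of its mask pixels.
    """
    height, width = len(mask), len(mask[0])
    labels = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            pixels = [
                mask[r][c]
                for r in range(top, min(top + patch_size, height))
                for c in range(left, min(left + patch_size, width))
            ]
            labels.append(Counter(pixels).most_common(1)[0][0])
    return labels


def _normalize(vec):
    """Scale a vector to unit L2 norm (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def supervised_patch_contrastive_loss(embeddings, labels, temperature=0.1):
    """Generic supervised contrastive loss, treating each patch as a sample.

    Patches sharing a label are positives; all other patches are negatives.
    Anchors without any positive are skipped.
    """
    z = [_normalize(e) for e in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        others = [j for j in range(n) if j != i]
        positives = [j for j in others if labels[j] == labels[i]]
        if not positives:
            continue
        # Cosine similarity of the anchor to every other patch, scaled by temperature.
        logits = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature for j in others
        }
        # Numerically stable log-sum-exp over all non-anchor patches.
        peak = max(logits.values())
        log_denom = peak + math.log(sum(math.exp(v - peak) for v in logits.values()))
        total += -sum(logits[p] - log_denom for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0
```

Because every patch acts as its own sample, a single image already supplies many positives and negatives, which is consistent with the abstract's remark that the scheme removes the need for huge batch sizes. In practice the embeddings would come from a vision Transformer's patch tokens rather than from hand-written lists.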

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
