CV · LG · Nov 28, 2023

STR-Cert: Robustness Certification for Deep Text Recognition on Deep Learning Pipelines and Vision Transformers

arXiv:2401.05338v1 · 1 citation · h-index: 5
Originality: Incremental advance
AI Analysis

This work addresses robustness certification for scene text recognition in safety-critical applications, an incremental advance over existing methods, which are limited to simpler architectures.

The paper tackles robustness certification for scene text recognition (STR) models, including standard STR pipelines and Vision Transformers. It proposes STR-Cert, which extends the DeepPoly framework with novel polyhedral bounds and algorithms, and demonstrates efficiency and scalability on six datasets.

Robustness certification, which aims to formally certify the predictions of neural networks against adversarial inputs, has become an important tool for safety-critical applications. Despite considerable progress, existing certification methods are limited to elementary architectures, such as convolutional networks, recurrent networks and, more recently, Transformers, on benchmark datasets such as MNIST. In this paper, we focus on the robustness certification of scene text recognition (STR), which is a complex and extensively deployed image-based sequence prediction problem. We tackle three types of STR model architectures, including the standard STR pipelines and the Vision Transformer. We propose STR-Cert, the first certification method for STR models, by significantly extending the DeepPoly polyhedral verification framework via deriving novel polyhedral bounds and algorithms for key STR model components. Finally, we certify and compare STR models on six datasets, demonstrating the efficiency and scalability of robustness certification, particularly for the Vision Transformer.

