CV · Mar 11, 2019

Manifold Mixup improves text recognition with CTC loss

arXiv:1903.04246v1 · 3 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses data scarcity in text recognition, but it is incremental as it adapts an existing augmentation method to a specific cost function.

The paper tackled the problem of limited annotated data in handwritten text recognition by applying Manifold Mixup with CTC loss, resulting in improved recognition results across multiple languages and datasets.

Modern handwritten text recognition techniques employ deep recurrent neural networks. These techniques are especially effective when a large amount of annotated data is available for parameter estimation. When data is scarce, data augmentation can be used to improve system performance. Manifold Mixup is a recent data augmentation method that blends two images, or the feature maps corresponding to these images, and fuses the targets accordingly. We propose to apply Manifold Mixup to text recognition while adapting it to work with a Connectionist Temporal Classification (CTC) cost. We show that Manifold Mixup improves text recognition results on various languages and datasets.
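To make the idea in the abstract concrete, here is a minimal sketch in plain Python. The abstract does not say how the targets are fused under CTC, so the sketch rests on an assumption: the two label sequences are kept separate and their CTC losses are interpolated with the same coefficient used to mix the feature maps. All function names (`mix_features`, `ctc_nll`, `mixup_ctc_loss`) are hypothetical, the "network" is reduced to a softmax over the mixed features, and both feature sequences are assumed to have the same shape.

```python
import math
import random


def mix_features(h_a, h_b, lam):
    """Convex combination of two equally shaped T x C feature sequences."""
    return [[lam * a + (1.0 - lam) * b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(h_a, h_b)]


def log_softmax(row):
    """Numerically stable log-softmax over one time step."""
    m = max(row)
    lse = m + math.log(sum(math.exp(v - m) for v in row))
    return [v - lse for v in row]


def logsumexp(values):
    """log(sum(exp(v))) that tolerates -inf entries."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def ctc_nll(log_probs, target, blank=0):
    """CTC negative log-likelihood via the forward algorithm.

    log_probs: T x C list of per-frame log-probabilities.
    target: list of label indices (no blanks).
    """
    # Extended target: blanks around and between the labels.
    ext = [blank]
    for label in target:
        ext += [label, blank]
    size = len(ext)

    alpha = [-math.inf] * size
    alpha[0] = log_probs[0][ext[0]]
    if size > 1:
        alpha[1] = log_probs[0][ext[1]]

    for t in range(1, len(log_probs)):
        new_alpha = [-math.inf] * size
        for s in range(size):
            candidates = [alpha[s]]
            if s >= 1:
                candidates.append(alpha[s - 1])
            # Skipping a blank is allowed only between different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                candidates.append(alpha[s - 2])
            new_alpha[s] = logsumexp(candidates) + log_probs[t][ext[s]]
        alpha = new_alpha

    # Valid alignments end on the last label or the trailing blank.
    tail = alpha[-2:] if size > 1 else alpha[-1:]
    return -logsumexp(tail)


def mixup_ctc_loss(mixed_logits, target_a, target_b, lam):
    """ASSUMPTION: fuse targets by interpolating the two CTC losses."""
    log_probs = [log_softmax(row) for row in mixed_logits]
    return (lam * ctc_nll(log_probs, target_a)
            + (1.0 - lam) * ctc_nll(log_probs, target_b))


# Toy usage: 6 frames, 4 classes (index 0 is the CTC blank).
random.seed(0)
lam = random.betavariate(2.0, 2.0)  # Mixup draws lambda from Beta(alpha, alpha)
feats_a = [[random.gauss(0, 1) for _ in range(4)] for _ in range(6)]
feats_b = [[random.gauss(0, 1) for _ in range(4)] for _ in range(6)]
mixed = mix_features(feats_a, feats_b, lam)
print("lambda:", lam)
print("mixed CTC loss:", mixup_ctc_loss(mixed, [1, 2], [3], lam))
```

In a real recognizer the mixing would happen at a hidden layer of the network, and a framework CTC implementation would replace the hand-written forward pass; the pure-Python version is only meant to keep the sketch self-contained.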

Code Implementations: 1 repo