CV | AI | LG | Jan 13, 2022

Weakly Supervised Scene Text Detection using Deep Reinforcement Learning

arXiv:2201.04866v1
AI Analysis

This paper addresses annotation cost, a practical concern for scene text detection researchers, but the contribution is incremental because it builds on an existing supervised RL approach.

The paper tackles the expensive data annotation required for scene text detection by proposing a weakly supervised method based on reinforcement learning, in which the reward is estimated by a neural network. It finds that semi-supervised training on labeled synthetic data and unannotated real-world data yields the best results.

The challenging field of scene text detection requires complex data annotation, which is time-consuming and expensive. Techniques such as weak supervision can reduce the amount of data needed. In this paper we propose a weak supervision method for scene text detection that makes use of reinforcement learning (RL). The reward received by the RL agent is estimated by a neural network instead of being inferred from ground-truth labels. First, we enhance an existing supervised RL approach to text detection with several training optimizations, allowing us to close the performance gap to regression-based algorithms. We then use our proposed system for weakly and semi-supervised training on real-world data. Our results show that training in a weakly supervised setting is feasible. However, we find that using our model in a semi-supervised setting, e.g. when combining labeled synthetic data with unannotated real-world data, produces the best results.
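
The difference between the two reward sources described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the box format, the use of an IoU difference as the step reward, and the names `iou`, `step_reward`, and `estimator` are all assumptions made for illustration. In the weakly supervised case the `estimator` callable stands in for the reward-estimating neural network, whose architecture and training are not described here.

```python
from typing import Callable, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed format


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def step_reward(
    prev_box: Box,
    new_box: Box,
    gt_box: Optional[Box] = None,
    estimator: Optional[Callable[[Box], float]] = None,
) -> float:
    """Reward for moving the agent's box from prev_box to new_box.

    Assumption: the reward is the change in a box-quality score.
    - Supervised: the score is the IoU with a ground-truth box.
    - Weakly supervised: the score comes from `estimator`, a hypothetical
      stand-in for a learned network that needs no ground-truth box.
    """
    if gt_box is not None:
        def score(box: Box) -> float:
            return iou(box, gt_box)
    elif estimator is not None:
        score = estimator
    else:
        raise ValueError("either gt_box or estimator is required")
    return score(new_box) - score(prev_box)


# Supervised example: shrinking an oversized box onto the ground truth.
gt = (0.0, 0.0, 10.0, 10.0)
supervised = step_reward((0.0, 0.0, 20.0, 10.0), (0.0, 0.0, 10.0, 10.0), gt_box=gt)


# Weakly supervised example with a toy estimator that prefers boxes of width 10.
def toy_estimator(box: Box) -> float:
    return 1.0 / (1.0 + abs((box[2] - box[0]) - 10.0))


weak = step_reward((0.0, 0.0, 20.0, 10.0), (0.0, 0.0, 10.0, 10.0), estimator=toy_estimator)
```

In this sketch the agent's training loop is identical in both settings; only the source of the score changes, which is what makes unannotated real-world images usable for training.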

Code Implementations: 1 repo