CV · Aug 29, 2019

Focus-Enhanced Scene Text Recognition with Deformable Convolutions

arXiv:1908.10998v2 · 15 citations · Has Code
AI Analysis

This work addresses irregular text recognition in computer vision. The contribution is incremental: it builds on existing deep learning methods by incorporating deformable convolutions.

The paper tackles the challenge of recognizing irregular scene text with various shapes and distortions by introducing a recognition network that uses deformable convolutional layers to adjust focus without rectification steps, achieving satisfactory performance on public benchmarks.

Recently, scene text recognition methods based on deep learning have sprung up in the computer vision area. Existing methods achieve strong performance, but recognizing irregular text remains challenging because of its varied shapes and distorted patterns. Consider that when reading words in the real world, we normally do not rectify them in our minds but instead adjust our focus and visual field. Similarly, by using deformable convolutional layers whose geometric structures are adjustable, we present an enhanced recognition network that handles irregular text without rectification steps. A number of experiments have been conducted, and the results on public benchmarks demonstrate the effectiveness of our proposed components and show that our method reaches satisfactory performance. The code will be publicly available at https://github.com/Alpaca07/dtr soon.
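
To make "deformable convolutional layers whose geometric structures are adjustable" concrete, here is a minimal pure-Python sketch of the standard deformable convolution idea: each kernel tap samples the input at its regular grid position plus a per-position offset, using bilinear interpolation. This is not the authors' implementation. The names `bilinear` and `deform_conv2d` are hypothetical, the sketch handles a single channel with a 3x3 kernel only, and the offsets are supplied by hand here, whereas in a real network they would be predicted by a learned layer.

```python
import math

def bilinear(img, y, x):
    """Sample img (a list of rows) at fractional (y, x); outside the image counts as 0."""
    h, w = len(img), len(img[0])
    y0, x0 = math.floor(y), math.floor(x)
    total = 0.0
    for yi in (y0, y0 + 1):
        for xi in (x0, x0 + 1):
            if 0 <= yi < h and 0 <= xi < w:
                # Weight falls off linearly with distance to the integer pixel.
                weight = (1 - abs(y - yi)) * (1 - abs(x - xi))
                total += weight * img[yi][xi]
    return total

def deform_conv2d(img, kernel, offsets):
    """Single-channel 3x3 deformable convolution (stride 1, zero padding 1).

    offsets[i][j] holds 9 (dy, dx) pairs, one per kernel tap, for output
    position (i, j). Assumption: offsets are given, not learned.
    """
    h, w = len(img), len(img[0])
    taps = [(ky, kx) for ky in (-1, 0, 1) for kx in (-1, 0, 1)]
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for k, (ky, kx) in enumerate(taps):
                dy, dx = offsets[i][j][k]
                # Regular grid position plus the per-tap offset.
                acc += kernel[ky + 1][kx + 1] * bilinear(img, i + ky + dy, j + kx + dx)
            out[i][j] = acc
    return out

def constant_offsets(h, w, dy, dx):
    """Helper for the demo: the same (dy, dx) for every tap at every position."""
    return [[[(dy, dx)] * 9 for _ in range(w)] for _ in range(h)]

img = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
identity = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
plain = deform_conv2d(img, identity, constant_offsets(3, 3, 0.0, 0.0))
shifted = deform_conv2d(img, identity, constant_offsets(3, 3, 0.0, 0.5))
```

With all offsets at zero the operation reduces to an ordinary 3x3 convolution, so `plain` reproduces `img`. With a horizontal offset of 0.5, each output in `shifted` is the average of a pixel and its right-hand neighbour, which illustrates how offsets move the sampling locations without any explicit rectification of the input.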

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
