A Lightweight Multi-Module Fusion Approach for Korean Character Recognition
This work addresses robust and efficient OCR for real-time and edge-based systems, but the contribution is incremental because it builds on existing attention and feature-fusion methods.
The paper addresses the underperformance of OCR models in real-world conditions, caused by irregular text layouts, poor image quality, and high computational cost. It introduces SDA-Net, a lightweight architecture for Korean character recognition that reports state-of-the-art accuracy with faster inference.
Optical Character Recognition (OCR) is essential in applications such as document processing, license plate recognition, and intelligent surveillance. However, existing OCR models often underperform in real-world scenarios due to irregular text layouts, poor image quality, character variability, and high computational costs. This paper introduces SDA-Net (Stroke-Sensitive Attention and Dynamic Context Encoding Network), a lightweight and efficient architecture designed for robust single-character recognition. SDA-Net incorporates: (1) a Dual Attention Mechanism to enhance stroke-level and spatial feature extraction; (2) a Dynamic Context Encoding module that adaptively refines semantic information using a learnable gating mechanism; (3) a U-Net-inspired Feature Fusion Strategy for combining low-level and high-level features; and (4) a highly optimized lightweight backbone that reduces memory and computational demands. Experimental results show that SDA-Net achieves state-of-the-art accuracy on challenging OCR benchmarks, with significantly faster inference, making it well-suited for deployment in real-time and edge-based OCR systems.
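To make the "learnable gating mechanism" of the Dynamic Context Encoding module more concrete, the following is a minimal sketch of one common form of gated fusion. The abstract does not specify the exact formulation, so everything here is an assumption for illustration: the function name `gated_context_fusion`, the per-channel sigmoid gate, and the convex blend `g * local + (1 - g) * context` are hypothetical and may differ from what SDA-Net actually uses.

```python
import math
from typing import List

Vector = List[float]


def sigmoid(z: float) -> float:
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def gated_context_fusion(
    local: Vector,
    context: Vector,
    gate_weights: List[Vector],
    gate_bias: Vector,
) -> Vector:
    """Blend local features with context features through a learned gate.

    Assumed formulation (not taken from the paper):
        g   = sigmoid(W @ [local; context] + b)   # one gate value per channel
        out = g * local + (1 - g) * context
    """
    if len(local) != len(context):
        raise ValueError("local and context must have the same length")
    if len(gate_weights) != len(local) or len(gate_bias) != len(local):
        raise ValueError("gate parameters must provide one row per channel")

    joint = local + context  # list concatenation: [local; context]
    fused = []
    for i, (x, c) in enumerate(zip(local, context)):
        # Linear projection of the joint vector for channel i, then squash.
        z = sum(w * v for w, v in zip(gate_weights[i], joint)) + gate_bias[i]
        g = sigmoid(z)
        # Gate near 1 keeps the local feature; near 0 keeps the context.
        fused.append(g * x + (1.0 - g) * c)
    return fused


if __name__ == "__main__":
    local_feat = [1.0, 3.0]
    context_feat = [3.0, 1.0]
    zero_weights = [[0.0] * 4 for _ in range(2)]
    # With zero weights and zero bias the gate is 0.5 on every channel,
    # so the result is the element-wise mean of the two inputs.
    print(gated_context_fusion(local_feat, context_feat, zero_weights, [0.0, 0.0]))
```

In a trained network the gate weights and bias would be learned, so the model can decide per channel how much contextual information to admit. A strongly positive bias drives the gate toward the local features, and a strongly negative bias drives it toward the context.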