HoughToRadon Transform: New Neural Network Layer for Features Improvement in Projection Space
This is an incremental improvement for document segmentation tasks, offering a speed-quality trade-off.
The paper addresses the slow speed of neural networks that use the Hough Transform for semantic image segmentation by introducing a HoughToRadon Transform layer, which reduces processing time and achieves 97.7% accuracy on the MIDV-500 dataset, outperforming prior methods.
In this paper, we introduce the HoughToRadon Transform layer, a novel layer designed to improve the speed of neural networks that incorporate the Hough Transform to solve semantic image segmentation problems. When it is placed after a Hough Transform layer, the "inner" convolutions receive modified feature maps with new beneficial properties, such as a smaller processed image area and a parameter space that is linear in angle and shift. These properties are not present in the Hough Transform alone. Furthermore, the HoughToRadon Transform layer lets us adjust the size of the intermediate feature maps through two new parameters, and thus balance the speed and quality of the resulting neural network. Our experiments on the open MIDV-500 dataset show that this new approach saves time in document segmentation tasks and achieves state-of-the-art accuracy of 97.7%, outperforming HoughEncoder, which has higher computational complexity.
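To make the idea of a parameter space that is linear in angle and shift more concrete, the following is a minimal sketch of resampling a Hough accumulator onto a uniform (angle, shift) grid. It is not the layer described in the paper. It assumes that the accumulator stores lines in slope-intercept form, y = t * x + s, with slopes t uniform in [-1, 1], and it uses nearest-neighbour sampling. The function name `hough_to_radon` and the arguments `n_angles` and `n_shifts` are hypothetical; they only stand in for the two size parameters mentioned above.

```python
import math


def hough_to_radon(hough, s_min, s_max, n_angles, n_shifts):
    """Resample a slope/intercept accumulator onto a uniform (angle, shift) grid.

    Assumption: hough[i][j] is the accumulator value for the line
    y = t_i * x + s_j, with slopes t_i uniform in [-1, 1] and
    intercepts s_j uniform in [s_min, s_max].
    Requires n_angles >= 2, n_shifts >= 2 and s_max > s_min.
    """
    n_t, n_s = len(hough), len(hough[0])
    # |rho| = |s| * cos(alpha) <= |s|, so this range covers every input line.
    rho_max = max(abs(s_min), abs(s_max))
    out = [[0.0] * n_shifts for _ in range(n_angles)]
    for a in range(n_angles):
        # Angles are uniform in [-pi/4, pi/4], matching slopes in [-1, 1].
        alpha = -math.pi / 4 + (math.pi / 2) * a / (n_angles - 1)
        t = math.tan(alpha)
        i = round((t + 1) / 2 * (n_t - 1))
        i = min(max(i, 0), n_t - 1)  # guard against floating-point drift
        for r in range(n_shifts):
            # Shifts (distances from the origin to the line) are uniform.
            rho = -rho_max + 2 * rho_max * r / (n_shifts - 1)
            s = rho / math.cos(alpha)  # intercept of the same line
            j = round((s - s_min) / (s_max - s_min) * (n_s - 1))
            if 0 <= j < n_s:  # cells with no source sample stay at zero
                out[a][r] = hough[i][j]
    return out


# Toy accumulator: 5 slopes x 5 intercepts, one peak for the line y = 0.
hough = [[0.0] * 5 for _ in range(5)]
hough[2][2] = 1.0

# The two size arguments shrink the map from 5 x 5 to 3 x 3.
radon = hough_to_radon(hough, s_min=-2.0, s_max=2.0, n_angles=3, n_shifts=3)
```

In this toy case the single peak for the horizontal line through the origin lands in the central cell of the smaller map. The output size is set only by `n_angles` and `n_shifts`, which is the kind of speed-quality knob the abstract refers to.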