L^3U-net: Low-Latency Lightweight U-net Based Image Segmentation Model for Parallel CNN Processors
This work addresses efficient image segmentation for edge computing applications. It appears incremental, since it builds on the existing U-net architecture and adds optimizations for specific hardware.
The researchers tackled real-time image segmentation on low-resource edge devices by proposing L^3U-net, a tiny model that uses data folding to reduce latency, achieving over 90% accuracy on two datasets at 10 fps.
In this research, we propose a tiny image segmentation model, L^3U-net, that runs in real time on low-resource edge devices. We introduce a data folding technique that reduces inference latency by leveraging the parallel convolutional layer processing capability of CNN accelerators. We also deploy the proposed model on one such device, the MAX78000, and the results show that L^3U-net achieves more than 90% accuracy on two different segmentation datasets at 10 fps.
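The abstract does not spell out the data folding operation. One plausible reading, assumed here, is a space-to-depth rearrangement: spatial pixels are moved into the channel dimension, so an input with few channels can occupy many parallel channel processors while the spatial size per channel shrinks. The sketch below illustrates that assumption in plain Python. The function name `fold` and the parameter `factor` are hypothetical and are not taken from the paper.

```python
def fold(image, factor):
    """Rearrange a C x H x W image into (C * factor^2) x (H / factor) x (W / factor).

    Assumed interpretation of "data folding": each factor x factor spatial
    offset (dy, dx) becomes its own group of C channels.
    """
    channels = len(image)
    height = len(image[0])
    width = len(image[0][0])
    if height % factor != 0 or width % factor != 0:
        raise ValueError("height and width must be divisible by factor")

    folded = []
    for dy in range(factor):
        for dx in range(factor):
            for c in range(channels):
                # Keep every factor-th pixel, starting at offset (dy, dx).
                plane = [
                    [image[c][y][x] for x in range(dx, width, factor)]
                    for y in range(dy, height, factor)
                ]
                folded.append(plane)
    return folded


# Toy 3-channel 8x8 image whose pixel values encode (channel, row, column).
image = [[[c * 100 + y * 8 + x for x in range(8)] for y in range(8)] for c in range(3)]
folded = fold(image, factor=4)

# 3 channels of 8x8 become 48 channels of 2x2; no pixel is dropped or duplicated.
print(len(folded), len(folded[0]), len(folded[0][0]))
```

Under this assumption, a layer that processes channels in parallel sees more channels and fewer pixels per channel, which is where a latency reduction would come from. The actual folding factor and channel ordering used by L^3U-net may differ.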