ScreenSeg: On-Device Screenshot Layout Analysis
ScreenSeg provides an on-device solution for screenshot analysis, enabling applications such as smart editing and content extraction. The contribution is incremental, however: it builds on existing layout analysis methods and adds optimizations for mobile deployment.
The paper tackles hierarchical layout analysis of screenshots on resource-constrained devices, reporting an average precision of about 0.95 at a latency of around 200 ms on a Samsung Galaxy S10 for 1080p screenshots.
We propose a novel end-to-end solution that performs hierarchical layout analysis of screenshots and document images on resource-constrained devices such as mobile phones. Our approach segments entities such as Grid, Image, Text and Icon blocks occurring in a screenshot. We provide an option for smart editing by automatically highlighting these entities for saving or sharing. Further, this multi-level layout analysis of screenshots has many use cases, including content extraction, keyword-based image search, and style transfer. We address the limitations of known baseline approaches, support a wide variety of semantically complex screenshots, and develop an approach that is highly optimized for on-device deployment. In addition, we present a novel weighted NMS technique for filtering object proposals. We achieve an average precision of about 0.95 with a latency of around 200 ms on a Samsung Galaxy S10 for a screenshot of 1080p resolution. The solution pipeline is already commercialized in Samsung device applications, namely Samsung Capture, Smart Crop, My Filter in the Camera application, and Bixby Touch.
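The abstract does not spell out how the weighted NMS works, so the sketch below shows only one common, generic variant of the idea and should not be read as the authors' method. It assumes that overlapping proposals are grouped by IoU and that each group is merged into a single box whose coordinates are the confidence-weighted average of its members. The names `iou`, `weighted_nms`, and `iou_threshold`, and the default threshold of 0.5, are illustrative assumptions.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed layout


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def weighted_nms(
    boxes: List[Box], scores: List[float], iou_threshold: float = 0.5
) -> List[Tuple[Box, float]]:
    """Generic weighted NMS (illustrative, not the paper's algorithm).

    Proposals are visited in descending score order. Each unused proposal
    collects the unused proposals that overlap it by at least
    `iou_threshold`, and the group is replaced by one box whose coordinates
    are the score-weighted mean of the group. The merged box keeps the
    highest score in the group.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    used = [False] * len(boxes)
    results: List[Tuple[Box, float]] = []
    for i in order:
        if used[i]:
            continue
        # The seed box always belongs to its own cluster.
        cluster = [
            j for j in order
            if not used[j] and (j == i or iou(boxes[i], boxes[j]) >= iou_threshold)
        ]
        for j in cluster:
            used[j] = True
        total = sum(scores[j] for j in cluster)
        if total > 0:
            merged = tuple(
                sum(scores[j] * boxes[j][k] for j in cluster) / total
                for k in range(4)
            )
        else:
            # Fall back to the seed box when all weights are zero.
            merged = boxes[i]
        results.append((merged, scores[i]))
    return results


if __name__ == "__main__":
    demo_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
    demo_scores = [0.9, 0.6, 0.8]
    for box, score in weighted_nms(demo_boxes, demo_scores):
        print(box, score)
```

Compared with standard NMS, which discards every overlapping proposal except the top-scoring one, this variant lets lower-scoring proposals still influence the final box position, which can matter when block boundaries such as Text or Image regions are localized slightly differently by each proposal.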