RITnet: Real-time Semantic Segmentation of the Eye for Gaze Tracking
This work addresses the need for robust, real-time eye segmentation to improve gaze estimation for interactive computing applications, representing a domain-specific incremental advancement.
The paper introduces RITnet, a deep neural network that combines U-Net and DenseNet for real-time semantic segmentation of the eye in gaze tracking. It achieves 95.3% accuracy on the OpenEDS challenge and tracks at over 300 Hz on a GTX 1080 Ti.
Accurate eye segmentation can improve eye-gaze estimation and support interactive computing based on visual attention; however, existing eye segmentation methods suffer from person-dependent accuracy, a lack of robustness, and an inability to run in real time. Here, we present RITnet, a deep neural network that combines U-Net and DenseNet. RITnet is under 1 MB and achieves 95.3% accuracy on the 2019 OpenEDS Semantic Segmentation challenge. Using a GeForce GTX 1080 Ti, RITnet tracks at > 300 Hz, enabling real-time gaze tracking applications. Pre-trained models and source code are available at https://bitbucket.org/eye-ush/ritnet/.
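To put the two headline figures in concrete terms, the short sketch below converts the reported frame rate into a per-frame time budget and the reported model size into an upper bound on parameter count. It is a back-of-the-envelope illustration only: the helper names `frame_budget_ms` and `max_parameters` are hypothetical, and the calculation assumes weights stored as 32-bit floats and 1 MB taken as 10^6 bytes, neither of which is stated in the abstract.

```python
# Back-of-the-envelope budgets implied by the abstract's two headline figures.
# Assumptions (not from the paper): weights are stored as 32-bit floats,
# and 1 MB is taken as 10**6 bytes.

BYTES_PER_FLOAT32 = 4


def frame_budget_ms(rate_hz: float) -> float:
    """Maximum per-frame processing time (in ms) that sustains rate_hz."""
    if rate_hz <= 0:
        raise ValueError("rate_hz must be positive")
    return 1000.0 / rate_hz


def max_parameters(model_bytes: int, bytes_per_param: int = BYTES_PER_FLOAT32) -> int:
    """Upper bound on the number of parameters that fit in model_bytes."""
    if bytes_per_param <= 0:
        raise ValueError("bytes_per_param must be positive")
    return model_bytes // bytes_per_param


if __name__ == "__main__":
    # Tracking at more than 300 Hz means each frame takes less than this.
    print(f"{frame_budget_ms(300):.2f} ms per frame at 300 Hz")
    # A model under 1 MB holds fewer float32 parameters than this.
    print(f"{max_parameters(10**6):,} float32 parameters fit in 1 MB")
```

Under these assumptions, tracking above 300 Hz leaves less than about 3.33 ms per frame, and a model under 1 MB holds fewer than 250,000 float32 parameters. The actual parameter count and storage format should be taken from the paper or the released code.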