Unsupervised Understanding of Location and Illumination Changes in Egocentric Videos
This work addresses the problem of video understanding for wearable camera users, but it is incremental as it builds on existing methods for contextual analysis.
The paper tackled the challenge of automatically understanding egocentric videos under changing light conditions and in unrestricted locations, proposing an unsupervised strategy based on global features and manifold learning. Results demonstrated that non-linear manifold methods can capture contextual patterns without requiring large computational resources, and the strategy was applied as a switching mechanism to improve hand detection in egocentric videos.
Wearable cameras stand out as one of the most promising devices for the upcoming years, and as a consequence, the demand for computer algorithms that automatically understand the videos recorded with them is increasing quickly. Automatically understanding these videos is not an easy task: the mobile nature of the cameras poses important challenges, such as changing light conditions and unrestricted recording locations. This paper proposes an unsupervised strategy based on global features and manifold learning to endow wearable cameras with contextual information regarding the light conditions and the location captured. Results show that non-linear manifold methods can capture contextual patterns from global features without requiring large computational resources. As an application case, the proposed strategy is used as a switching mechanism to improve hand detection in egocentric videos.
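The switching idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes synthetic frames given as flat lists of gray values, uses a coarse intensity histogram as the global feature, and substitutes a plain k-means clustering for the manifold learning step to keep the example dependency-free. The function names and the detector labels are hypothetical.

```python
import random


def global_histogram(frame, bins=4):
    """Normalized intensity histogram of a frame (flat list of 0-255 values)."""
    counts = [0] * bins
    for value in frame:
        counts[min(value * bins // 256, bins - 1)] += 1
    return [c / len(frame) for c in counts]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(feature, centers):
    """Index of the center closest to the given feature vector."""
    return min(range(len(centers)), key=lambda j: sq_dist(feature, centers[j]))


def kmeans(points, k=2, iters=20):
    """Plain k-means with a deterministic farthest-point initialization.

    Assumption: this stands in for the manifold method of the paper.
    """
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        )
    for _ in range(iters):
        labels = [nearest(p, centers) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers, [nearest(p, centers) for p in points]


def synthetic_frame(rng, low, high, n_pixels=256):
    """Hypothetical frame: random gray values in [low, high]."""
    return [rng.randint(low, high) for _ in range(n_pixels)]


rng = random.Random(0)
# Ten dark frames followed by ten bright frames (no labels are used).
frames = [synthetic_frame(rng, 0, 80) for _ in range(10)]
frames += [synthetic_frame(rng, 170, 255) for _ in range(10)]

features = [global_histogram(f) for f in frames]
centers, labels = kmeans(features, k=2)

# Switching mechanism: each discovered context selects its own detector.
# The detector names are placeholders, not models from the paper.
detectors = {
    labels[0]: "hand_detector_dark",
    labels[10]: "hand_detector_bright",
}


def select_detector(frame):
    """Pick the detector associated with the frame's nearest context."""
    return detectors[nearest(global_histogram(frame), centers)]
```

Calling `select_detector` on a new frame routes it to the detector tied to its nearest context. A non-linear manifold method could replace `kmeans` here without changing the switching logic, since the switch only needs a mapping from a global feature to a context index.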