Predicting Daily Activities From Egocentric Images Using Deep Learning
This work addresses activity recognition for individuals wearing egocentric cameras. The contribution is incremental: it builds on existing deep learning techniques and adds a new fusion method.
The authors predict daily activities from egocentric images by combining a deep learning classifier with contextual information such as time and day of week, achieving an overall accuracy of 83.07% across 19 activity classes.
We present a method to analyze images taken from a passive egocentric wearable camera, along with contextual information such as time and day of week, to learn and predict the everyday activities of an individual. We collected a dataset of 40,103 egocentric images over a 6-month period with 19 activity classes and demonstrate the benefit of state-of-the-art deep learning techniques for learning and predicting daily activities. Classification is conducted using a Convolutional Neural Network (CNN) with a classification method we introduce called a late fusion ensemble. This late fusion ensemble incorporates relevant contextual information and increases our classification accuracy. Our technique achieves an overall accuracy of 83.07% in predicting a person's activity across the 19 activity classes. We also demonstrate some promising results from two additional users by fine-tuning the classifier with one day of training data.
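The late fusion idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the CNN has already produced a 19-way probability vector for an image, and it assumes one plausible encoding of the context (a cyclic sine/cosine encoding of the hour and a one-hot day of week). The names `encode_context` and `fuse_features` are hypothetical. The fused vector would then be passed to a second-stage classifier, which the abstract does not specify.

```python
import math

NUM_CLASSES = 19  # number of activity classes reported in the abstract


def encode_context(hour, day_of_week):
    """Encode context as numeric features (an assumed encoding, not the paper's).

    hour: 0-23, mapped onto a circle so that 23:00 and 00:00 are neighbours.
    day_of_week: 0-6, one-hot encoded.
    """
    angle = 2.0 * math.pi * (hour % 24) / 24.0
    time_features = [math.sin(angle), math.cos(angle)]
    day_features = [1.0 if d == day_of_week else 0.0 for d in range(7)]
    return time_features + day_features


def fuse_features(cnn_probs, hour, day_of_week):
    """Concatenate CNN class probabilities with context features (late fusion).

    cnn_probs: per-class probabilities from an already-trained CNN.
    Returns a flat feature vector for a second-stage classifier.
    """
    if len(cnn_probs) != NUM_CLASSES:
        raise ValueError("expected %d class probabilities" % NUM_CLASSES)
    return list(cnn_probs) + encode_context(hour, day_of_week)


# Example: a uniform CNN output for an image taken at 06:00 on day index 2.
fused = fuse_features([1.0 / NUM_CLASSES] * NUM_CLASSES, hour=6, day_of_week=2)
```

The fusion is "late" in the sense that context is combined with the CNN's output rather than with the raw pixels, so the image model can be trained or fine-tuned independently of the contextual features.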