ClickBAIT: Click-based Accelerated Incremental Training of Convolutional Neural Networks
The paper tackles the problem of training convolutional neural networks in real time on live video streams with minimal human input. It introduces a system that uses optical flow-based object tracking to increase the effectiveness of human actions by about 8 times.
This addresses the need for efficient human-in-the-loop training in applications such as surveillance or robotics, though the contribution is incremental in that it builds on existing CNN and tracking methods.
Today's general-purpose deep convolutional neural networks (CNNs) for image classification and object detection are trained offline on large static datasets. Some applications, however, will require training in real time on live video streams with a human in the loop. We refer to this class of problems as Time-ordered Online Training (ToOT). These problems require consideration of not only the quantity of incoming training data, but also the human effort required to tag and use it. In this paper, we define training benefit as a metric that measures how effectively a sequence uses each user interaction. We demonstrate and evaluate a system tailored to performing ToOT in the field, capable of training an image classifier on a live video stream with minimal input from a human operator. We show that by exploiting the time-ordered nature of the video stream through optical flow-based object tracking, we can increase the effectiveness of human actions by about 8 times.
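To make the idea of stretching one click across many frames concrete, the following is a minimal sketch and not the paper's implementation. It assumes the tracker's output has already been reduced to one displacement per frame transition (`None` when tracking is lost). The names `Click`, `propagate_click`, and `samples_per_click` are hypothetical, and the ratio computed here is only an illustrative stand-in, not the paper's definition of training benefit.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Flow = Optional[Tuple[float, float]]  # (dx, dy) between frames, None if tracking is lost


@dataclass
class Click:
    """One user interaction: a labelled click on a specific frame (hypothetical type)."""
    frame_index: int
    position: Point
    label: str


def propagate_click(click: Click, flows: List[Flow]) -> List[Tuple[int, Point, str]]:
    """Carry a click's label forward along the tracked motion.

    flows[i] is the assumed displacement of the clicked object between
    frame i and frame i + 1. Propagation stops at the first lost track.
    """
    x, y = click.position
    samples = [(click.frame_index, (x, y), click.label)]
    for i in range(click.frame_index, len(flows)):
        flow = flows[i]
        if flow is None:  # tracking lost: a new click would be needed
            break
        x, y = x + flow[0], y + flow[1]
        samples.append((i + 1, (x, y), click.label))
    return samples


def samples_per_click(clicks: List[Click], flows: List[Flow]) -> float:
    """Illustrative ratio: labelled training samples obtained per user click."""
    if not clicks:
        return 0.0
    total = sum(len(propagate_click(c, flows)) for c in clicks)
    return total / len(clicks)


if __name__ == "__main__":
    # Five frames, with the track lost between frames 2 and 3.
    flows = [(1.0, 0.0), (1.0, 0.0), None, (2.0, 0.0)]
    clicks = [Click(0, (10.0, 10.0), "cat"), Click(3, (5.0, 5.0), "cat")]
    print(samples_per_click(clicks, flows))
```

In this toy sequence the first click yields three labelled frames and the second yields two, so each click produces 2.5 samples on average. Without tracking, each click would produce exactly one.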