HMM-guided frame querying for bandwidth-constrained video search
The work addresses bandwidth efficiency in video search applications, but the contribution is incremental because it builds on existing methods such as CNNs and HMMs.
The paper tackles the problem of searching for frames of interest in remote video under bandwidth constraints. It uses a convolutional neural network to score individual frames and a hidden Markov model to propagate predictions across frames, reducing frame requests by 98% without compromising accuracy on a subset of the ImageNet-VID dataset.
We design an agent to search for frames of interest in video stored on a remote server, under bandwidth constraints. Using a convolutional neural network to score individual frames and a hidden Markov model to propagate predictions across frames, our agent accurately identifies temporal regions of interest from sparse, strategically sampled frames. On a subset of the ImageNet-VID dataset, we demonstrate that using a hidden Markov model to interpolate between frame scores allows 98% of frame requests to be omitted without compromising frame-of-interest classification accuracy.
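
The interpolation step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a two-state HMM (not of interest / of interest) with a symmetric transition matrix, treats each CNN score as an emission likelihood, and gives unrequested frames an uninformative likelihood. The names `hmm_interpolate`, `most_uncertain_frame`, `stay_prob`, and `prior` are hypothetical, and querying the most uncertain frame next is only one plausible sampling rule, not necessarily the one used in the paper.

```python
import numpy as np

def hmm_interpolate(num_frames, observed, stay_prob=0.95, prior=0.5):
    """Posterior P(frame is of interest) for every frame, via forward-backward.

    observed: dict mapping frame index -> CNN score in [0, 1] (assumed format).
    stay_prob, prior: assumed hyperparameters, not values from the paper.
    """
    trans = np.array([[stay_prob, 1.0 - stay_prob],
                      [1.0 - stay_prob, stay_prob]])

    # Emission likelihoods: uninformative (1, 1) for frames never requested.
    like = np.ones((num_frames, 2))
    for t, score in observed.items():
        s = min(max(score, 1e-6), 1.0 - 1e-6)  # avoid exact zeros
        like[t] = [1.0 - s, s]

    # Forward pass, normalized at each step for numerical stability.
    fwd = np.zeros((num_frames, 2))
    fwd[0] = np.array([1.0 - prior, prior]) * like[0]
    fwd[0] /= fwd[0].sum()
    for t in range(1, num_frames):
        fwd[t] = (fwd[t - 1] @ trans) * like[t]
        fwd[t] /= fwd[t].sum()

    # Backward pass, also normalized at each step.
    bwd = np.ones((num_frames, 2))
    for t in range(num_frames - 2, -1, -1):
        bwd[t] = trans @ (like[t + 1] * bwd[t + 1])
        bwd[t] /= bwd[t].sum()

    post = fwd * bwd
    post /= post.sum(axis=1, keepdims=True)
    return post[:, 1]

def most_uncertain_frame(posterior, observed):
    """Index of the unrequested frame whose posterior is closest to 0.5."""
    gap = np.abs(posterior - 0.5)
    for t in observed:
        gap[t] = np.inf  # never re-request a frame we already have
    return int(np.argmin(gap))

# Three requested frames out of 100; the rest are inferred by the HMM.
scores = {0: 0.05, 50: 0.95, 99: 0.90}
posterior = hmm_interpolate(100, scores)
next_frame = most_uncertain_frame(posterior, scores)
```

Under these assumptions, frames near a high-scoring sample inherit a high posterior, and the posterior decays toward 0.5 with distance from any sample, which is where an agent with a limited request budget would most plausibly look next.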