Deep Siamese Networks with Bayesian non-Parametrics for Video Object Tracking
This work addresses the problem of efficient and accurate object tracking in video sequences for computer vision applications. It is an incremental improvement achieved through a novel hybrid approach.
The paper treats video object tracking as a dynamic optimization problem. It uses a deep Siamese network as a general object similarity function and a Bayesian optimization framework to encode spatio-temporal information, yielding a statistically principled and efficient search that offers several benefits over current state-of-the-art methods.
We present a novel algorithm that uses a deep Siamese neural network as a general object similarity function, in combination with a Bayesian optimization (BO) framework that encodes spatio-temporal information, for efficient object tracking in video. In particular, we treat the video tracking problem as a dynamic (i.e., temporally evolving) optimization problem. Using Gaussian Process priors, we model a dynamic objective function representing the location of a tracked object in each frame. By exploiting temporal correlations, the proposed method queries the search space in a statistically principled and efficient way, offering several benefits over current state-of-the-art video tracking methods.
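To make the idea of a temporally evolving objective more concrete, the following is a minimal sketch, not the method as implemented in this work. It assumes a zero-mean Gaussian Process with a product of spatial and temporal squared-exponential kernels over (x, y, t), an upper-confidence-bound rule for choosing the next query, and a toy Gaussian bump (`toy_similarity`) standing in for the Siamese similarity score. The function names, length scales, noise level, and `beta` are all illustrative assumptions.

```python
import numpy as np

def rbf(a, b, length):
    """Squared-exponential kernel between the rows of a and the rows of b."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * d2 / length**2)

def kernel(P, Q, ls_space=8.0, ls_time=2.0):
    """Assumed product kernel on (x, y, t) rows: spatial RBF times temporal RBF."""
    return rbf(P[:, :2], Q[:, :2], ls_space) * rbf(P[:, 2:], Q[:, 2:], ls_time)

def gp_posterior(X, y, Xq, noise=1e-3):
    """Zero-mean GP posterior mean and standard deviation at query rows Xq."""
    K = kernel(X, X) + noise * np.eye(len(X))
    Ks = kernel(Xq, X)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    v = np.linalg.solve(L, Ks.T)
    var = 1.0 - (v**2).sum(axis=0)  # prior variance k(x, x) is 1
    return Ks @ alpha, np.sqrt(np.clip(var, 1e-12, None))

def toy_similarity(xy, target, width=6.0):
    """Hypothetical stand-in for a Siamese score: peaks at 1 on the target."""
    return float(np.exp(-0.5 * ((xy - target) ** 2).sum() / width**2))

def track_frame(X, y, t, score_fn, grid, n_queries=10, beta=0.5):
    """Spend n_queries similarity evaluations on frame t, chosen by GP-UCB."""
    Xq = np.column_stack([grid, np.full(len(grid), float(t))])
    for _ in range(n_queries):
        mean, std = gp_posterior(X, y, Xq)
        pick = int(np.argmax(mean + beta * std))  # upper confidence bound
        X = np.vstack([X, Xq[pick]])
        y = np.append(y, score_fn(grid[pick]))
    # Report the best-scoring location among the samples taken on frame t.
    on_frame = X[:, 2] == t
    best = int(np.argmax(np.where(on_frame, y, -np.inf)))
    return X[best, :2], X, y
```

Because the temporal kernel correlates frame t with earlier frames, the first queries on a new frame concentrate near the previous location without any hand-written motion model, and only a handful of similarity evaluations are spent per frame instead of an exhaustive sliding-window search. A real tracker would replace `toy_similarity` with the network's score for a candidate crop.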