Dynamic Variational Autoencoders for Visual Process Modeling
This work addresses visual process modeling for applications such as video analysis. It appears incremental, however, because it builds on existing methods, namely vector autoregressive models and Variational Autoencoders, without a major breakthrough.
The authors model visual processes with a joint learning framework that combines a vector autoregressive model with Variational Autoencoders to learn linear, Gaussian representations from sequences. They validate the framework on artificial sequences and dynamic textures.
This work studies the problem of modeling visual processes by leveraging deep generative architectures for learning linear, Gaussian representations from observed sequences. We propose a joint learning framework that combines a vector autoregressive model with Variational Autoencoders. The resulting architecture allows Variational Autoencoders to simultaneously learn a non-linear observation model and a linear state model from sequences of frames. We validate our approach on artificial sequences and dynamic textures.
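To make the split between a linear state model and a non-linear observation model concrete, the following is a minimal generative sketch, not the authors' implementation. It assumes a first-order vector autoregressive model in a two-dimensional latent space and uses a fixed tanh map as a stand-in for a learned decoder. The names `var1_step`, `decode`, and `sample_sequence`, as well as the matrices `A` and `W`, are hypothetical and chosen only for illustration.

```python
import math
import random


def var1_step(A, z, noise_std, rng):
    """One step of an assumed VAR(1) state model: z_next = A z + v,
    where v is isotropic Gaussian noise with standard deviation noise_std."""
    return [
        sum(a_ij * z_j for a_ij, z_j in zip(row, z)) + rng.gauss(0.0, noise_std)
        for row in A
    ]


def decode(W, z):
    """Stand-in for a learned non-linear observation model (e.g. a VAE decoder):
    each output value is tanh of a linear function of the latent state."""
    return [math.tanh(sum(w_ij * z_j for w_ij, z_j in zip(row, z))) for row in W]


def sample_sequence(A, W, z0, num_frames, noise_std=0.1, seed=0):
    """Roll the linear latent dynamics forward and decode each state to a frame."""
    rng = random.Random(seed)
    z, frames = list(z0), []
    for _ in range(num_frames):
        frames.append(decode(W, z))
        z = var1_step(A, z, noise_std, rng)
    return frames


# Hypothetical parameters: a damped rotation keeps the latent process stable.
theta = 0.3
A = [
    [0.95 * math.cos(theta), -0.95 * math.sin(theta)],
    [0.95 * math.sin(theta), 0.95 * math.cos(theta)],
]
# Hypothetical decoder weights: 4 "pixels" driven by 2 latent dimensions.
W = [[1.0, 0.0], [0.0, 1.0], [0.5, -0.5], [-1.0, 1.0]]

frames = sample_sequence(A, W, z0=[1.0, 0.0], num_frames=5)
```

In this sketch all temporal structure lives in the linear, Gaussian latent recursion, while the decoder only maps individual states to frames. A joint learning framework as described above would fit both parts from data rather than fixing them by hand.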