Deep Predictive Coding Networks
This work addresses the limitation of static priors in representation learning for time-varying signals like video, offering a novel approach for improved feature extraction in visual tasks.
The authors tackled the problem of fixed priors in deep learning representations by proposing deep predictive coding networks, a hierarchical generative model that dynamically adjusts priors to capture temporal dependencies and use top-down information, resulting in learning high-level visual features from natural video data and demonstrating robustness to structured noise.
The quality of data representation in deep learning methods is directly related to the prior model imposed on the representations; however, generally used fixed priors are not capable of adjusting to the context in the data. To address this issue, we propose deep predictive coding networks, a hierarchical generative model that empirically alters priors on the latent representations in a dynamic and context-sensitive manner. This model captures the temporal dependencies in time-varying signals and uses top-down information to modulate the representation in lower layers. The centerpiece of our model is a novel procedure to infer sparse states of a dynamic model which is used for feature extraction. We also extend this feature extraction block to introduce a pooling function that captures locally invariant representations. When applied on a natural video data, we show that our method is able to learn high-level visual features. We also demonstrate the role of the top-down connections by showing the robustness of the proposed model to structured noise.