LG · ML · Dec 9, 2019

Temporal Factorization of 3D Convolutional Kernels

arXiv:1912.04075v1 · 1 citation
Originality: Incremental advance
AI Analysis

This work addresses efficient training of 3D CNNs for video analysis with a method that reduces both data and parameter requirements. The contribution is incremental, since it builds on existing factorization ideas.

The paper tackles the problem that 3D convolutional neural networks are parameter-expensive and data-hungry. It proposes factorizing each 3D kernel along the temporal dimension, which reduces the parameter count and makes training more data-efficient. The method significantly outperforms conventional 3D convolution in the low data regime (1 to 5 videos per class) and achieves competitive results in the high data regime (>10 videos per class) with up to 45% fewer parameters.

3D convolutional neural networks are difficult to train because they are parameter-expensive and data-hungry. To solve these problems, we propose a simple technique for learning 3D convolutional kernels efficiently, requiring less training data. We achieve this by factorizing the 3D kernel along the temporal dimension, which reduces the number of parameters and makes training from data more efficient. Additionally, we introduce a novel dataset called Video-MNIST to demonstrate the performance of our method. Our method significantly outperforms the conventional 3D convolution in the low data regime (1 to 5 videos per class). Finally, our model achieves competitive results in the high data regime (>10 videos per class) using up to 45% fewer parameters.
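To make the parameter argument concrete, here is a minimal sketch of one possible temporal factorization. It assumes a rank-1 scheme in which each t × k × k kernel is rebuilt from a single k × k spatial slice and t temporal scalars. The abstract does not specify the authors' exact parameterization, so this scheme and the function names `full_params`, `factorized_params`, and `build_kernel` are illustrative assumptions. The sketch is not meant to reproduce the reported 45% figure.

```python
# Hypothetical illustration: rank-1 temporal factorization of a 3D kernel.
# ASSUMPTION: W[d][i][j] = temporal[d] * spatial[i][j]. This is a generic
# scheme chosen for clarity, not necessarily the one used in the paper.


def full_params(c_in, c_out, t, k):
    """Weight count of a standard 3D convolution layer (bias ignored)."""
    return c_in * c_out * t * k * k


def factorized_params(c_in, c_out, t, k):
    """Weight count when each kernel is one k x k slice plus t temporal scalars."""
    return c_in * c_out * (k * k + t)


def build_kernel(temporal, spatial):
    """Expand the two factors into a t x k x k kernel as nested lists."""
    return [[[a * s for s in row] for row in spatial] for a in temporal]


# Example layer: 16 input channels, 32 output channels, 3 x 3 x 3 kernels.
full = full_params(16, 32, 3, 3)
fact = factorized_params(16, 32, 3, 3)
print(f"full: {full}, factorized: {fact}, ratio: {fact / full:.3f}")

# Rebuild a tiny 2 x 2 x 2 kernel from its factors.
kernel = build_kernel([1.0, 2.0], [[1.0, 0.0], [0.0, 1.0]])
print(kernel)
```

Under this assumed scheme, a single 3 × 3 × 3 kernel needs 9 + 3 = 12 weights instead of 27. The actual saving in any real model depends on the factorization used and on which layers it is applied to.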
