CV · Jun 11, 2018

Massively Parallel Video Networks

arXiv:1806.03863v2 · 44 citations
Originality: Incremental advance
AI Analysis

This work addresses efficiency bottlenecks in video processing for applications that require real-time analysis. It is an incremental advance, since it applies its principles to existing image architectures rather than proposing a new one.

The authors address inefficient video processing by introducing causal video understanding models that maximize throughput and minimize latency through operation pipelining and multi-rate clocks. On action recognition and human keypoint localization, the models achieve significant parallelism and speedup with little loss in performance.

We introduce a class of causal video understanding models that aims to improve efficiency of video processing by maximising throughput, minimising latency, and reducing the number of clock cycles. Leveraging operation pipelining and multi-rate clocks, these models perform a minimal amount of computation (e.g. as few as four convolutional layers) for each frame per timestep to produce an output. The models are still very deep, with dozens of such operations being performed but in a pipelined fashion that enables depth-parallel computation. We illustrate the proposed principles by applying them to existing image architectures and analyse their behaviour on two video tasks: action recognition and human keypoint localisation. The results show that a significant degree of parallelism, and implicitly speedup, can be achieved with little loss in performance.
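As a rough illustration of the depth-parallel pipelining and multi-rate clocks the abstract describes, the toy sketch below treats a network as a list of stages. At every timestep each stage consumes what the previous stage produced one timestep earlier, so all stages could run at once. The function `run_pipelined`, the `periods` argument, and the arithmetic stages are hypothetical stand-ins for groups of convolutional layers. This is not the authors' implementation.

```python
from typing import Callable, List, Optional, Sequence

Stage = Callable[[float], float]


def run_pipelined(
    frames: Sequence[float],
    stages: Sequence[Stage],
    periods: Optional[Sequence[int]] = None,
) -> List[Optional[float]]:
    """Toy depth-parallel pipeline (illustrative assumption, not the paper's code).

    At timestep t, stage i reads the value stage i-1 held at the end of
    timestep t-1, so no stage waits on another within a timestep.
    periods[i] is a hypothetical clock period: stage i only updates when
    t % periods[i] == 0 and otherwise keeps its previous output.
    """
    if periods is None:
        periods = [1] * len(stages)
    # held[i] is the most recent output of stage i (None until it has data).
    held: List[Optional[float]] = [None] * len(stages)
    outputs: List[Optional[float]] = []
    for t, frame in enumerate(frames):
        new_held = list(held)
        for i, stage in enumerate(stages):
            if t % periods[i] != 0:
                continue  # slow-clock stage: hold the previous output
            inp = frame if i == 0 else held[i - 1]
            new_held[i] = stage(inp) if inp is not None else None
        held = new_held
        outputs.append(held[-1])
    return outputs


# Three stand-in stages; a real model would use blocks of layers here.
stages = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]
frames = [1, 2, 3, 4, 5]
pipelined = run_pipelined(frames, stages)
multirate = run_pipelined(frames, stages, periods=[1, 1, 2])
```

In this sketch the first valid output appears after as many timesteps as there are stages, and from then on one output is produced per timestep. Each output corresponds to a frame seen a few steps earlier, which is the delay that pipelining introduces in exchange for running all stages in parallel.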
