CV | LG | Nov 18, 2021

PyTorchVideo: A Deep Learning Library for Video Understanding

arXiv:2111.09887v1 | 70 citations | Has Code
Originality: Synthesis-oriented
AI Analysis

This library addresses the need for efficient, reproducible video understanding tools for researchers and practitioners. Its contribution is incremental, since it builds on existing frameworks such as PyTorch.

The authors introduce PyTorchVideo, an open-source deep learning library for video understanding tasks. It provides modular components that reproduce state-of-the-art performance, and it supports hardware acceleration for real-time inference on mobile devices.

We introduce PyTorchVideo, an open-source deep-learning library that provides a rich set of modular, efficient, and reproducible components for a variety of video understanding tasks, including classification, detection, self-supervised learning, and low-level processing. The library covers a full stack of video understanding tools including multimodal data loading, transformations, and models that reproduce state-of-the-art performance. PyTorchVideo further supports hardware acceleration that enables real-time inference on mobile devices. The library is based on PyTorch and can be used by any training framework; for example, PyTorchLightning, PySlowFast, or Classy Vision. PyTorchVideo is available at https://pytorchvideo.org/
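To make the idea of modular, composable transformations concrete, here is a minimal sketch in plain Python of one common video preprocessing step: picking evenly spaced frames from a clip, then chaining that step with another. It uses only the standard library and is not PyTorchVideo's API. The names `uniform_temporal_indices`, `subsample_clip`, and `compose` are hypothetical and chosen for illustration, and a clip is assumed to be a simple sequence of frames.

```python
from typing import Callable, List, Sequence, TypeVar

Frame = TypeVar("Frame")


def uniform_temporal_indices(num_frames: int, num_samples: int) -> List[int]:
    """Return num_samples indices spread evenly over [0, num_frames - 1].

    Hypothetical helper for illustration; integer arithmetic avoids
    floating-point rounding at the last index.
    """
    if num_frames <= 0 or num_samples <= 0:
        raise ValueError("num_frames and num_samples must be positive")
    if num_samples == 1:
        return [0]
    last = num_frames - 1
    # Floor of i * last / (num_samples - 1); the final index is exactly `last`.
    return [i * last // (num_samples - 1) for i in range(num_samples)]


def subsample_clip(frames: Sequence[Frame], num_samples: int) -> List[Frame]:
    """Pick evenly spaced frames from a clip given as a sequence of frames."""
    indices = uniform_temporal_indices(len(frames), num_samples)
    return [frames[i] for i in indices]


def compose(*transforms: Callable) -> Callable:
    """Chain clip-to-clip callables, applied left to right."""

    def apply(clip):
        for transform in transforms:
            clip = transform(clip)
        return clip

    return apply


if __name__ == "__main__":
    # Stand-in clip: integers play the role of frames.
    clip = list(range(32))
    pipeline = compose(lambda c: subsample_clip(c, 8))
    print(pipeline(clip))
```

Because each step is an ordinary callable from clip to clip, steps can be reordered or swapped without touching the others, which is the kind of modularity the abstract describes.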

Code Implementations: 1 repo
