CVJun 15, 2017

Hierarchical Label Inference for Video Classification

arXiv:1706.05028v22 citations
AI Analysis

This addresses the problem of understanding unconstrained internet videos for researchers and practitioners, but it appears incremental as it builds on existing methods for label hierarchy.

The paper tackled video classification by using a Bidirectional Inference Neural Network (BINN) to leverage hierarchical label structures, achieving significant improvements over baseline models on the YouTube-8M dataset.

Videos are a rich source of high-dimensional structured data, with a wide range of interacting components at varying levels of granularity. In order to improve understanding of unconstrained internet videos, it is important to consider the role of labels at separate levels of abstraction. In this paper, we consider the use of the Bidirectional Inference Neural Network (BINN) for performing graph-based inference in label space for the task of video classification. We take advantage of the inherent hierarchy between labels at increasing granularity. The BINN is evaluated on the first and second release of the YouTube-8M large scale multilabel video dataset. Our results demonstrate the effectiveness of BINN, achieving significant improvements against baseline models.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes