LG · CV · May 17, 2016

Multimodal Sparse Coding for Event Detection

arXiv:1605.05212v1 · 6 citations
Originality: Incremental advance
AI Analysis

This work addresses multimedia event detection for applications such as video analysis. It is an incremental advance: it extends existing sparse coding techniques to multimodal settings.

The paper tackles multimedia event detection by proposing multimodal sparse coding, which learns feature representations shared across multiple modalities. On a subset of the TRECVID MED 2014 dataset, it reports improved classification accuracy and mean average precision compared with unimodal methods and other feature learning approaches.

Unsupervised feature learning methods have proven effective for classification tasks based on a single modality. We present multimodal sparse coding for learning feature representations shared across multiple modalities. The shared representations are applied to multimedia event detection (MED) and evaluated in comparison to unimodal counterparts, as well as other feature learning methods such as GMM supervectors and sparse RBM. We report the cross-validated classification accuracy and mean average precision of the MED system trained on features learned from our unimodal and multimodal settings for a subset of the TRECVID MED 2014 dataset.
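The abstract reports mean average precision without defining it. As a reminder of what that metric measures, here is a minimal sketch assuming the standard non-interpolated definition: for each event, clips are ranked by detector score, precision is taken at the rank of every true positive, and those values are averaged; the mean over events gives the final figure. The function names and the toy scores are illustrative assumptions, and the paper's exact evaluation protocol may differ.

```python
def average_precision(scores, labels):
    """Non-interpolated average precision for a single event.

    scores: detector confidence per clip (higher means more likely positive).
    labels: 1 if the clip truly contains the event, else 0.
    """
    # Rank clips from most to least confident.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precisions = []
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            # Precision measured at the rank of each true positive.
            precisions.append(hits / rank)
    # By convention here, an event with no positives scores 0.0.
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(per_event):
    """Mean of per-event average precision.

    per_event: list of (scores, labels) pairs, one pair per event class.
    """
    aps = [average_precision(scores, labels) for scores, labels in per_event]
    return sum(aps) / len(aps)


# Toy data (hypothetical, not taken from the paper).
event_a = ([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
event_b = ([0.2, 0.7, 0.5], [0, 1, 1])
map_score = mean_average_precision([event_a, event_b])
```

In the toy data, the first event has positives at ranks 1 and 3, giving an average precision of (1/1 + 2/3) / 2, while the second event ranks both positives first and scores 1.0. Because the metric depends only on the ranking, it lets detectors built on different feature representations be compared without fixing a decision threshold.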

