CV · SD · Jan 10, 2016

Joint Object-Material Category Segmentation from Audio-Visual Cues

arXiv:1601.02220v1 · 18 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of distinguishing visually similar objects made of different materials, a problem that matters for robotics and scene understanding. The contribution appears incremental because it builds on existing multi-output labeling frameworks.

The paper tackles the problem of recognizing objects and inferring material properties in a scene by augmenting dense visual cues with sparse auditory cues to estimate dense object and material labels. It demonstrates that joint estimation significantly outperforms estimating either category in isolation.

It is not always possible to recognise objects and infer material properties for a scene from visual cues alone, since objects can look visually similar whilst being made of very different materials. In this paper, we therefore present an approach that augments the available dense visual cues with sparse auditory cues in order to estimate dense object and material labels. Since estimates of object class and material properties are mutually informative, we optimise our multi-output labelling jointly using a random-field framework. We evaluate our system on a new dataset with paired visual and auditory data that we make publicly available. We demonstrate that this joint estimation of object and material labels significantly outperforms the estimation of either category in isolation.
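To make the idea of joint estimation concrete, the following is a minimal toy sketch, not the authors' random-field model. It assumes hypothetical per-pixel costs for object and material labels, adds a sparse audio-derived material cost at a single pixel, and couples the two label sets with an invented compatibility table. A real random field would also include spatial smoothness terms between neighbouring pixels, which are omitted here. All label names and numbers are made up for illustration.

```python
import numpy as np

def independent_labels(obj_cost, mat_cost):
    """Pick object and material labels separately at each pixel."""
    return obj_cost.argmin(axis=1), mat_cost.argmin(axis=1)

def joint_labels(obj_cost, mat_cost, compat_cost):
    """Pick the cheapest (object, material) pair at each pixel.

    obj_cost:    (N, O) cost of each object label at each pixel
    mat_cost:    (N, M) cost of each material label at each pixel
    compat_cost: (O, M) cost of pairing object o with material m
    """
    # total[n, o, m] = obj_cost[n, o] + mat_cost[n, m] + compat_cost[o, m]
    total = (obj_cost[:, :, None]
             + mat_cost[:, None, :]
             + compat_cost[None, :, :])
    flat = total.reshape(total.shape[0], -1).argmin(axis=1)
    return np.unravel_index(flat, compat_cost.shape)

# Hypothetical labels: objects = [mug, table], materials = [ceramic, plastic, wood]
obj_cost = np.array([[0.6, 0.4],    # pixel 0: vision slightly prefers "table"
                     [0.7, 0.3]])   # pixel 1: vision prefers "table"
visual_mat_cost = np.array([[0.5, 0.5, 0.5],   # pixel 0: vision is uninformative
                            [0.6, 0.5, 0.4]])  # pixel 1: vision prefers "wood"
# Sparse audio cue: only pixel 0 was tapped, and it sounded ceramic.
audio_mat_cost = np.array([[0.0, 1.0, 1.0],
                           [0.0, 0.0, 0.0]])
mat_cost = visual_mat_cost + audio_mat_cost
# Invented compatibility: mugs are ceramic or plastic, tables are wood.
compat_cost = np.array([[0.0, 0.0, 1.0],
                        [1.0, 1.0, 0.0]])

ind_obj, ind_mat = independent_labels(obj_cost, mat_cost)
joint_obj, joint_mat = joint_labels(obj_cost, mat_cost, compat_cost)
print("independent:", ind_obj, ind_mat)
print("joint:      ", joint_obj, joint_mat)
```

In this toy setup the independent estimate labels pixel 0 as a ceramic table, whereas the joint estimate uses the audio-derived material evidence to revise the object label to a mug. This mirrors, in a very reduced form, the abstract's point that object and material estimates are mutually informative.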

