CV, Nov 5, 2018

Multi-Level Sensor Fusion with Deep Learning

arXiv:1811.02447v1, 2 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of efficient sensor fusion in multimodal AI applications, offering an incremental improvement over existing methods.

The paper tackles the problem of balancing early and late fusion in multimodal deep learning by introducing CentralNet, a deep network that fuses sensor information at multiple abstraction levels. It reports state-of-the-art performance on four datasets and selects the fusion strategy automatically.

In the context of deep learning, this article presents an original deep network, namely CentralNet, for the fusion of information coming from different sensors. This approach is designed to efficiently and automatically balance the trade-off between early and late fusion (i.e., between the fusion of low-level vs. high-level information). More specifically, at each level of abstraction (the different levels of deep networks), unimodal representations of the data are fed to a central neural network, which combines them into a common embedding. In addition, a multi-objective regularization is introduced, helping to optimize both the central network and the unimodal networks. Experiments on four multimodal datasets not only show state-of-the-art performance, but also demonstrate that CentralNet can actually choose the best possible fusion strategy for a given problem.
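
The multi-level fusion described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract only says that the central network "combines" the unimodal representations, so the weighted sum with per-level weights `alphas` is an assumption, as is the form of the multi-objective loss (a central loss plus weighted unimodal losses with hypothetical weights `betas`). All function names, the shared layer width, and the random initializer are also illustrative.

```python
import random


def relu_layer(x, weights, biases):
    """One dense layer followed by ReLU; `weights` holds one row per output unit."""
    return [max(0.0, sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, biases)]


def random_layer(n_in, n_out, rng):
    """Hypothetical initializer: small random weights, zero biases."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def weighted_sum(vectors, alphas):
    """Fuse equally sized vectors as sum_k alpha_k * vector_k (assumed fusion rule)."""
    return [sum(a * v[i] for a, v in zip(alphas, vectors))
            for i in range(len(vectors[0]))]


def central_forward(inputs, unimodal_layers, central_layers, alphas):
    """Run the unimodal stacks and the central stack level by level.

    inputs: one feature vector per modality (all of the same width here).
    unimodal_layers[m][l]: (weights, biases) of modality m at level l.
    central_layers[l]: (weights, biases) of the central network at level l.
    alphas[l]: fusion weights at level l, central stream first, then one per modality.
    """
    hidden = list(inputs)
    central = [0.0] * len(inputs[0])  # the central stream starts empty
    for level, (c_weights, c_biases) in enumerate(central_layers):
        # Fuse the central state with the unimodal representations of this level.
        fused = weighted_sum([central] + hidden, alphas[level])
        central = relu_layer(fused, c_weights, c_biases)
        # Advance each unimodal network independently of the fusion.
        hidden = [relu_layer(h, *unimodal_layers[m][level])
                  for m, h in enumerate(hidden)]
    return central, hidden


def multi_objective_loss(central_loss, unimodal_losses, betas):
    """Assumed form of the regularization: central loss plus weighted unimodal losses."""
    return central_loss + sum(b * l for b, l in zip(betas, unimodal_losses))


rng = random.Random(0)
width, n_levels, n_modalities = 4, 2, 2
inputs = [[rng.uniform(0, 1) for _ in range(width)] for _ in range(n_modalities)]
unimodal_layers = [[random_layer(width, width, rng) for _ in range(n_levels)]
                   for _ in range(n_modalities)]
central_layers = [random_layer(width, width, rng) for _ in range(n_levels)]
alphas = [[1.0, 1.0, 1.0] for _ in range(n_levels)]
central_out, unimodal_out = central_forward(inputs, unimodal_layers, central_layers, alphas)
```

In this sketch, if the fusion weights were learned, large modality weights at the first level would push the model toward early fusion and large weights at the last level toward late fusion. This is one way to read the claim that the network can choose its fusion strategy.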
