CV, AI | May 30, 2023

Epistemic Graph: A Plug-And-Play Module For Hybrid Representation Learning

arXiv:2305.18731v3
Originality: Incremental advance
AI Analysis

This work addresses the problem of reducing reliance on large datasets for deep models in vision tasks, offering a novel method for hybrid learning, though it appears incremental in its approach.

The paper tackles the challenge of integrating structured knowledge with data samples for more effective representation learning in computer vision, introducing a plug-and-play Epistemic Graph Layer that significantly improves performance in cross-domain recognition and few-shot learning tasks.

In recent years, deep models have achieved remarkable success in various vision tasks. However, their performance heavily relies on large training datasets. In contrast, humans exhibit hybrid learning, seamlessly integrating structured knowledge for cross-domain recognition or relying on a smaller amount of data samples for few-shot learning. Motivated by this human-like epistemic process, we aim to extend hybrid learning to computer vision tasks by integrating structured knowledge with data samples for more effective representation learning. Nevertheless, this extension faces significant challenges due to the substantial gap between structured knowledge and deep features learned from data samples, in both dimensionality and knowledge granularity. In this paper, a novel Epistemic Graph Layer (EGLayer) is introduced to enable hybrid learning, enhancing the exchange of information between deep features and a structured knowledge graph. Our EGLayer is composed of three major parts: a local graph module, a query aggregation model, and a novel correlation alignment loss function that emulates human epistemic ability. Serving as a plug-and-play module that can replace the standard linear classifier, EGLayer significantly improves the performance of deep models. Extensive experiments demonstrate that EGLayer can greatly enhance representation learning for the tasks of cross-domain recognition and few-shot learning, and the visualization of knowledge graphs can aid in model interpretation.
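To make the three parts named in the abstract more concrete, here is a minimal NumPy sketch of a graph-based classifier head that could stand in for a linear classifier. It is not the paper's EGLayer: the abstract gives no equations, so the class name `GraphHeadSketch`, the prototype averaging, the single propagation step, and the mean-squared alignment term are all assumptions chosen for illustration.

```python
import numpy as np


def cosine_adjacency(x):
    """Pairwise cosine similarity between the rows of x."""
    unit = x / (np.linalg.norm(x, axis=1, keepdims=True) + 1e-8)
    return unit @ unit.T


def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


class GraphHeadSketch:
    """Hypothetical graph-based head (an assumption, not the paper's EGLayer)."""

    def __init__(self, feat_dim, node_dim, knowledge_adj, seed=0):
        rng = np.random.default_rng(seed)
        num_classes = knowledge_adj.shape[0]
        # Projection that bridges the dimension gap between deep features
        # and knowledge-graph node embeddings.
        self.project = rng.normal(scale=0.1, size=(feat_dim, node_dim))
        # One node embedding per class in the structured knowledge graph.
        self.nodes = rng.normal(scale=0.1, size=(num_classes, node_dim))
        # Assumed prior: a class-by-class adjacency matrix with nonzero rows.
        self.knowledge_adj = knowledge_adj

    def local_graph(self, feats, labels):
        """Assumed 'local graph' step: per-class prototypes from the batch."""
        z = feats @ self.project
        protos = self.nodes.copy()
        for c in np.unique(labels):
            protos[c] = z[labels == c].mean(axis=0)
        return z, protos

    def logits(self, z, protos):
        """Assumed 'query aggregation' step: propagate, then score samples."""
        adj = self.knowledge_adj / self.knowledge_adj.sum(axis=1, keepdims=True)
        nodes = adj @ protos  # one round of message passing over the graph
        return z @ nodes.T

    def alignment_loss(self, protos):
        """Assumed alignment term: match prototype similarity to the prior."""
        return np.mean((cosine_adjacency(protos) - self.knowledge_adj) ** 2)


# Toy usage with made-up data: 3 classes, 6 samples, 8-dimensional features.
knowledge_adj = np.array([[1.0, 0.5, 0.0],
                          [0.5, 1.0, 0.2],
                          [0.0, 0.2, 1.0]])
head = GraphHeadSketch(feat_dim=8, node_dim=4, knowledge_adj=knowledge_adj)
feats = np.random.default_rng(1).normal(size=(6, 8))
labels = np.array([0, 0, 1, 1, 2, 2])
z, protos = head.local_graph(feats, labels)
probs = softmax(head.logits(z, protos))
loss = head.alignment_loss(protos)
```

In this sketch the head consumes backbone features and returns class scores, which is what lets it take the place of a linear layer. The alignment term would be added to the usual classification loss during training.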

