CV · Nov 27, 2017

Dynamic Graph Generation Network: Generating Relational Knowledge from Diagrams

arXiv:1711.09528v1 · 28 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of extracting relational knowledge from diagrams for applications in AI and data analysis, representing an incremental advance in multimodal understanding.

The paper tackles the problem of automatically understanding diagrams, which contain rich multimodal information but are challenging due to arbitrary layouts, by proposing a dynamic graph-generation network that integrates object detection and recurrent neural networks. It achieves state-of-the-art results on public diagram datasets and shows potential for applications like question answering.

In this work, we introduce a new algorithm for analyzing diagrams, which contain visual and textual information in an abstract and integrated way. Although diagrams contain richer information than individual image-based or language-based data, proper solutions for automatically understanding them have not been proposed, owing to their innate multi-modality and the arbitrariness of their layouts. To tackle this problem, we propose a unified diagram-parsing network that generates knowledge from diagrams, based on an object detector and a recurrent neural network designed for a graphical structure. Specifically, we propose a dynamic graph-generation network that is based on dynamic memory and graph theory. We explore the dynamics of information in a diagram through the activation of gates in gated recurrent unit (GRU) cells. On publicly available diagram datasets, our model achieves state-of-the-art results and outperforms the other baselines. Moreover, further experiments on question answering show the potential of the proposed method for various applications.
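As a rough illustration of the idea in the abstract, the sketch below walks over ordered pairs of detected objects, feeds each pair into a GRU cell, and keeps an edge when a score computed from the hidden state passes a threshold. It is a minimal toy under stated assumptions, not the paper's network: the scalar hidden state, the `pair_features` helper, the box format, the threshold, and the hand-picked `TOY_WEIGHTS` are all hypothetical, and nothing here is trained. The update-gate convention used (new state weighted by `z`) is one of the two common ones.

```python
import math
from itertools import permutations


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def gru_step(x, h, w):
    """One GRU step with a scalar hidden state h and a feature vector x."""
    z = sigmoid(dot(w["wz"], x) + w["uz"] * h)  # update gate
    r = sigmoid(dot(w["wr"], x) + w["ur"] * h)  # reset gate
    h_cand = math.tanh(dot(w["wh"], x) + w["uh"] * (r * h))  # candidate state
    h_new = (1.0 - z) * h + z * h_cand  # blend old state and candidate
    return h_new, z, r


def pair_features(a, b):
    """Hypothetical features for an ordered pair of boxes (cx, cy, width, height)."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    return [dx, dy, math.hypot(dx, dy)]


def generate_graph(boxes, w, threshold=0.5):
    """Visit every ordered pair, carry the GRU state, keep edges above threshold."""
    h, edges, gates = 0.0, [], []
    for i, j in permutations(range(len(boxes)), 2):
        h, z, r = gru_step(pair_features(boxes[i], boxes[j]), h, w)
        gates.append((i, j, z, r))  # record gate activations for inspection
        if sigmoid(w["vo"] * h + w["bo"]) > threshold:
            edges.append((i, j))
    return edges, gates


# Toy, untrained weights chosen by hand purely for illustration (assumption).
TOY_WEIGHTS = {
    "wz": [0.5, 0.5, -1.0], "uz": 0.1,
    "wr": [0.2, -0.2, 0.3], "ur": 0.1,
    "wh": [0.0, 0.0, -2.0], "uh": 0.5,
    "vo": 4.0, "bo": 2.0,
}

# Three hypothetical detections as (cx, cy, width, height), normalized coordinates.
boxes = [(0.1, 0.1, 0.1, 0.1), (0.2, 0.1, 0.1, 0.1), (0.9, 0.9, 0.1, 0.1)]
edges, gates = generate_graph(boxes, TOY_WEIGHTS)
for i, j, z, r in gates:
    print(f"pair ({i}, {j}): update gate={z:.2f}, reset gate={r:.2f}")
print("edges:", edges)
```

Printing the gate values per pair mirrors, in a very loose way, the abstract's point about inspecting gate activations to see how information moves through the diagram; in a real system the detector, features, and weights would all be learned.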

