Symbolic Graph Inference for Compound Scene Understanding
This work addresses scene understanding for applications such as question-answering and robotics. It appears incremental: it builds on graph-based reasoning and does not claim a major breakthrough.
The paper tackles scene understanding by reasoning over a scene's constituent objects and their arrangement. It proposes a method that searches a scene graph and a knowledge graph jointly, demonstrates feasibility on the ADE20K dataset, and compares the method to current scene understanding approaches.
Scene understanding is a fundamental capability needed in many domains, ranging from question-answering to robotics. Unlike recent end-to-end approaches, which must explicitly learn the varying compositions of the same scene, our method reasons over a scene's constituent objects and analyzes their arrangement to infer the scene's meaning. We propose a novel approach that reasons over a scene's scene graph and knowledge graph, capturing spatial information while utilizing general domain knowledge in a joint graph search. Empirically, we demonstrate the feasibility of our method on the ADE20K dataset and compare it to current scene understanding approaches.
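To make the idea of a joint search over a scene graph and a knowledge graph concrete, the following is a minimal sketch and not the method evaluated in the paper. It assumes that both graphs are given as (head, relation, tail) triples, that they are merged into one undirected graph, and that a scene label is scored by how few hops separate it from each detected object. The function names, the scoring rule, the `max_hops` parameter, and the toy triples are all hypothetical; relation labels are kept in the data but ignored by the search.

```python
from collections import deque


def build_joint_graph(scene_edges, knowledge_edges):
    """Merge scene-graph and knowledge-graph triples into one undirected adjacency map."""
    graph = {}
    for head, _relation, tail in list(scene_edges) + list(knowledge_edges):
        graph.setdefault(head, set()).add(tail)
        graph.setdefault(tail, set()).add(head)
    return graph


def hop_distances(graph, start, max_hops):
    """Breadth-first search: hop count from `start` to every node within `max_hops`."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:
            continue  # do not expand beyond the hop budget
        for neighbour in graph.get(node, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def infer_scene(objects, scene_edges, knowledge_edges, scene_labels, max_hops=3):
    """Score each candidate label by its proximity to the detected objects (assumed rule)."""
    graph = build_joint_graph(scene_edges, knowledge_edges)
    scores = {label: 0.0 for label in scene_labels}
    for obj in objects:
        dist = hop_distances(graph, obj, max_hops)
        for label in scene_labels:
            if label in dist:
                # Closer labels contribute more; unreachable labels contribute nothing.
                scores[label] += 1.0 / (1 + dist[label])
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical toy data: spatial triples (scene graph) and commonsense triples (knowledge graph).
scene_edges = [("pillow", "on", "bed"), ("lamp", "next_to", "bed")]
knowledge_edges = [
    ("bed", "found_in", "bedroom"),
    ("lamp", "found_in", "bedroom"),
    ("lamp", "found_in", "living_room"),
    ("stove", "found_in", "kitchen"),
]

best, scores = infer_scene(
    objects=["pillow", "bed", "lamp"],
    scene_edges=scene_edges,
    knowledge_edges=knowledge_edges,
    scene_labels=["bedroom", "living_room", "kitchen"],
)
print(best, scores)
```

In this toy case the pillow has no direct knowledge-graph link to any scene label, yet it still contributes evidence through its spatial edge to the bed. That is the kind of interaction between spatial and domain knowledge the abstract describes, although the actual search procedure in the paper may differ substantially.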