HiKER-SGG: Hierarchical Knowledge Enhanced Robust Scene Graph Generation
This work addresses the robustness of SGG for applications such as autonomous driving and robotics. The contribution is incremental, however: it builds on existing SGG methods, adding a new benchmark and a refinement approach.
The paper tackles scene graph generation (SGG) under real-world corruptions such as weather effects. It introduces a new benchmark that applies procedurally generated corruptions to Visual Genome, and it proposes HiKER-SGG, which uses a hierarchical knowledge graph to refine its predictions. HiKER-SGG shows superior zero-shot performance on corrupted images and also outperforms state-of-the-art methods on uncorrupted SGG tasks.
Being able to understand visual scenes is a precursor for many downstream tasks, including autonomous driving, robotics, and other vision-based approaches. A common approach enabling the ability to reason over visual data is Scene Graph Generation (SGG); however, many existing approaches assume undisturbed vision, i.e., the absence of real-world corruptions such as fog, snow, and smoke, as well as non-uniform perturbations like sun glare or water drops. In this work, we propose a novel SGG benchmark containing procedurally generated weather corruptions and other transformations over the Visual Genome dataset. Further, we introduce a corresponding approach, Hierarchical Knowledge Enhanced Robust Scene Graph Generation (HiKER-SGG), providing a strong baseline for scene graph generation under such challenging settings. At its core, HiKER-SGG utilizes a hierarchical knowledge graph to refine its predictions from coarse initial estimates to detailed predictions. In our extensive experiments, we show that HiKER-SGG not only demonstrates superior performance on corrupted images in a zero-shot manner, but also outperforms current state-of-the-art methods on uncorrupted SGG tasks. Code is available at https://github.com/zhangce01/HiKER-SGG.
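The coarse-to-fine idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two-level predicate hierarchy, the scores, and the function names below are all hypothetical, and the combination rule (probability of a superclass times probability of a predicate within that superclass) is one simple way to realize hierarchical refinement, assumed here for illustration.

```python
import math

# Hypothetical two-level predicate hierarchy (illustrative, not taken from the paper).
HIERARCHY = {
    "geometric": ["on", "under", "near"],
    "possessive": ["has", "wearing"],
    "semantic": ["riding", "eating"],
}


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def coarse_to_fine(coarse_logits, fine_logits):
    """Combine coarse (superclass) and fine (within-superclass) scores.

    coarse_logits: dict mapping superclass -> score
    fine_logits:   dict mapping superclass -> list of scores,
                   aligned with HIERARCHY[superclass]
    Returns a dict mapping predicate -> probability, computed as
    P(predicate) = P(superclass) * P(predicate | superclass).
    """
    names = list(HIERARCHY)
    coarse_probs = softmax([coarse_logits[n] for n in names])
    result = {}
    for name, p_coarse in zip(names, coarse_probs):
        conditional = softmax(fine_logits[name])
        for predicate, p_fine in zip(HIERARCHY[name], conditional):
            result[predicate] = p_coarse * p_fine
    return result


# Made-up scores for a single subject-object pair.
probs = coarse_to_fine(
    {"geometric": 2.0, "possessive": 0.5, "semantic": 0.1},
    {"geometric": [1.5, 0.2, 0.3], "possessive": [0.0, 1.0], "semantic": [0.7, 0.1]},
)
best = max(probs, key=probs.get)
```

Because each level is normalized separately, the resulting predicate probabilities still sum to one. A confident coarse estimate narrows the candidates before the fine-grained scores decide among them, which is one plausible reason such a scheme could help when image evidence is degraded.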