OBSER: Object-Based Sub-Environment Recognition for Zero-Shot Environmental Inference
This work addresses autonomous environment understanding for robotics and other AI systems, but it appears incremental because it builds on existing metric and self-supervised learning methods.
The paper tackles zero-shot environmental inference by proposing the OBSER framework, which models Bayesian relationships between sub-environments and the objects they contain. The framework achieves reliable performance in open-world and photorealistic environments and outperforms scene-based methods in chained retrieval tasks.
We present the Object-Based Sub-Environment Recognition (OBSER) framework, a novel Bayesian framework that infers three fundamental relationships between sub-environments and their constituent objects. In the OBSER framework, metric and self-supervised learning models estimate the object distributions of sub-environments in the latent space, and these estimated distributions are used to compute the three relationship measures. We validate the proposed framework both theoretically and empirically by introducing the $(\epsilon, \delta)$ statistically separable (EDS) function, which indicates the alignment of the representation. Our framework reliably performs inference in open-world and photorealistic environments and outperforms scene-based methods in chained retrieval tasks. The OBSER framework enables zero-shot recognition of environments, a step toward autonomous environment understanding.
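To make the idea of comparing sub-environments through their object distributions more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that object embeddings are already available as vectors, and it substitutes a Gaussian kernel density estimate and a plug-in KL divergence for the estimator and relationship measures, which the abstract does not specify. The names `gaussian_kde_logpdf`, `kl_estimate`, `forest`, `forest_2`, and `desert`, the bandwidth value, and the synthetic data are all hypothetical.

```python
import numpy as np


def gaussian_kde_logpdf(points, samples, bandwidth=0.5):
    """Log-density of `points` under a Gaussian KDE fitted to `samples`.

    Assumption: an isotropic Gaussian kernel with a fixed bandwidth stands in
    for whatever density model the framework actually uses.
    """
    points = np.atleast_2d(points)
    samples = np.atleast_2d(samples)
    n, d = samples.shape
    # Squared distances between every query point and every sample: shape (m, n).
    diff = points[:, None, :] - samples[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    log_kernel = -0.5 * sq_dist / bandwidth ** 2
    log_norm = -0.5 * d * np.log(2.0 * np.pi * bandwidth ** 2) - np.log(n)
    # Log-sum-exp over the samples for numerical stability.
    max_log = log_kernel.max(axis=1, keepdims=True)
    summed = np.exp(log_kernel - max_log).sum(axis=1)
    return log_norm + max_log[:, 0] + np.log(summed)


def kl_estimate(env_p, env_q, bandwidth=0.5):
    """Plug-in estimate of KL(p || q), averaged over the samples of p."""
    log_p = gaussian_kde_logpdf(env_p, env_p, bandwidth)
    log_q = gaussian_kde_logpdf(env_p, env_q, bandwidth)
    return float(np.mean(log_p - log_q))


# Synthetic 2-D "object embeddings" for three hypothetical sub-environments.
rng = np.random.default_rng(0)
forest = rng.normal(loc=0.0, scale=1.0, size=(200, 2))
forest_2 = rng.normal(loc=0.0, scale=1.0, size=(200, 2))
desert = rng.normal(loc=4.0, scale=1.0, size=(200, 2))

kl_similar = kl_estimate(forest, forest_2)    # same underlying distribution
kl_different = kl_estimate(forest, desert)    # shifted distribution
print(f"KL(forest || forest_2) ~ {kl_similar:.3f}")
print(f"KL(forest || desert)   ~ {kl_different:.3f}")
```

In this toy setting, the divergence between two sub-environments drawn from the same distribution should come out much smaller than the divergence between clearly different ones. That is the kind of signal a distribution-level comparison between sub-environments relies on.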