Commonsense Scene Semantics for Cognitive Robotics: Towards Grounding Embodied Visuo-Locomotive Interactions
This work addresses the challenge of enabling robots to understand and interact with their environment in a human-like way. The contribution appears incremental, since it builds on existing AI and visual processing methods.
The paper tackles the problem of grounding embodied visuo-locomotive interactions in cognitive robotics by developing a commonsense, qualitative model that integrates low-level visual processing with high-level, human-centred representations of space and motion. It demonstrates practical applicability with examples of object interactions and indoor movement, but reports no quantitative results.
We present a commonsense, qualitative model for the semantic grounding of embodied visuo-spatial and locomotive interactions. The key contribution is an integrative methodology combining low-level visual processing with high-level, human-centred representations of space and motion rooted in artificial intelligence. We demonstrate practical applicability with examples involving object interactions and indoor movement.
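To make the idea of lifting low-level visual output into qualitative, human-centred relations more concrete, the following is a minimal sketch and not the model described in this work. It assumes that a vision pipeline already supplies axis-aligned bounding boxes per frame; the names `Box`, `topology`, and `motion`, as well as the relation labels, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box, assumed to come from a detector or tracker."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def centre(self) -> tuple[float, float]:
        return ((self.x_min + self.x_max) / 2, (self.y_min + self.y_max) / 2)


def _inside(a: Box, b: Box) -> bool:
    """True if a lies within b (borders may coincide)."""
    return (a.x_min >= b.x_min and a.x_max <= b.x_max
            and a.y_min >= b.y_min and a.y_max <= b.y_max)


def topology(a: Box, b: Box) -> str:
    """Coarse qualitative topological relation of a with respect to b.

    Illustrative label set only; boxes that merely touch are counted
    as overlapping in this sketch.
    """
    if (a.x_max < b.x_min or b.x_max < a.x_min
            or a.y_max < b.y_min or b.y_max < a.y_min):
        return "disconnected"
    if a == b:
        return "equal"
    if _inside(a, b):
        return "inside"
    if _inside(b, a):
        return "contains"
    return "overlapping"


def motion(prev: Box, curr: Box, target: Box, eps: float = 1e-6) -> str:
    """Qualitative motion of an object relative to a static target,
    judged by the change in centre-to-centre distance between two frames."""
    tx, ty = target.centre()
    px, py = prev.centre()
    cx, cy = curr.centre()
    delta = hypot(cx - tx, cy - ty) - hypot(px - tx, py - ty)
    if delta < -eps:
        return "approaching"
    if delta > eps:
        return "receding"
    return "stable"


# Hypothetical two-frame example: a hand moving towards a cup.
hand_t0 = Box(0, 0, 2, 2)
hand_t1 = Box(4, 0, 6, 2)
cup = Box(5, 0, 7, 2)

print(topology(hand_t0, cup), topology(hand_t1, cup), motion(hand_t0, hand_t1, cup))
```

A sequence such as "disconnected, then overlapping, while approaching" is the kind of symbolic description that a higher-level reasoner could interpret as a reach-and-touch interaction; how such descriptions are actually formalised and reasoned over in this work is not specified in the text above.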