RONov 20, 2022
Active Exploration based on Information Gain by Particle Filter for Efficient Spatial Concept FormationAkira Taniguchi, Yoshiki Tabuchi, Tomochika Ishikawa et al.
Autonomous robots need to learn the categories of various places by exploring their environments and interacting with users. However, preparing training datasets with linguistic instructions from users is time-consuming and labor-intensive. Moreover, effective exploration is essential for appropriate concept formation and rapid environmental coverage. To address this issue, we propose an active inference method, referred to as spatial concept formation with information gain-based active exploration (SpCoAE) that combines sequential Bayesian inference using particle filters and information gain-based destination determination in a probabilistic generative model. This study interprets the robot's action as a selection of destinations to ask the user, `What kind of place is this?' in the context of active inference. This study provides insights into the technical aspects of the proposed method, including active perception and exploration by the robot, and how the method can enable mobile robots to learn spatial concepts through active exploration. Our experiment demonstrated the effectiveness of the SpCoAE in efficiently determining a destination for learning appropriate spatial concepts in home environments.
47.3ROMar 30
Reducing Mental Workload through On-Demand Human Assistance for Physical Action Failures in LLM-based Multi-Robot CoordinationShoichi Hasegawa, Akira Taniguchi, Lotfi El Hafi et al.
Multi-robot coordination based on large language models (LLMs) has attracted growing attention, since LLMs enable the direct translation of natural language instructions into robot action plans by decomposing tasks and generating high-level plans. However, recovering from physical execution failures remains difficult, and tasks often stagnate due to the repetition of the same unsuccessful actions. While frameworks for remote robot operation using Mixed Reality were proposed, there have been few attempts to implement remote error resolution specifically for physical failures in multi-robot environments. In this study, we propose REPAIR (Robot Execution with Planned And Interactive Recovery), a human-in-the-loop framework that integrates remote error resolution into LLM-based multi-robot planning. In this method, robots execute tasks autonomously; however, when an irrecoverable failure occurs, the LLM requests assistance from an operator, enabling task continuity through remote intervention. Evaluations using a multi-robot trash collection task in a real-world environment confirmed that REPAIR significantly improves task progress (the number of items cleared within a time limit) compared to fully autonomous methods. Furthermore, for easily collectable items, it achieved task progress equivalent to full remote control. The results also suggested that the mental workload on the operator may differ in terms of physical demand and effort. The project website is https://emergentsystemlabstudent.github.io/REPAIR/.
ROSep 16, 2025
Toward Ownership Understanding of Objects: Active Question Generation with Large Language Model and Probabilistic Generative ModelSaki Hashimoto, Shoichi Hasegawa, Tomochika Ishikawa et al.
Robots operating in domestic and office environments must understand object ownership to correctly execute instructions such as ``Bring me my cup.'' However, ownership cannot be reliably inferred from visual features alone. To address this gap, we propose Active Ownership Learning (ActOwL), a framework that enables robots to actively generate and ask ownership-related questions to users. ActOwL employs a probabilistic generative model to select questions that maximize information gain, thereby acquiring ownership knowledge efficiently to improve learning efficiency. Additionally, by leveraging commonsense knowledge from Large Language Models (LLM), objects are pre-classified as either shared or owned, and only owned objects are targeted for questioning. Through experiments in a simulated home environment and a real-world laboratory setting, ActOwL achieved significantly higher ownership clustering accuracy with fewer questions than baseline methods. These findings demonstrate the effectiveness of combining active inference with LLM-guided commonsense reasoning, advancing the capability of robots to acquire ownership knowledge for practical and socially appropriate task execution.
ROSep 16, 2025
Multi-Robot Task Planning for Multi-Object Retrieval Tasks with Distributed On-Site Knowledge via Large Language ModelsKento Murata, Shoichi Hasegawa, Tomochika Ishikawa et al.
It is crucial to efficiently execute instructions such as "Find an apple and a banana" or "Get ready for a field trip," which require searching for multiple objects or understanding context-dependent commands. This study addresses the challenging problem of determining which robot should be assigned to which part of a task when each robot possesses different situational on-site knowledge-specifically, spatial concepts learned from the area designated to it by the user. We propose a task planning framework that leverages large language models (LLMs) and spatial concepts to decompose natural language instructions into subtasks and allocate them to multiple robots. We designed a novel few-shot prompting strategy that enables LLMs to infer required objects from ambiguous commands and decompose them into appropriate subtasks. In our experiments, the proposed method achieved 47/50 successful assignments, outperforming random (28/50) and commonsense-based assignment (26/50). Furthermore, we conducted qualitative evaluations using two actual mobile manipulators. The results demonstrated that our framework could handle instructions, including those involving ad hoc categories such as "Get ready for a field trip," by successfully performing task decomposition, assignment, sequential planning, and execution.
ROApr 15, 2024
Real-world Instance-specific Image Goal Navigation: Bridging Domain Gaps via Contrastive LearningTaichi Sakaguchi, Akira Taniguchi, Yoshinobu Hagiwara et al.
Improving instance-specific image goal navigation (InstanceImageNav), which locates the identical object in a real-world environment from a query image, is essential for robotic systems to assist users in finding desired objects. The challenge lies in the domain gap between low-quality images observed by the moving robot, characterized by motion blur and low-resolution, and high-quality query images provided by the user. Such domain gaps could significantly reduce the task success rate but have not been the focus of previous work. To address this, we propose a novel method called Few-shot Cross-quality Instance-aware Adaptation (CrossIA), which employs contrastive learning with an instance classifier to align features between massive low- and few high-quality images. This approach effectively reduces the domain gap by bringing the latent representations of cross-quality images closer on an instance basis. Additionally, the system integrates an object image collection with a pre-trained deblurring model to enhance the observed image quality. Our method fine-tunes the SimSiam model, pre-trained on ImageNet, using CrossIA. We evaluated our method's effectiveness through an InstanceImageNav task with 20 different types of instances, where the robot identifies the same instance in a real-world environment as a high-quality query image. Our experiments showed that our method improves the task success rate by up to three times compared to the baseline, a conventional approach based on SuperGlue. These findings highlight the potential of leveraging contrastive learning and image enhancement techniques to bridge the domain gap and improve object localization in robotic applications. The project website is https://emergentsystemlabstudent.github.io/DomainBridgingNav/.
ROMar 16, 2021
Map completion from partial observation using the global structure of multiple environmental mapsYuki Katsumata, Akinori Kanechika, Akira Taniguchi et al.
Using the spatial structure of various indoor environments as prior knowledge, the robot would construct the map more efficiently. Autonomous mobile robots generally apply simultaneous localization and mapping (SLAM) methods to understand the reachable area in newly visited environments. However, conventional mapping approaches are limited by only considering sensor observation and control signals to estimate the current environment map. This paper proposes a novel SLAM method, map completion network-based SLAM (MCN-SLAM), based on a probabilistic generative model incorporating deep neural networks for map completion. These map completion networks are primarily trained in the framework of generative adversarial networks (GANs) to extract the global structure of large amounts of existing map data. We show in experiments that the proposed method can estimate the environment map 1.3 times better than the previous SLAM methods in the situation of partial observation.
ROFeb 10, 2020
Autonomous Planning Based on Spatial Concepts to Tidy Up Home Environments with Service RobotsAkira Taniguchi, Shota Isobe, Lotfi El Hafi et al.
Tidy-up tasks by service robots in home environments are challenging in robotics applications because they involve various interactions with the environment. In particular, robots are required not only to grasp, move, and release various home objects but also to plan the order and positions for placing the objects. In this paper, we propose a novel planning method that can efficiently estimate the order and positions of the objects to be tidied up by learning the parameters of a probabilistic generative model. The model allows a robot to learn the distributions of the co-occurrence probability of the objects and places to tidy up using the multimodal sensor information collected in a tidied environment. Additionally, we develop an autonomous robotic system to perform the tidy-up operation. We evaluate the effectiveness of the proposed method by an experimental simulation that reproduces the conditions of the Tidy Up Here task of the World Robot Summit 2018 international robotics competition. The simulation results show that the proposed method enables the robot to successively tidy up several objects and achieves the best task score among the considered baseline tidy-up methods.