Minoru Asada

RO
h-index: 25
11 papers
40 citations
Novelty: 51%
AI Score: 39

11 Papers

RO · Oct 20, 2023
Correspondence learning between morphologically different robots via task demonstrations

Hakan Aktas, Yukie Nagai, Minoru Asada et al.

We observe a large variety of robots in terms of their bodies, sensors, and actuators. Given the commonalities in their skill sets, teaching each skill to each robot independently is inefficient and does not scale across such a varied robotic landscape. If we can learn the correspondences between the sensorimotor spaces of different robots, we can expect that a skill learned by one robot can be transferred more directly and easily to other robots. In this paper, we propose a method to learn correspondences among two or more robots that may have different morphologies. Specifically, besides robots with similar morphologies but different degrees of freedom, we show that a fixed-base manipulator robot with joint control and a differential-drive mobile robot can be addressed within the proposed framework. To set up the correspondence among the robots considered, an initial base task is demonstrated to the robots to achieve the same goal. Then, a common latent representation is learned along with the individual robot policies for achieving the goal. After the initial learning stage, observing a new task execution by one robot is sufficient to generate a latent space representation that lets the other robots achieve the same task. We verified our system in a set of experiments where the correspondence between robots is learned (1) when the robots need to follow the same paths to achieve the same task, (2) when the robots need to follow different trajectories to achieve the same task, and (3) when the complexities of the required sensorimotor trajectories differ between the robots. We also provide a proof-of-concept realization of correspondence learning between a real manipulator robot and a simulated mobile robot.

RO · Mar 16
Exploring the dynamic properties and motion reproducibility of a small upper-body humanoid robot with 13-DOF pneumatic actuation for data-driven control

Hiroshi Atsuta, Hisashi Ishihara, Minoru Asada

Pneumatically-actuated anthropomorphic robots with high degrees of freedom (DOF) offer significant potential for physical human-robot interaction. However, precise control of pneumatic actuators is challenging due to their inherent nonlinearities. This paper presents the development of a compact 13-DOF upper-body humanoid robot. To assess the feasibility of an effective controller, we first investigate its key dynamic properties, such as actuation time delays, and confirm that the system exhibits highly reproducible behavior. Leveraging this reproducibility, we implement a preliminary data-driven controller for a 4-DOF arm subsystem based on a multilayer perceptron with explicit time delay compensation. The network was trained on random movement data to generate pressure commands for tracking arbitrary trajectories. Comparative evaluations with a traditional PID controller demonstrate superior trajectory tracking performance, highlighting the potential of data-driven approaches for controlling complex, high-DOF pneumatic robots.

RO · Apr 1, 2025
Interleaved Multitask Learning with Energy Modulated Learning Progress

Hanne Say, Suzan Ece Ada, Emre Ugur et al.

Just as humans learn new skills and apply their existing knowledge while maintaining previously learned information, "continual learning" in machine learning aims to incorporate new data while retaining and utilizing past knowledge. However, existing machine learning methods often do not mimic human learning, in which tasks are intermixed due to individual preferences and environmental conditions. Humans typically switch between tasks instead of completely mastering one task before proceeding to the next. To explore how human-like task switching can enhance learning efficiency, we propose a multitask learning architecture that alternates tasks based on task-agnostic measures such as "learning progress" and "neural computational energy expenditure". To evaluate the efficacy of our method, we run several systematic experiments using a set of effect-prediction tasks executed by a simulated manipulator robot. The experiments show that our approach surpasses random interleaved and sequential task learning in terms of average learning accuracy. Moreover, by including energy expenditure in the task-switching logic, our approach can still perform favorably while reducing neural energy expenditure.
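
The task-switching idea in this abstract can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so everything here is an assumption made for illustration: learning progress is taken as the drop in mean prediction error between two consecutive windows, energy is a fixed per-task cost, and the names `learning_progress`, `select_task`, and `energy_weight` are hypothetical.

```python
def learning_progress(errors, window=3):
    """Drop in mean error from the previous window to the most recent one.

    Assumed definition for illustration; tasks with too little history
    return +inf so that they are tried first.
    """
    if len(errors) < 2 * window:
        return float("inf")
    recent = sum(errors[-window:]) / window
    previous = sum(errors[-2 * window:-window]) / window
    return previous - recent


def select_task(error_history, energy_cost, energy_weight=0.0, window=3):
    """Pick the task with the best progress-minus-energy score (hypothetical rule)."""
    def score(task):
        progress = learning_progress(error_history[task], window)
        return progress - energy_weight * energy_cost[task]
    return max(error_history, key=score)


# Toy data: "push" is improving quickly but is expensive to train on,
# "lift" is improving slowly but is cheap.
history = {
    "push": [0.9, 0.8, 0.7, 0.5, 0.4, 0.3],
    "lift": [0.6, 0.6, 0.6, 0.55, 0.55, 0.55],
}
energy = {"push": 5.0, "lift": 1.0}

progress_only = select_task(history, energy)
energy_aware = select_task(history, energy, energy_weight=0.1)
print(progress_only, energy_aware)
```

With these toy numbers, the progress-only rule favors the fast-improving task, while weighting in the energy cost shifts the choice to the cheaper one. This is the kind of trade-off the abstract describes, not a reproduction of its method.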

LG · Jun 5, 2024
Oscillations enhance time-series prediction in reservoir computing with feedback

Yuji Kawai, Takashi Morita, Jihoon Park et al.

Reservoir computing, a machine learning framework used for modeling the brain, can predict temporal data from few observations and with minimal computational resources. However, it is difficult to accurately reproduce a long-term target time series because the reservoir system becomes unstable. This predictive capability is required for a wide variety of time-series processing, including predictions of motor timing and chaotic dynamical systems. This study proposes oscillation-driven reservoir computing (ODRC) with feedback, where oscillatory signals are fed into a reservoir network to stabilize the network activity and induce complex reservoir dynamics. The ODRC can reproduce long-term target time series more accurately than conventional reservoir computing methods in motor timing and chaotic time-series prediction tasks. Furthermore, it generates a time series similar to the target in the unexperienced period; that is, it can learn the abstract generative rules from limited observations. Given these significant improvements made by the simple and computationally inexpensive implementation, the ODRC could serve as a practical model of various time-series data. Moreover, we discuss the biological implications of the ODRC, considering it as a model of neural oscillations and their cerebellar processors.

RO · Apr 24, 2024
Cross-Embodied Affordance Transfer through Learning Affordance Equivalences

Hakan Aktas, Yukie Nagai, Minoru Asada et al.

Affordances represent the inherent effect and action possibilities that objects offer to the agents within a given context. From a theoretical viewpoint, affordances bridge the gap between effect and action, providing a functional understanding of the connections between the actions of an agent and its environment in terms of the effects it can cause. In this study, we propose a deep neural network model that unifies objects, actions, and effects into a single latent vector in a common latent space that we call the affordance space. Using the affordance space, our system can generate effect trajectories when action and object are given and can generate action trajectories when effect trajectories and objects are given. Our model does not learn the behavior of individual objects acted upon by a single agent. Rather, it forms a 'shared affordance representation' spanning multiple agents and objects, which we call Affordance Equivalence. Affordance Equivalence facilitates not only action generalization over objects but also cross-embodiment transfer, linking the actions of different robots. In addition to the simulation experiments that demonstrate the proposed model's range of capabilities, we also show that our model can be used for direct imitation in real-world settings.

RO · Jun 18, 2021
High-level Features for Resource Economy and Fast Learning in Skill Transfer

Alper Ahmetoglu, Emre Ugur, Minoru Asada et al.

Abstraction is an important aspect of intelligence which enables agents to construct robust representations for effective decision making. In the last decade, deep networks have proven effective due to their ability to form increasingly complex abstractions. However, these abstractions are distributed over many neurons, making the re-use of a learned skill costly. Previous work either enforced the formation of abstractions, creating a designer bias, or used a large number of neural units without investigating how to obtain high-level features that may more effectively capture the source task. To avoid designer bias and unsparing resource use, we propose to exploit neural response dynamics to form compact representations for use in skill transfer. To this end, we consider two competing methods based on (1) the maximum information compression principle and (2) the notion that abstract events tend to generate slowly changing signals, and apply them to the neural signals generated during task execution. Concretely, in our simulation experiments, we apply either principal component analysis (PCA) or slow feature analysis (SFA) to the signals collected from the last hidden layer of a deep network while it performs a source task, and use these features for skill transfer in a new target task. We compare the generalization performance of these alternatives with the baselines of skill transfer with full layer output and no-transfer settings. Our results show that SFA units are the most successful for skill transfer. SFA, as well as PCA, uses fewer resources than the usual skill transfer, and many of the units formed show a localized response reflecting end-effector-obstacle-goal relations. Finally, SFA units with the lowest eigenvalues resemble symbolic representations that highly correlate with high-level features such as joint angles, which might be thought of as precursors for fully symbolic systems.

DM · Jul 6, 2020
On the weight and density bounds of polynomial threshold functions

Erhan Oztop, Minoru Asada

In this report, we show that every n-variable Boolean function can be represented as a polynomial threshold function (PTF) with at most $0.75 \times 2^n$ non-zero integer coefficients, and we give an upper bound on the absolute value of these coefficients. To our knowledge, this provides the best known bound on both the PTF density (number of monomials) and weight (sum of the coefficient magnitudes) of general Boolean functions. The special case of Bent functions is also analyzed, and it is shown that any n-variable Bent function can be represented with integer coefficients less than $2^n$ while also obeying the aforementioned density bound. Finally, sparse Boolean functions, which are almost constant except for $m \ll 2^n$ variable assignments, are shown to have small-weight PTFs with density at most $m+2^{n-1}$.
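
To make the terms density and weight concrete, here is a small sketch that checks two hand-picked PTFs for n = 2 by brute force. Several things are assumptions for illustration rather than details from the report: inputs and outputs are encoded as -1/+1 (with +1 meaning true), the constant term is counted as a monomial, and the helper names (`evaluate`, `is_ptf_of`, `density`, `weight`) are hypothetical.

```python
from itertools import product
from math import prod


def evaluate(poly, x):
    """Evaluate a multilinear polynomial {tuple of variable indices: integer coefficient}."""
    return sum(c * prod(x[i] for i in mono) for mono, c in poly.items())


def is_ptf_of(poly, f, n):
    """True if sign(poly(x)) equals f(x) on every x in {-1,+1}^n and poly is never zero."""
    for x in product((-1, 1), repeat=n):
        value = evaluate(poly, x)
        if value == 0 or (1 if value > 0 else -1) != f(x):
            return False
    return True


def density(poly):
    """Number of monomials with a non-zero coefficient."""
    return sum(1 for c in poly.values() if c != 0)


def weight(poly):
    """Sum of the coefficient magnitudes."""
    return sum(abs(c) for c in poly.values())


def and_fn(x):
    return 1 if x == (1, 1) else -1


def xor_fn(x):
    return 1 if x[0] != x[1] else -1


and_poly = {(): -1, (0,): 1, (1,): 1}  # x0 + x1 - 1
xor_poly = {(0, 1): -1}                # -x0 * x1

for name, poly, fn in [("AND", and_poly, and_fn), ("XOR", xor_poly, xor_fn)]:
    print(name, is_ptf_of(poly, fn, 2), density(poly), weight(poly))
```

For n = 2 the stated density bound is $0.75 \times 2^2 = 3$. Under the counting convention assumed here, the AND polynomial meets that bound exactly with three monomials, while XOR needs only one.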

LG · Nov 1, 2019
Situated GAIL: Multitask imitation using task-conditioned adversarial inverse reinforcement learning

Kyoichiro Kobayashi, Takato Horii, Ryo Iwaki et al.

Generative adversarial imitation learning (GAIL) has attracted increasing attention in the field of robot learning. It enables robots to learn a policy to achieve a task demonstrated by an expert while simultaneously estimating the reward function behind the expert's behaviors. However, this framework is limited to learning a single task with a single reward function. This study proposes an extended framework called situated GAIL (S-GAIL), in which a task variable is introduced to both the discriminator and generator of the GAIL framework. The task variable has the roles of discriminating different contexts and making the framework learn different reward functions and policies for multiple tasks. To achieve the early convergence of learning and robustness during reward estimation, we introduce a term to adjust the entropy regularization coefficient in the generator's objective function. Our experiments using two setups (navigation in a discrete grid world and arm reaching in a continuous space) demonstrate that the proposed framework can acquire multiple reward functions and policies more effectively than existing frameworks. The task variable enables our framework to differentiate contexts while sharing common knowledge among multiple tasks.

LG · Nov 21, 2018
Compensated Integrated Gradients to Reliably Interpret EEG Classification

Kazuki Tachikawa, Yuji Kawai, Jihoon Park et al.

The integrated gradients method is widely employed to evaluate the contribution of input features in classification models because it satisfies the axioms for attribution of prediction. This method, however, requires an appropriate baseline for reliable determination of the contributions. We propose a compensated integrated gradients method that does not require a baseline. Specifically, the method compensates the attributions calculated by integrated gradients at an arbitrary baseline using Shapley sampling. We prove that the method retrieves reliable attributions if the processes of input features in a classifier are mutually independent and identical, as with shared weights in convolutional neural networks. Using three electroencephalogram datasets, we experimentally demonstrate that the attributions of the proposed method are more reliable than those of the original integrated gradients, and that its computational complexity is much lower than that of Shapley sampling.
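
The baseline dependence that motivates this work can be seen in a minimal sketch of the original integrated gradients method. The compensated variant and the Shapley sampling step are not shown. The toy `model`, the finite-difference gradient, and the midpoint-rule integration are all assumptions chosen to keep the example self-contained.

```python
def numerical_gradient(f, x, eps=1e-6):
    """Central-difference gradient of f at x (stand-in for automatic differentiation)."""
    grad = []
    for i in range(len(x)):
        hi, lo = list(x), list(x)
        hi[i] += eps
        lo[i] -= eps
        grad.append((f(hi) - f(lo)) / (2 * eps))
    return grad


def integrated_gradients(f, x, baseline, steps=200):
    """Approximate IG_i = (x_i - b_i) * integral over a in [0, 1] of df/dx_i(b + a * (x - b))."""
    totals = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps  # midpoint rule on the straight path
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i, g in enumerate(numerical_gradient(f, point)):
            totals[i] += g
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, totals)]


def model(x):
    """Toy differentiable 'classifier score' used only for illustration."""
    return x[0] ** 2 + 3.0 * x[1]


x = [2.0, 1.0]
attr_zero = integrated_gradients(model, x, baseline=[0.0, 0.0])
attr_shifted = integrated_gradients(model, x, baseline=[1.0, 0.0])
print(attr_zero, sum(attr_zero))
print(attr_shifted, sum(attr_shifted))
```

In both runs the attributions sum to the difference between the model output at the input and at the baseline (the completeness axiom), but the share assigned to the first feature changes with the baseline. That sensitivity is the problem the abstract sets out to remove.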

AI · Oct 10, 2017
On- and Off-Policy Monotonic Policy Improvement

Ryo Iwaki, Minoru Asada

Monotonic policy improvement and off-policy learning are two main desirable properties for reinforcement learning algorithms. In this paper, by lower bounding the performance difference of two policies, we show that monotonic policy improvement is guaranteed from a mixture of on- and off-policy samples. An optimization procedure which applies the proposed bound can be regarded as an off-policy natural policy gradient method. To support the theoretical result, we provide a trust region policy optimization method using experience replay as a naive application of our bound, and evaluate its performance on two classical benchmark problems.

LG · Oct 21, 2014
Where do goals come from? A Generic Approach to Autonomous Goal-System Development

Matthias Rolf, Minoru Asada

Goals express agents' intentions and allow them to organize their behavior based on low-dimensional abstractions of high-dimensional world states. How can agents develop such goals autonomously? This paper proposes a detailed conceptual and computational account of this longstanding problem. We argue that goals should be considered high-level abstractions of lower-level intention mechanisms such as rewards and values, and point out that goals need to be considered alongside the detection of the effects of the agent's own actions. We propose Latent Goal Analysis as a computational learning formulation thereof, and show constructively that any reward or value function can be explained by goals and such self-detection as latent mechanisms. We first show that learned goals provide a highly effective dimensionality reduction in a practical reinforcement learning problem. Then, we investigate a developmental scenario in which entirely task-unspecific rewards induced by visual saliency lead to self and goal representations that constitute goal-directed reaching.