Yoshihiko Nakamura

RO
10papers
513citations
Novelty57%
AI Score46

10 Papers

43.8ROMar 17
Enforcing Task-Specified Compliance Bounds for Humanoids via Anisotropic Lipschitz-Constrained Policies

Zewen He, Yoshihiko Nakamura

Reinforcement learning (RL) has demonstrated substantial potential for humanoid bipedal locomotion and the control of complex motions. To cope with oscillations and impacts induced by environmental interactions, compliant control is widely regarded as an effective remedy. However, the model-free nature of RL makes it difficult to impose task-specified and quantitatively verifiable compliance objectives, and classical model-based stiffness designs are not directly applicable. Lipschitz-Constrained Policies (LCP), which regularize the local sensitivity of a policy via gradient penalties, have recently been used to smooth humanoid motions. Nevertheless, existing LCP-based methods typically employ a single scalar Lipschitz budget and lack an explicit connection to physically meaningful compliance specifications in real-world systems. In this study, we propose an anisotropic Lipschitz-constrained policy (ALCP) that maps a task-space stiffness upper bound to a state-dependent Lipschitz-style constraint on the policy Jacobian. The resulting constraint is enforced during RL training via a hinge-squared spectral-norm penalty, preserving physical interpretability while enabling direction-dependent compliance. Experiments on humanoid robots show that ALCP improves locomotion stability and impact robustness, while reducing oscillations and energy usage.

ROMar 9
Choose What to Observe: Task-Aware Semantic-Geometric Representations for Visuomotor Policy

Haoran Ding, Liang Ma, Yaxun Yang et al.

Visuomotor policies learned from demonstrations often overfit to nuisance visual factors in raw RGB observations, resulting in brittle behavior under appearance shifts such as background changes and object recoloring. We propose a task-aware observation interface that canonicalizes visual input into a shared representation, improving robustness to out-of-distribution (OOD) appearance changes without modifying or fine-tuning the policy. Given an RGB image and an open-vocabulary specification of task-relevant entities, we use SAM3 to segment the target object and robot/gripper. We construct an L0 observation by repainting segmented entities with predefined semantic colors on a constant background. For tasks requiring stronger geometric cues, we further inject monocular depth from Depth Anything 3 into the segmented regions via depth-guided overwrite, yielding a unified semantic--geometric observation (L1) that remains a standard 3-channel, image-like input. We evaluate on RoboMimic (Lift), ManiSkill YCB grasping under clutter, four RLBench tasks under controlled appearance shifts, and two real-world Franka tasks (ReachX and CloseCabinet). Across benchmarks and policy backbones (Flow Matching Policy and SmolVLA), our interface preserves in-distribution performance while substantially improving robustness under OOD visual shifts.

CVJul 4, 2020
SplitFusion: Simultaneous Tracking and Mapping for Non-Rigid Scenes

Yang Li, Tianwei Zhang, Yoshihiko Nakamura et al.

We present SplitFusion, a novel dense RGB-D SLAM framework that simultaneously performs tracking and dense reconstruction for both rigid and non-rigid components of the scene. SplitFusion first adopts deep learning based semantic instant segmentation technique to split the scene into rigid or non-rigid surfaces. The split surfaces are independently tracked via rigid or non-rigid ICP and reconstructed through incremental depth map fusion. Experimental results show that the proposed approach can provide not only accurate environment maps but also well-reconstructed non-rigid targets, e.g. the moving humans.

ROMar 11, 2020
FlowFusion: Dynamic Dense RGB-D SLAM Based on Optical Flow

Tianwei Zhang, Huayan Zhang, Yang Li et al.

Dynamic environments are challenging for visual SLAM since the moving objects occlude the static environment features and lead to wrong camera motion estimation. In this paper, we present a novel dense RGB-D SLAM solution that simultaneously accomplishes the dynamic/static segmentation and camera ego-motion estimation as well as the static background reconstructions. Our novelty is using optical flow residuals to highlight the dynamic semantics in the RGB-D point clouds and provide more accurate and efficient dynamic/static segmentation for camera tracking and background reconstruction. The dense reconstruction results on public datasets and real dynamic scenes indicate that the proposed approach achieved accurate and efficient performances in both dynamic and static environments compared to state-of-the-art approaches.

CVJan 16, 2020
Synergetic Reconstruction from 2D Pose and 3D Motion for Wide-Space Multi-Person Video Motion Capture in the Wild

Takuya Ohashi, Yosuke Ikegami, Yoshihiko Nakamura

Although many studies have investigated markerless motion capture, the technology has not been applied to real sports or concerts. In this paper, we propose a markerless motion capture method with spatiotemporal accuracy and smoothness from multiple cameras in wide-space and multi-person environments. The proposed method predicts each person's 3D pose and determines the bounding box of multi-camera images small enough. This prediction and spatiotemporal filtering based on human skeletal model enables 3D reconstruction of the person and demonstrates high-accuracy. The accurate 3D reconstruction is then used to predict the bounding box of each camera image in the next frame. This is feedback from the 3D motion to 2D pose, and provides a synergetic effect on the overall performance of video motion capture. We evaluated the proposed method using various datasets and a real sports field. The experimental results demonstrate that the mean per joint position error (MPJPE) is 31.5 mm and the percentage of correct parts (PCP) is 99.5% for five people dynamically moving while satisfying the range of motion (RoM). Video demonstration, datasets, and additional materials are posted on our project page.

RODec 9, 2019
Video Motion Capture from the Part Confidence Maps of Multi-Camera Images by Spatiotemporal Filtering Using the Human Skeletal Model

Takuya Ohashi, Yosuke Ikegami, Kazuki Yamamoto et al.

This paper discusses video motion capture, namely, 3D reconstruction of human motion from multi-camera images. After the Part Confidence Maps are computed from each camera image, the proposed spatiotemporal filter is applied to deliver the human motion data with accuracy and smoothness for human motion analysis. The spatiotemporal filter uses the human skeleton and mixes temporal smoothing in two-time inverse kinematics computations. The experimental results show that the mean per joint position error was 26.1mm for regular motions and 38.8mm for inverted motions.

RONov 17, 2015
Completeness of Randomized Kinodynamic Planners with State-based Steering

Stéphane Caron, Quang-Cuong Pham, Yoshihiko Nakamura

Probabilistic completeness is an important property in motion planning. Although it has been established with clear assumptions for geometric planners, the panorama of completeness results for kinodynamic planners is still incomplete, as most existing proofs rely on strong assumptions that are difficult, if not impossible, to verify on practical systems. In this paper, we focus on an important class of kinodynamic planners, namely those that interpolate trajectories in the state space. We provide a proof of probabilistic completeness for these planners under assumptions that can be readily verified from the system's equations of motion and the user-defined interpolation function. Our proof relies crucially on a property of interpolated trajectories, termed second-order continuity (SOC), which we show is tightly related to the ability of a planner to benefit from denser sampling. We analyze the impact of this property in simulations on a low-torque pendulum. Our results show that a simple RRT using a second-order continuous interpolation swiftly finds solution, while it is impossible for the same planner using standard Bezier curves (which are not SOC) to find any solution.

ROOct 12, 2015
ZMP support areas for multi-contact mobility under frictional constraints

Stéphane Caron, Quang-Cuong Pham, Yoshihiko Nakamura

We propose a method for checking and enforcing multi-contact stability based on the Zero-tilting Moment Point (ZMP). The key to our development is the generalization of ZMP support areas to take into account (a) frictional constraints and (b) multiple non-coplanar contacts. We introduce and investigate two kinds of ZMP support areas. First, we characterize and provide a fast geometric construction for the support area generated by valid contact forces, with no other constraint on the robot motion. We call this set the full support area. Next, we consider the control of humanoid robots using the Linear Pendulum Mode (LPM). We observe that the constraints stemming from the LPM induce a shrinking of the support area, even for walking on horizontal floors. We propose an algorithm to compute the new area, which we call pendular support area. We show that, in the LPM, having the ZMP in the pendular support area is a necessary and sufficient condition for contact stability. Based on these developments, we implement a whole-body controller and generate feasible multi-contact motions where an HRP-4 humanoid locomotes in challenging multi-contact scenarios.

ROJan 20, 2015
Stability of Surface Contacts for Humanoid Robots: Closed-Form Formulae of the Contact Wrench Cone for Rectangular Support Areas

Stéphane Caron, Quang-Cuong Pham, Yoshihiko Nakamura

Humanoid robots locomote by making and breaking contacts with their environment. A crucial problem is therefore to find precise criteria for a given contact to remain stable or to break. For rigid surface contacts, the most general criterion is the Contact Wrench Condition (CWC). To check whether a motion satisfies the CWC, existing approaches take into account a large number of individual contact forces (for instance, one at each vertex of the support polygon), which is computationally costly and prevents the use of efficient inverse-dynamics methods. Here we argue that the CWC can be explicitly computed without reference to individual contact forces, and give closed-form formulae in the case of rectangular surfaces -- which is of practical importance. It turns out that these formulae simply and naturally express three conditions: (i) Coulomb friction on the resultant force, (ii) ZMP inside the support area, and (iii) bounds on the yaw torque. Conditions (i) and (ii) are already known, but condition (iii) is, to the best of our knowledge, novel. It is also of particular interest for biped locomotion, where undesired foot yaw rotations are a known issue. We also show that our formulae yield simpler and faster computations than existing approaches for humanoid motions in single support, and demonstrate their consistency in the OpenHRP simulator.

RONov 14, 2014
Admissible Velocity Propagation : Beyond Quasi-Static Path Planning for High-Dimensional Robots

Quang-Cuong Pham, Stéphane Caron, Puttichai Lertkultanon et al.

Path-velocity decomposition is an intuitive yet powerful approach to address the complexity of kinodynamic motion planning. The difficult trajectory planning problem is solved in two separate, simpler, steps: first, find a path in the configuration space that satisfies the geometric constraints (path planning), and second, find a time-parameterization of that path satisfying the kinodynamic constraints. A fundamental requirement is that the path found in the first step should be time-parameterizable. Most existing works fulfill this requirement by enforcing quasi-static constraints in the path planning step, resulting in an important loss in completeness. We propose a method that enables path-velocity decomposition to discover truly dynamic motions, i.e. motions that are not quasi-statically executable. At the heart of the proposed method is a new algorithm -- Admissible Velocity Propagation -- which, given a path and an interval of reachable velocities at the beginning of that path, computes exactly and efficiently the interval of all the velocities the system can reach after traversing the path while respecting the system kinodynamic constraints. Combining this algorithm with usual sampling-based planners then gives rise to a family of new trajectory planners that can appropriately handle kinodynamic constraints while retaining the advantages associated with path-velocity decomposition. We demonstrate the efficiency of the proposed method on some difficult kinodynamic planning problems, where, in particular, quasi-static methods are guaranteed to fail.