Yifan Hou

h-index: 11

8 papers

166 citations

Novelty: 49%

AI Score: 34

Ranked #112,382 of 194,257 authors (top 58%) | #3,355 in RO (top 50%)

8 Papers

11.8 | RO | Jul 24, 2023

simPLE: a visuotactile method learned in simulation to precisely pick, localize, regrasp, and place objects

Maria Bauza, Antonia Bronars, Yifan Hou et al.

Existing robotic systems have a clear tension between generality and precision. Deployed solutions for robotic manipulation tend to fall into the paradigm of one robot solving a single task, lacking precise generalization, i.e., the ability to solve many tasks without compromising on precision. This paper explores solutions for precise and general pick-and-place. In precise pick-and-place, i.e., kitting, the robot transforms an unstructured arrangement of objects into an organized arrangement, which can facilitate further manipulation. We propose simPLE (simulation to Pick Localize and PLacE) as a solution to precise pick-and-place. simPLE learns to pick, regrasp and place objects precisely, given only the object CAD model and no prior experience. We develop three main components: task-aware grasping, visuotactile perception, and regrasp planning. Task-aware grasping computes affordances of grasps that are stable, observable, and favorable to placing. The visuotactile perception model relies on matching real observations against a set of simulated ones through supervised learning. Finally, we compute the desired robot motion by solving a shortest path problem on a graph of hand-to-hand regrasps. On a dual-arm robot equipped with visuotactile sensing, we demonstrate pick-and-place of 15 diverse objects with simPLE. The objects span a wide range of shapes, and simPLE achieves successful placements into structured arrangements with 1 mm clearance over 90% of the time for 6 objects, and over 80% of the time for 11 objects. Videos are available at http://mcube.mit.edu/research/simPLE.html.
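The regrasp-planning step in the abstract above is described as a shortest path problem on a graph of hand-to-hand regrasps. The following is only a minimal sketch of that idea using textbook Dijkstra search; it is not the authors' implementation. The graph format, the node names, and the edge costs are all hypothetical and chosen purely for illustration.

```python
import heapq


def shortest_regrasp_sequence(graph, start, goal):
    """Dijkstra search over a graph whose nodes are grasps and whose
    edges are regrasps (hand-to-hand handovers).

    graph: dict mapping node -> list of (neighbor, cost) pairs.
           This format is an assumption made for the sketch.
    Returns (total_cost, path), or (inf, []) if the goal is unreachable.
    """
    frontier = [(0.0, start, [start])]
    best = {start: 0.0}
    while frontier:
        cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry, a cheaper route was already found
        for neighbor, step_cost in graph.get(node, []):
            new_cost = cost + step_cost
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                heapq.heappush(frontier, (new_cost, neighbor, path + [neighbor]))
    return float("inf"), []


# Hypothetical toy graph: names and costs are illustrative only.
toy_graph = {
    "pick_grasp": [("left_hand_A", 1.0), ("left_hand_B", 2.5)],
    "left_hand_A": [("right_hand_C", 2.0)],
    "left_hand_B": [("place_grasp", 1.0)],
    "right_hand_C": [("place_grasp", 1.0)],
}
cost, path = shortest_regrasp_sequence(toy_graph, "pick_grasp", "place_grasp")
```

In this toy graph the route through `left_hand_B` wins because it needs one fewer handover, even though its first edge is more expensive.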

14.1 | CL | Nov 11, 2024

Explore the Reasoning Capability of LLMs in the Chess Testbed

Shu Wang, Lei Ji, Renxi Wang et al.

Reasoning is a central capability of human intelligence. In recent years, with the advent of large-scale datasets, pretrained large language models have emerged with new capabilities, including reasoning. However, these models still struggle with long-term, complex reasoning tasks, such as playing chess. Based on the observation that expert chess players employ a dual approach combining long-term strategic play with short-term tactical play along with language explanation, we propose improving the reasoning capability of large language models in chess by integrating annotated strategy and tactics. Specifically, we collect a dataset named MATE, which consists of 1 million chess positions with candidate moves annotated by chess experts for strategy and tactics. We finetune the LLaMA-3-8B model and compare it against state-of-the-art commercial language models in the task of selecting better chess moves. Our experiments show that our models perform better than GPT, Claude, and Gemini models. We find that language explanations can enhance the reasoning capability of large language models.

25.3 | RO | May 30, 2025

DexMachina: Functional Retargeting for Bimanual Dexterous Manipulation

Zhao Mandi, Yifan Hou, Dieter Fox et al.

We study the problem of functional retargeting: learning dexterous manipulation policies to track object states from human hand-object demonstrations. We focus on long-horizon, bimanual tasks with articulated objects, which are challenging due to the large action space, spatiotemporal discontinuities, and the embodiment gap between human and robot hands. We propose DexMachina, a novel curriculum-based algorithm. The key idea is to use virtual object controllers with decaying strength: an object is first driven automatically towards its target states, such that the policy can gradually learn to take over under motion and contact guidance. We release a simulation benchmark with a diverse set of tasks and dexterous hands, and show that DexMachina significantly outperforms baseline methods. Our algorithm and benchmark enable a functional comparison for hardware designs, and we present key findings informed by quantitative and qualitative results. With the recent surge in dexterous hand development, we hope this work will provide a useful platform for identifying desirable hardware capabilities and lower the barrier for contributing to future research. Videos and more at https://project-dexmachina.github.io/
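The idea of a virtual object controller with decaying strength can be illustrated with a very small sketch. The abstract above does not specify the decay schedule or the controller form, so everything here is an assumption: a step-wise exponential gain schedule and a purely proportional pull on a scalar object state. The function names and default values are hypothetical.

```python
def virtual_gain(iteration, initial_gain=100.0, decay=0.5, interval=10):
    """Gain of the virtual object controller at a given training iteration.

    Assumed schedule: the gain is multiplied by `decay` once every
    `interval` iterations, so the assistance fades as training proceeds.
    """
    return initial_gain * decay ** (iteration // interval)


def virtual_force(gain, target, state):
    """Proportional pull that drives the object toward its target state.

    With a large gain the object tracks the target almost on its own;
    as the gain decays, the policy has to supply the motion itself.
    """
    return gain * (target - state)


# Early in training the controller does most of the work ...
early = virtual_force(virtual_gain(0), target=1.0, state=0.5)
# ... and later it contributes far less for the same tracking error.
late = virtual_force(virtual_gain(25), target=1.0, state=0.5)
```

The same tracking error produces a smaller assisting force later in training, which is the curriculum effect the abstract describes in words.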

2.2 | RO | Nov 10, 2020

An Efficient Closed-Form Method for Optimal Hybrid Force-Velocity Control

Yifan Hou, Matthew T. Mason

This paper derives a closed-form method for computing hybrid force-velocity control. The key idea is to maximize the kinematic conditioning of the mechanical system, which includes a robot, free objects, a rigid environment and contact constraints. The method is complete, in that it always produces an optimal or near-optimal solution when a solution exists. It is efficient, since it is in closed form, avoiding the iterative search of previous work. We test the method on 78,000 randomly generated test cases. The method outperforms our previous search-based technique, running 7 to 40 times faster while consistently producing better solutions in the sense of robustness to kinematic singularity. We also test the method in several representative manipulation experiments.

13.7 | RO | Nov 3, 2020 | Code

Contact Mode Guided Sampling-Based Planning for Quasistatic Dexterous Manipulation in 2D

Xianyi Cheng, Eric Huang, Yifan Hou et al.

The discontinuities and multi-modality introduced by contacts make manipulation planning challenging. Many previous works avoid this problem by pre-designing a set of high-level motion primitives like grasping and pushing. However, such motion primitives are often not adequate to describe dexterous manipulation motions. In this work, we propose a method for dexterous manipulation planning at a more primitive level. The key idea is to use contact modes to guide the search in a sampling-based planning framework. Our method can automatically generate contact transitions and motion trajectories under the quasistatic assumption. In the experiments, this method sometimes generates motions that are often pre-designed as motion primitives, as well as dexterous motions that are more task-specific.

13.7 | RO | Jun 4, 2020

Manipulation with Shared Grasping

Yifan Hou, Zhenzhong Jia, Matthew T. Mason

A shared grasp is a grasp formed by contacts between the manipulated object and both the robot hand and the environment. By trading off hand contacts for environmental contacts, a shared grasp requires fewer contacts with the hand, and enables manipulation even when a full grasp is not possible. Previous research has used shared grasps for non-prehensile manipulation such as pivoting and tumbling. This paper treats the problem more generally, with methods to select the best shared grasp and robot actions for a desired object motion. The central issue is to evaluate the feasible contact modes: for each contact, whether that contact will remain active, and whether slip will occur. Robustness is important. When a contact mode fails, e.g., when a contact is lost, or when unintentional slip occurs, the operation will fail, and in some cases damage may occur. In this work, we enumerate all feasible contact modes, calculate corresponding controls, and select the most robust candidate. We can also optimize the contact geometry for robustness. This paper employs quasi-static analysis of planar rigid bodies with Coulomb friction to derive the algorithms and controls. Finally, we demonstrate the robustness of shared grasping and the use of our methods in representative experiments and examples. The video can be found at https://youtu.be/tyNhJvRYZNk

4.9 | RO | Dec 5, 2019

Reorienting Objects in 3D Space Using Pivoting

Yifan Hou, Zhenzhong Jia, Matthew T. Mason

We consider the problem of reorienting a rigid object with arbitrary known shape on a table using a two-finger pinch gripper. The reorienting problem is challenging because of its non-smoothness and high dimensionality. In this work, we focus on solving reorienting using pivoting, in which we allow the grasped object to rotate between fingers. Pivoting decouples the gripper rotation from the object motion, making it possible to reorient an object under strict robot workspace constraints. We provide a detailed mechanical analysis of the 3D pivoting motion on a table, which leads to simple geometric conditions for its stability. To solve reorienting problems, we introduce two motion primitives, pivot-on-support and roll-on-support, and provide an efficient hierarchical motion planning algorithm that uses them to solve for the gripper motions that reorient an object between arbitrary poses. To handle the uncertainties in modeling and perception, we make conservative plans that work in the worst case, and propose a robust control strategy for executing the motion plan. Finally, we discuss the mechanical requirements on the robot and provide a "two-phase" gripper design to implement both pivoting grasp and firm grasp. We demonstrate the effectiveness of our method in simulations and multiple experiments. Compared to traditional pick-and-place based methods, our algorithm can solve more reorienting problems while making and breaking fewer contacts.

7.6 | RO | Oct 14, 2017

Hybrid DDP in Clutter (CHDDP): Trajectory Optimization for Hybrid Dynamical System in Cluttered Environments

Shushman Choudhury, Yifan Hou, Gilwoo Lee et al.

We present an algorithm for obtaining an optimal control policy for hybrid dynamical systems in cluttered environments. To the best of our knowledge, this is the first attempt to have a locally optimal solution for this specific problem setting. Our approach extends an optimal control algorithm for hybrid dynamical systems in the obstacle-free case to environments with obstacles. Our method does not require any preset mode sequence or heuristics to prune the exponential search of mode sequences. By first solving the relaxed problem of getting an obstacle-free, dynamically feasible trajectory and then solving for both obstacle-avoidance and optimality, we can generate smooth, locally optimal control policies. We demonstrate the performance of our algorithm on a box-pushing example in a number of environments against the baseline of randomly sampling modes and actions with a Kinodynamic RRT.