RO · May 26, 2022
Multi-Phase Multi-Objective Dexterous Manipulation with Adaptive Hierarchical Curriculum
Lingfeng Tao, Jiucai Zhang, Xiaoli Zhang
Dexterous manipulation tasks usually have multiple objectives, and the priorities of these objectives may vary across the phases of a manipulation task. Varying priorities make it difficult, or even impossible, for a robot to learn an optimal policy with a deep reinforcement learning (DRL) method. To solve this problem, we develop a novel Adaptive Hierarchical Reward Mechanism (AHRM) to guide the DRL agent to learn manipulation tasks with multiple prioritized objectives. The AHRM can determine the objective priorities during the learning process and update the reward hierarchy to adapt to the changing objective priorities at different phases. The proposed method is validated in a multi-objective manipulation task with a JACO robot arm in which the robot needs to manipulate a target surrounded by obstacles. The simulation and physical experiment results show that the proposed method improves both task performance and learning efficiency.
RO · May 26, 2022
Physics-Guided Hierarchical Reward Mechanism for Learning-Based Robotic Grasping
Yunsik Jung, Lingfeng Tao, Michael Bowman et al.
Learning-based grasping can afford real-time grasp motion planning for multi-fingered robotic hands thanks to its high computational efficiency. However, learning-based methods must explore large search spaces during the learning process. The large search space causes low learning efficiency, which has been the main barrier to practical adoption. In addition, the trained policy generalizes poorly to objects that differ from the training objects. In this work, we develop a novel Physics-Guided Deep Reinforcement Learning method with a Hierarchical Reward Mechanism to improve learning efficiency and generalizability for learning-based autonomous grasping. Unlike conventional observation-based grasp learning, physics-informed metrics are utilized to convey correlations between features associated with hand structures and objects to improve learning efficiency and outcomes. Further, the hierarchical reward mechanism enables the robot to learn prioritized components of the grasping tasks. Our method is validated in robotic grasping tasks with a 3-finger MICO robot arm. The results show that our method outperforms standard Deep Reinforcement Learning methods in various robotic grasping tasks.
CV · Nov 2, 2025
In-Context-Learning-Assisted Quality Assessment Vision-Language Models for Metal Additive Manufacturing
Qiaojie Zheng, Jiucai Zhang, Xiaoli Zhang
Vision-based quality assessment in additive manufacturing often requires dedicated machine learning models and application-specific datasets. However, data collection and model training can be expensive and time-consuming. In this paper, we leverage vision-language models' (VLMs') reasoning capabilities to assess the quality of printed parts and introduce in-context learning (ICL) to provide VLMs with necessary application-specific knowledge and demonstration samples. This method eliminates the requirement for large application-specific datasets for training models. We explored different sampling strategies for ICL to search for the optimal configuration that makes use of limited samples. We evaluated these strategies on two VLMs, Gemini-2.5-flash and Gemma3:27b, with quality assessment tasks in wire-laser direct energy deposition processes. The results show that ICL-assisted VLMs can reach quality classification accuracies similar to those of traditional machine learning models while requiring only a minimal number of samples. In addition, unlike traditional classification models that lack transparency, VLMs can generate human-interpretable rationales to enhance trust. Since there are no metrics to evaluate their interpretability in manufacturing applications, we propose two metrics, knowledge relevance and rationale validity, to evaluate the quality of VLMs' supporting rationales. Our results show that ICL-assisted VLMs can address application-specific tasks with limited data, achieving relatively high accuracy while also providing valid supporting rationales for improved decision transparency.
CV · Aug 20, 2025
QA-VLM: Providing human-interpretable quality assessment for wire-feed laser additive manufacturing parts with Vision Language Models
Qiaojie Zheng, Jiucai Zhang, Joy Gockel et al.
Image-based quality assessment (QA) in additive manufacturing (AM) often relies heavily on the expertise and constant attention of skilled human operators. While machine learning and deep learning methods have been introduced to assist in this task, they typically provide black-box outputs without interpretable justifications, limiting their trust and adoption in real-world settings. In this work, we introduce a novel QA-VLM framework that leverages the attention mechanisms and reasoning capabilities of vision-language models (VLMs), enriched with application-specific knowledge distilled from peer-reviewed journal articles, to generate human-interpretable quality assessments. Evaluated on 24 single-bead samples produced by laser wire direct energy deposition (DED-LW), our framework demonstrates higher validity and consistency in explanation quality than off-the-shelf VLMs. These results highlight the potential of our approach to enable trustworthy, interpretable quality assessment in AM applications.
CV · Jan 24, 2025
Enhancing accuracy of uncertainty estimation in appearance-based gaze tracking with probabilistic evaluation and calibration
Qiaojie Zheng, Jiucai Zhang, Xiaoli Zhang
Accurately knowing uncertainties in appearance-based gaze tracking is critical for ensuring reliable downstream applications. Due to the lack of individual uncertainty labels, current uncertainty-aware approaches adopt probabilistic models to acquire uncertainties by following distributions in the training dataset. Without regulation, this approach lets the uncertainty model build biases and overfit the training data, leading to poor performance when deployed. We first presented a strictly proper evaluation metric from the probabilistic perspective, based on comparing the coverage probability between prediction and observation, to provide a quantitative assessment of the inferred uncertainties. We then proposed a correction strategy based on probability calibration to mitigate biases in the estimated uncertainties of the trained models. Finally, we demonstrated the effectiveness of the correction strategy with experiments performed on two popular gaze estimation datasets with distinctive image characteristics caused by data collection settings.
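The idea of comparing coverage probability between prediction and observation can be illustrated with a minimal sketch. This is not the metric from the paper: it assumes one-dimensional gaze errors and zero-mean Gaussian predictive uncertainties, and the function names `empirical_coverage` and `coverage_gap` are hypothetical. It only shows how a nominal coverage level can be checked against the fraction of observations that actually fall inside the predicted interval.

```python
from statistics import NormalDist


def empirical_coverage(errors, sigmas, level):
    """Fraction of samples whose error lies inside the central `level`
    interval of a zero-mean Gaussian with the predicted std (assumption)."""
    # Half-width of the central interval in units of sigma, e.g. ~1.96 for 0.95.
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    inside = sum(abs(e) <= z * s for e, s in zip(errors, sigmas))
    return inside / len(errors)


def coverage_gap(errors, sigmas, levels=(0.5, 0.68, 0.9, 0.95)):
    """Observed minus nominal coverage per level.
    Negative values suggest over-confident (too small) predicted uncertainties."""
    return {p: empirical_coverage(errors, sigmas, p) - p for p in levels}


if __name__ == "__main__":
    # Toy data (made up for illustration): signed errors and predicted stds.
    errors = [0.1, -0.5, 1.2, -2.5]
    sigmas = [1.0, 1.0, 1.0, 1.0]
    print(coverage_gap(errors, sigmas))
```

A well-calibrated model would show gaps near zero at every level; a calibration step would then adjust the predicted uncertainties to shrink these gaps.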
RO · Dec 19, 2020
Forming Real-World Human-Robot Cooperation for Tasks With General Goal
Lingfeng Tao, Michael Bowman, Jiucai Zhang et al.
In human-robot cooperation, the robot cooperates with humans to accomplish a task together. Existing approaches assume the human has a specific goal during the cooperation, and the robot infers and acts toward it. However, in real-world environments, a human usually only has a general goal (e.g., general direction or area in motion planning) at the beginning of the cooperation, which needs to be clarified to a specific goal (i.e., an exact position) during cooperation. The specification process is interactive and dynamic, and depends on the environment and the partner's behavior. A robot that does not consider the goal specification process may frustrate the human partner, prolong the time needed to reach an agreement, and compromise team performance. This work presents the Evolutionary Value Learning approach to model the dynamics of the goal specification process with State-based Multivariate Bayesian Inference and goal specificity-related features. This model enables the robot to actively enhance the human's goal specification process and find a cooperative policy in a Deep Reinforcement Learning manner. Our method outperforms existing methods with faster goal specification processes and better team performance in a dynamic ball balancing task with real human subjects.
LG · Aug 28, 2020
Meta Reinforcement Learning-Based Lane Change Strategy for Autonomous Vehicles
Fei Ye, Pin Wang, Ching-Yao Chan et al.
Recent advances in supervised learning and reinforcement learning have provided new opportunities to apply related methodologies to automated driving. However, achieving automated driving maneuvers in dynamically changing environments remains challenging. Supervised learning algorithms such as imitation learning can generalize to new environments by training on a large amount of labeled data; however, it is often impractical or cost-prohibitive to obtain sufficient data for each new environment. Although reinforcement learning methods can mitigate this data-dependency issue by training the agent in a trial-and-error way, they still need to re-train policies from scratch when adapting to new environments. In this paper, we thus propose a meta reinforcement learning (MRL) method to improve the agent's ability to generalize automated lane-changing maneuvers to different traffic environments, which are formulated as different traffic congestion levels. Specifically, we train the model at light to moderate traffic densities and test it at a new heavy traffic density condition. We use both collision rate and success rate to quantify the safety and effectiveness of the proposed model. A benchmark model is developed based on a pretraining method, which uses the same network structure and training tasks as our proposed model for fair comparison. The simulation results show that the proposed method achieves an overall success rate up to 20% higher than the benchmark model when it is generalized to the new environment of heavy traffic density. The collision rate is also up to 18% lower than that of the benchmark model. Finally, the proposed model shows more stable and efficient generalization when adapting to the new environment, and it can achieve a 100% success rate and a 0% collision rate with only a few steps of gradient updates.
RO · Mar 7, 2020
An Intent-based Task-aware Shared Control Framework for Intuitive Hands Free Telemanipulation
Michael Bowman, Jiucai Zhang, Xiaoli Zhang
Shared control in teleoperation for providing robot assistance to accomplish object manipulation, called telemanipulation, is a promising yet challenging new problem. On top of the challenges of teleoperation in general, it faces unique challenges arising from the physical discrepancy between human hands and robot hands and from the fine motion constraints that determine task success. We present an intuitive shared-control strategy where the focus is on generating robotic grasp poses which are better suited for human perception of successful teleoperated object manipulation and feeling of being in control of the robot, rather than developing objective stable grasp configurations for task success or following the human motion. The former is achieved by understanding human intent and autonomously taking over control on that inference. The latter is achieved by considering human inputs as hard motion constraints which the robot must abide by. An arbitration of these two enables a trade-off for the subsequent robot motion to balance accomplishing the inferred task and motion constraints imposed by the operator. The arbitration framework adapts to the level of physical discrepancy between the human and different robot structures, enabling the assistance to indicate and appear to intuitively follow the user. To understand how users perceive good arbitration in object telemanipulation, we have conducted a user study with a hands-free telemanipulation setup to analyze the effect of factors including task predictability, perceived following, and user preference. The hands-free telemanipulation scene is chosen as the validation platform due to its more urgent need for intuitive robotic assistance for task success.
RO · Mar 7, 2020
Learn and Transfer Knowledge of Preferred Assistance Strategies in Semi-autonomous Telemanipulation
Lingfeng Tao, Michael Bowman, Xu Zhou et al.
Enabling robots to provide effective assistance while still accommodating the operator's commands for telemanipulation of an object is very challenging because the robot's assistive actions are not always intuitive for human operators, and human behaviors and preferences are sometimes ambiguous for the robot to interpret. Although various assistance approaches are being developed to improve the control quality from different optimization perspectives, the problem remains of determining the appropriate approach that satisfies both the fine motion constraints of the telemanipulation task and the preference of the operator. To address these problems, we developed a novel preference-aware assistance knowledge learning approach. An assistance preference model learns what assistance is preferred by a human, and a stagewise model updating method ensures learning stability while dealing with the ambiguity of human preference data. Such preference-aware assistance knowledge enables a teleoperated robot hand to provide more active yet preferred assistance toward manipulation success. We also developed knowledge transfer methods to transfer the preference knowledge across different robot hand structures to avoid extensive robot-specific training. Experiments were conducted in which a 3-finger hand and a 2-finger hand were each telemanipulated to use, move, and hand over a cup. Results demonstrated that the methods enabled the robots to effectively learn the preference knowledge and allowed knowledge transfer between robots with less training effort.
RO · Mar 1, 2020
Learn Task First or Learn Human Partner First: A Hierarchical Task Decomposition Method for Human-Robot Cooperation
Lingfeng Tao, Michael Bowman, Jiucai Zhang et al.
Applying Deep Reinforcement Learning (DRL) to Human-Robot Cooperation (HRC) in dynamic control problems is promising yet challenging as the robot needs to learn the dynamics of the controlled system and dynamics of the human partner. In existing research, the robot powered by DRL adopts coupled observation of the environment and the human partner to learn both dynamics simultaneously. However, such a learning strategy is limited in terms of learning efficiency and team performance. This work proposes a novel task decomposition method with a hierarchical reward mechanism that enables the robot to learn the hierarchical dynamic control task separately from learning the human partner's behavior. The method is validated with a hierarchical control task in a simulated environment with human subject experiments. Our method also provides insight into the design of the learning strategy for HRC. The results show that the robot should learn the task first to achieve higher team performance and learn the human first to achieve higher learning efficiency.
LG · Feb 7, 2020
Automated Lane Change Strategy using Proximal Policy Optimization-based Deep Reinforcement Learning
Fei Ye, Xuxin Cheng, Pin Wang et al.
Lane-change maneuvers are commonly executed by drivers to follow a certain routing plan, overtake a slower vehicle, adapt to a merging lane ahead, etc. However, improper lane change behaviors can be a major cause of traffic flow disruptions and even crashes. While many rule-based methods have been proposed to solve lane change problems for autonomous driving, they tend to exhibit limited performance due to the uncertainty and complexity of the driving environment. Machine learning-based methods offer an alternative approach, as deep reinforcement learning (DRL) has shown promising success in many application domains including robotic manipulation, navigation, and playing video games. However, applying DRL to autonomous driving still faces many practical challenges in terms of slow learning rates, sample inefficiency, and safety concerns. In this study, we propose an automated lane change strategy using proximal policy optimization-based deep reinforcement learning, which shows great advantages in learning efficiency while still maintaining stable performance. The trained agent is able to learn a smooth, safe, and efficient driving policy to make lane-change decisions (i.e., when and how) in challenging situations such as dense traffic scenarios. The effectiveness of the proposed policy is validated using the metrics of task success rate and collision rate. The simulation results demonstrate that the lane change maneuvers can be efficiently learned and executed in a safe, smooth, and efficient manner.
RO · Dec 13, 2016
Stabilization and Trajectory Control of a Quadrotor with Uncertain Suspended Load
Xu Zhou, Xiaoli Zhang, Jiucai Zhang et al.
Stabilization and trajectory control of a quadrotor carrying a suspended load with a fixed known mass have been extensively studied in recent years. However, the load mass is not always known beforehand and may vary during practical transport. This mass uncertainty brings uncertain disturbances to the quadrotor system, degrading the stability and trajectory tracking performance of existing controllers. To improve the quadrotor stability and trajectory tracking capability in this situation, we fully investigate the impacts of the uncertain load mass on the quadrotor. By comparing the performance of three different controllers (the proportional-derivative (PD) controller, the sliding mode controller (SMC), and the model predictive controller (MPC)), the main impact of load mass uncertainty is shown to be on stabilization rather than on trajectory tracking error. A critical motion mass exists for the quadrotor to maintain a desired transportation performance. Moreover, simulation results verify that a controller with strong robustness against disturbances is a good choice for practical applications.
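How an unknown load mass acts as a disturbance on a PD controller can be seen in a heavily simplified sketch. It is not the model from the paper: it assumes motion along the vertical axis only, treats the load as a point mass rigidly attached to the quadrotor (no swing), and uses made-up gains and masses. The function name `simulate_pd_hover` and all parameter values are hypothetical. The controller compensates gravity for a nominal load mass, so any mismatch leaves a steady-state altitude error of (load_mass - nominal_load) * g / kp.

```python
def simulate_pd_hover(load_mass, nominal_load=0.2, quad_mass=1.0,
                      kp=20.0, kd=8.0, target=1.0, dt=0.001, steps=20000):
    """Simulate 1D altitude hold with a PD controller plus gravity
    feedforward computed from an assumed (nominal) load mass.
    Returns the final altitude error in metres."""
    g = 9.81
    total_mass = quad_mass + load_mass       # true mass, unknown to controller
    assumed_mass = quad_mass + nominal_load  # mass the controller believes
    z, v = 0.0, 0.0                          # altitude and vertical velocity
    for _ in range(steps):
        # PD law on altitude error with feedforward for the assumed weight.
        thrust = assumed_mass * g + kp * (target - z) - kd * v
        a = thrust / total_mass - g
        # Semi-implicit Euler integration.
        v += a * dt
        z += v * dt
    return target - z


if __name__ == "__main__":
    for m in (0.2, 0.5, 0.8):
        print(f"load {m:.1f} kg -> final error {simulate_pd_hover(m):.4f} m")
```

In this toy setting the error grows linearly with the mass mismatch, which gives some intuition for why controllers with stronger disturbance rejection are attractive when the load mass is uncertain.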