David Navarro-Alarcon

RO
h-index22
34papers
811citations
Novelty46%
AI Score50

34 Papers

ROMay 27
World Models for Robotic Manipulation: A Survey

Fangyuan Wang, Ziyuan Wang, Guorui Pei et al.

Robotic manipulation depends on the ability to anticipate how actions reshape objects, contacts, and scene geometry before execution. Learned world models provide this capability by predicting task-relevant future evolution under robot intervention, yet the term now spans latent dynamics models, action-conditioned video generators, three- and four-dimensional scene predictors, physics-informed simulators, and predictive modules inside vision-language-action systems. This breadth has fragmented the literature and obscured the design choices that matter for manipulation. We survey world models for robotic manipulation through three questions: what future representation is predicted, how prediction is connected to action, and when prediction is used in the robot-learning pipeline. We operationally define a world model as an action-conditioned predictive system and distinguish it from perception modules, inverse models, policies, rewards, and value functions. We then organize existing work into five representation families, develop a functional taxonomy that separates integrated prediction-action models from explicit predictive planners, and characterize infrastructure roles including synthetic experience generation, candidate filtering, search-based evaluation, learned environments, and outcome verification. We further map these roles across pretraining, post-training, and inference adaptation, review 34 manipulation datasets, and synthesize evaluation protocols for predictive fidelity, task performance, and simulator reliability. This survey shows that world models are evolving from task-specific dynamics predictors into predictive infrastructure for robot learning, while exposing open challenges in contact modeling, hallucination control, action alignment, and benchmarking under closed-loop use.

ROApr 22
Open-H-Embodiment: A Large-Scale Dataset for Enabling Foundation Models in Medical Robotics

Open-H-Embodiment Consortium, Nigel Nelson, Juo-Tung Chen et al.

Autonomous medical robots hold promise to improve patient outcomes, reduce provider workload, democratize access to care, and enable superhuman precision. However, autonomous medical robotics has been limited by a fundamental data problem: existing medical robotic datasets are small, single-embodiment, and rarely shared openly, restricting the development of foundation models that the field needs to advance. We introduce Open-H-Embodiment, the largest open dataset of medical robotic video with synchronized kinematics to date, spanning more than 49 institutions and multiple robotic platforms including the CMR Versius, Intuitive Surgical's da Vinci, da Vinci Research Kit (dVRK), Rob Surgical BiTrack, Virtual Incision's MIRA, Moon Surgical Maestro, and a variety of custom systems, spanning surgical manipulation, robotic ultrasound, and endoscopy procedures. We demonstrate the research enabled by this dataset through two foundation models. GR00T-H is the first open foundation vision-language-action model for medical robotics, which is the only evaluated model to achieve full end-to-end task completion on a structured suturing benchmark (25% of trials vs. 0% for all others) and achieves 64% average success across a 29-step ex vivo suturing sequence. We also train Cosmos-H-Surgical-Simulator, the first action-conditioned world model to enable multi-embodiment surgical simulation from a single checkpoint, spanning nine robotic platforms and supporting in silico policy evaluation and synthetic data generation for the medical domain. These results suggest that open, large-scale medical robot data collection can serve as critical infrastructure for the research community, enabling advances in robot learning, world modeling, and beyond.

ROFeb 6
Think Proprioceptively: Embodied Visual Reasoning for VLA Manipulation

Fangyuan Wang, Peng Zhou, Jiaming Qi et al.

Vision-language-action (VLA) models typically inject proprioception only as a late conditioning signal, which prevents robot state from shaping instruction understanding and from influencing which visual tokens are attended throughout the policy. We introduce ThinkProprio, which converts proprioception into a sequence of text tokens in the VLM embedding space and fuses them with the task instruction at the input. This early fusion lets embodied state participate in subsequent visual reasoning and token selection, biasing computation toward action-critical evidence while suppressing redundant visual tokens. In a systematic ablation over proprioception encoding, state entry point, and action-head conditioning, we find that text tokenization is more effective than learned projectors, and that retaining roughly 15% of visual tokens can match the performance of using the full token set. Across CALVIN, LIBERO, and real-world manipulation, ThinkProprio matches or improves over strong baselines while reducing end-to-end inference latency over 50%.

ROMar 11, 2025
Instruction-Augmented Long-Horizon Planning: Embedding Grounding Mechanisms in Embodied Mobile Manipulation

Fangyuan Wang, Shipeng Lyu, Peng Zhou et al.

Enabling humanoid robots to perform long-horizon mobile manipulation planning in real-world environments based on embodied perception and comprehension abilities has been a longstanding challenge. With the recent rise of large language models (LLMs), there has been a notable increase in the development of LLM-based planners. These approaches either utilize human-provided textual representations of the real world or heavily depend on prompt engineering to extract such representations, lacking the capability to quantitatively understand the environment, such as determining the feasibility of manipulating objects. To address these limitations, we present the Instruction-Augmented Long-Horizon Planning (IALP) system, a novel framework that employs LLMs to generate feasible and optimal actions based on real-time sensor feedback, including grounded knowledge of the environment, in a closed-loop interaction. Distinct from prior works, our approach augments user instructions into PDDL problems by leveraging both the abstract reasoning capabilities of LLMs and grounding mechanisms. By conducting various real-world long-horizon tasks, each consisting of seven distinct manipulatory skills, our results demonstrate that the IALP system can efficiently solve these tasks with an average success rate exceeding 80%. Our proposed method can operate as a high-level planner, equipping robots with substantial autonomy in unstructured environments through the utilization of multi-modal sensor inputs.

ROOct 22, 2021
Action Planning for Packing Long Linear Elastic Objects into Compact Boxes with Bimanual Robotic Manipulation

Wanyu Ma, Bin Zhang, Lijun Han et al.

In this paper, we propose a new action planning approach to automatically pack long linear elastic objects into common-size boxes with a bimanual robotic system. For that, we developed a hybrid geometric model to handle large-scale occlusions combining an online vision-based method and an offline reference template. Then, a reference point generator is introduced to automatically plan the reference poses for the predesigned action primitives. Finally, an action planner integrates these components enabling the execution of high-level behaviors and the accomplishment of packing manipulation tasks. To validate the proposed approach, we conducted a detailed experimental study with multiple types and lengths of objects and packing boxes.

ROOct 18, 2021
Keypoint-Based Bimanual Shaping of Deformable Linear Objects under Environmental Constraints using Hierarchical Action Planning

Shengzeng Huo, Anqing Duan, Chengxi Li et al.

This paper addresses the problem of contact-based manipulation of deformable linear objects (DLOs) towards desired shapes with a dual-arm robotic system. To alleviate the burden of high-dimensional continuous state-action spaces, we model the DLO as a kinematic multibody system via our proposed keypoint detection network. This new perception network is trained on a synthetic labeled image dataset and transferred to real manipulation scenarios without conducting any manual annotations. Our goal-conditioned policy can efficiently learn to rearrange the configuration of the DLO based on the detected keypoints. The proposed hierarchical action framework tackles the manipulation problem in a coarse-to-fine manner (with high-level task planning and low-level motion control) by leveraging on two action primitives. The identification of deformation properties is avoided since the algorithm replans its motion after each bimanual execution. The conducted experimental results reveal that our method achieves high performance in state representation of the DLO, and is robust to uncertain environmental constraints.

ROOct 16, 2021
Learning Cloth Folding Tasks with Refined Flow Based Spatio-Temporal Graphs

Peng Zhou, Omar Zahra, Anqing Duan et al.

Cloth folding is a widespread domestic task that is seemingly performed by humans but which is highly challenging for autonomous robots to execute due to the highly deformable nature of textiles; It is hard to engineer and learn manipulation pipelines to efficiently execute it. In this paper, we propose a new solution for robotic cloth folding (using a standard folding board) via learning from demonstrations. Our demonstration video encoding is based on a high-level abstraction, namely, a refined optical flow-based spatiotemporal graph, as opposed to a low-level encoding such as image pixels. By constructing a new spatiotemporal graph with an advanced visual corresponding descriptor, the policy learning can focus on key points and relations with a 3D spatial configuration, which allows to quickly generalize across different environments. To further boost the policy searching, we combine optical flow and static motion saliency maps to discriminate the dominant motions for better handling the system dynamics in real-time, which aligns with the attentional motion mechanism that dominates the human imitation process. To validate the proposed approach, we analyze the manual folding procedure and developed a custom-made end-effector to efficiently interact with the folding board. Multiple experiments on a real robotic platform were conducted to validate the effectiveness and robustness of the proposed method.

IVSep 11, 2021
Follow the Curve: Robotic-Ultrasound Navigation with Learning Based Localization of Spinous Processes for Scoliosis Assessment

Maria Victorova, Michael Ka-Shing Lee, David Navarro-Alarcon et al.

The scoliosis progression in adolescents requires close monitoring to timely take treatment measures. Ultrasound imaging is a radiation-free and low-cost alternative in scoliosis assessment to X-rays, which are typically used in clinical practice. However, ultrasound images are prone to speckle noises, making it challenging for sonographers to detect bony features and follow the spine's curvature. This paper introduces a robotic-ultrasound approach for spinal curvature tracking and automatic navigation. A fully connected network with deconvolutional heads is developed to locate the spinous process efficiently with real-time ultrasound images. We use this machine learning-based method to guide the motion of the robot-held ultrasound probe and follow the spinal curvature while capturing ultrasound images and correspondent position. We developed a new force-driven controller that automatically adjusts the probe's pose relative to the skin surface to ensure a good acoustic coupling between the probe and skin. After the scanning, the acquired data is used to reconstruct the coronal spinal image, where the deformity of the scoliosis spine can be assessed and measured. To evaluate the performance of our methodology, we conducted an experimental study with human subjects where the deviations from the image center during the robotized procedure are compared to that obtained from manual scanning. The angles of spinal deformity measured on spinal reconstruction images were similar for both methods, implying that they equally reflect human anatomy.

HCSep 3, 2021
A Multi-Sensor Interface to Improve the Learning Experience in Arc Welding Training Tasks

Hoi-Yin Lee, Peng Zhou, Anqing Duan et al.

This paper presents the development of a multi-sensor user interface to facilitate the instruction of arc welding tasks. Traditional methods to acquire hand-eye coordination skills are typically conducted through one-to-one instruction where trainees must wear protective helmets and conduct several tests. This approach is inefficient as the harmful light emitted from the electric arc impedes the close monitoring of the process; Practitioners can only observe a small bright spot. To tackle these problems, recent training approaches have leveraged virtual reality to safely simulate the process and visualize the geometry of the workpieces. However, the synthetic nature of these types of simulation platforms reduces their effectiveness as they fail to comprise actual welding interactions with the environment, which hinders the trainees' learning process. To provide users with a real welding experience, we have developed a new multi-sensor extended reality platform for arc welding training. Our system is composed of: (1) An HDR camera, monitoring the real welding spot in real-time; (2) A depth sensor, capturing the 3D geometry of the scene; and (3) A head-mounted VR display, visualizing the process safely. Our innovative platform provides users with a "bot trainer", virtual cues of the seam geometry, automatic spot tracking, and performance scores. To validate the platform's feasibility, we conduct extensive experiments with several welding training tasks. We show that compared with the traditional training practice and recent virtual reality approaches, our automated multi-sensor method achieves better performances in terms of accuracy, learning curve, and effectiveness.

ROAug 19, 2021
Can a Tesla Turbine be Utilised as a Non-Magnetic Actuator for MRI-Guided Robotic Interventions?

David Navarro-Alarcon, Luiza Labazanova, Man Kiu Chow et al.

This paper introduces a new type of nonmagnetic actuator for MRI interventions. Ultrasonic and piezoelectric motors are one the most commonly used actuators in MRI applications. However, most of these actuators are only MRI-safe, which means they cannot be operated while imaging as they cause significant visual artifacts. To cope with this issue, we developed a new pneumatic rotary servo-motor (based on the Tesla turbine) that can be effectively used during continuous MR imaging. We thoroughly tested the performance and magnetic properties of our MRI-compatible actuator with several experiments, both inside and outside an MRI scanner. The reported results confirm the feasibility to use this motor for MRI-guided robotic interventions.

ROAug 19, 2021
Towards a Multispectral RGB-IR-UV-D Vision System -- Seeing the Invisible in 3D

Tanhao Zhang, Luyin Hu, Lu Li et al.

In this paper, we present the development of a sensing system with the capability to compute multispectral point clouds in real-time. The proposed multi-eye sensor system effectively registers information from the visible, (long-wave) infrared, and ultraviolet spectrum to its depth sensing frame, thus enabling to measure a wider range of surface features that are otherwise hidden to the naked eye. For that, we designed a new cross-calibration apparatus that produces consistent features which can be sensed by each of the cameras, therefore, acting as a multispectral "chessboard". The performance of the sensor is evaluated with two different cases of studies, where we show that the proposed system can detect "hidden" features of a 3D environment.

ROJul 30, 2021
A Novel Approach to Model the Kinematics of Human Fingers Based on an Elliptic Multi-Joint Configuration

Zeyu Wu, Luiza Labazanova, Peng Zhou et al.

In this paper, we present a novel kinematic model of the human phalanges based on the elliptical motion of their joints. The presence of the soft elastic tissues and the general anatomical structure of the hand joints highly affect the relative movement of the bones. Commonly used assumption of circular trajectories simplifies the designing process but leads to divergence with the actual hand behavior. The advantages of the proposed model are demonstrated through the comparison with the conventional revolute joint model. Conducted simulations and experiments validate designed forward and inverse kinematic algorithms. Obtained results show a high performance of the model in mimicking the human fingertip motion trajectory.

ROJun 5, 2021
A Split-face Study of Novel Robotic Prototype vs Human Operator in Skin Rejuvenation Using Q-switched Nd:Yag Laser: Accuracy, Efficacy and Safety

Si Un Chan, Cheong Cheong Ip, Chengxiang Lian et al.

Background: Robotic technologies involved in skin laser are emerging. Objective: To compare the accuracy, efficacy and safety of novel robotic prototype with human operator in laser operation performance for skin photo-rejuvenation. Methods: Seventeen subjects were enrolled in a prospective, comparative split-face trial. Q-switch 1064nm laser conducted by the robotic prototype was provided on the right side of the face and that by the professional practitioner on the left. Each subject underwent a single time, one-pass, non-overlapped treatment on an equal size area of the forehead and cheek. Objective assessments included: treatment duration, laser irradiation shots, laser coverage percentage, VISIA parameters, skin temperature and the VAS pain scale. Results: Average time taken by robotic manipulator was longer than human operator; the average number of irradiation shots of both sides had no significant differences. Laser coverage rate of robotic manipulator (60.2 +-15.1%) was greater than that of human operator (43.6 +-12.9%). The VISIA parameters showed no significant differences between robotic manipulator and human operator. No short or long-term side effects were observed with maximum VAS score of 1 point. Limitations: Only one section of laser treatment was performed. Conclusion: Laser operation by novel robotic prototype is more reliable, stable and accurate than human operation.

ROJun 4, 2021
Contour Moments Based Manipulation of Composite Rigid-Deformable Objects with Finite Time Model Estimation and Shape/Position Control

Jiaming Qi, Guangfu Ma, Jihong Zhu et al.

The robotic manipulation of composite rigid-deformable objects (i.e. those with mixed non-homogeneous stiffness properties) is a challenging problem with clear practical applications that, despite the recent progress in the field, it has not been sufficiently studied in the literature. To deal with this issue, in this paper we propose a new visual servoing method that has the capability to manipulate this broad class of objects (which varies from soft to rigid) with the same adaptive strategy. To quantify the object's infinite-dimensional configuration, our new approach computes a compact feedback vector of 2D contour moments features. A sliding mode control scheme is then designed to simultaneously ensure the finite-time convergence of both the feedback shape error and the model estimation error. The stability of the proposed framework (including the boundedness of all the signals) is rigorously proved with Lyapunov theory. Detailed simulations and experiments are presented to validate the effectiveness of the proposed approach. To the best of the author's knowledge, this is the first time that contour moments along with finite-time control have been used to solve this difficult manipulation problem.

ROMay 4, 2021
Challenges and Outlook in Robotic Manipulation of Deformable Objects

Jihong Zhu, Andrea Cherubini, Claire Dune et al.

Deformable object manipulation (DOM) is an emerging research problem in robotics. The ability to manipulate deformable objects endows robots with higher autonomy and promises new applications in the industrial, services, and healthcare sectors. However, compared to rigid object manipulation, the manipulation of deformable objects is considerably more complex, and is still an open research problem. Addressing DOM challenges demand breakthroughs in almost all aspects of robotics, namely hardware design, sensing, (deformation) modeling, planning, and control. In this article, we review recent advances and highlight the main challenges when considering deformation in each sub-field. A particular focus of our paper lies in the discussions of these challenges and proposing future directions of research.

ROMar 17, 2021
Bio-Inspired Design of Artificial Striated Muscles Composed of Sarcomere-Like Contraction Units (preprint)

Luiza Labazanova, Zeyu Wu, Zhengping Gu et al.

Biological muscles have always attracted robotics researchers due to their efficient capabilities in compliance, force generation, and mechanical work. Many groups are working on the development of artificial muscles, however, state-of-the-art methods still fall short in performance when compared with their biological counterpart. Muscles with high force output are mostly rigid, whereas traditional soft actuators take much space and are limited in strength and producing displacement. In this work, we aim to find a reasonable trade-off between these features by mimicking the striated structure of skeletal muscles. For that, we designed an artificial pneumatic myofibril composed of multiple contraction units that combine stretchable and inextensible materials. Varying the geometric parameters and the number of units in series provides flexible adjustment of the desired muscle operation. We derived a mathematical model that predicts the relationship between the input pneumatic pressure and the generated output force. A detailed experimental study is conducted to validate the performance of the proposed bio-inspired muscle.

ROFeb 3, 2021
A Neurorobotic Embodiment for Exploring the Dynamical Interactions of a Spiking Cerebellar Model and a Robot Arm During Vision-based Manipulation Tasks

Omar Zahra, David Navarro-Alarcon, Silvia Tolu

While the original goal for developing robots is replacing humans in dangerous and tedious tasks, the final target shall be completely mimicking the human cognitive and motor behaviour. Hence, building detailed computational models for the human brain is one of the reasonable ways to attain this. The cerebellum is one of the key players in our neural system to guarantee dexterous manipulation and coordinated movements as concluded from lesions in that region. Studies suggest that it acts as a forward model providing anticipatory corrections for the sensory signals based on observed discrepancies from the reference values. While most studies consider providing the teaching signal as error in joint-space, few studies consider the error in task-space and even fewer consider the spiking nature of the cerebellum on the cellular-level. In this study, a detailed cellular-level forward cerebellar model is developed, including modeling of Golgi and Basket cells which are usually neglected in previous studies. To preserve the biological features of the cerebellum in the developed model, a hyperparameter optimization method tunes the network accordingly. The efficiency and biological plausibility of the proposed cerebellar-based controller is then demonstrated under different robotic manipulation tasks reproducing motor behaviour observed in human reaching experiments.

ROJan 19, 2021
Towards Latent Space Based Manipulation of Elastic Rods using Autoencoder Models and Robust Centerline Extractions

Jiaming Qi, Guangfu Ma, Peng Zhou et al.

The automatic shape control of deformable objects is a challenging (and currently hot) manipulation problem due to their high-dimensional geometric features and complex physical properties. In this study, a new methodology to manipulate elastic rods automatically into 2D desired shapes is presented. An efficient vision-based controller that uses a deep autoencoder network is designed to compute a compact representation of the object's infinite-dimensional shape. An online algorithm that approximates the sensorimotor mapping between the robot's configuration and the object's shape features is used to deal with the latter's (typically unknown) mechanical properties. The proposed approach computes the rod's centerline from raw visual data in real-time by introducing an adaptive algorithm on the basis of a self-organizing network. Its effectiveness is thoroughly validated with simulations and experiments.

ROJan 6, 2021
Shape Control of Elastic Objects Based on Implicit Sensorimotor Models and Data-Driven Geometric Features

Wanyu Ma, Jihong Zhu, David Navarro-Alarcon

This paper proposes a general approach to design automatic controls to manipulate elastic objects into desired shapes. The object's geometric model is defined as the shape feature based on the specific task to globally describe the deformation. Raw visual feedback data is processed using classic regression methods to identify parameters of data-driven geometric models in real-time. Our proposed method is able to analytically compute a pose-shape Jacobian matrix based on implicit functions. This model is then used to derive a shape servoing controller. To validate the proposed method, we report a detailed experimental study with robotic manipulators deforming an elastic rod.

RODec 24, 2020
On Radiation-Based Thermal Servoing: New Models, Controls and Experiments

Luyin Hu, David Navarro-Alarcon, Andrea Cherubini et al.

In this paper, we introduce a new sensor-based control method that regulates (by means of robot motions) the heat transfer between a radiative source and an object of interest. This valuable sensorimotor capability is needed in many industrial, dermatology and field robot applications, and it is an essential component for creating machines with advanced thermo-motor intelligence. To this end, we derive a geometric-thermal-motor model which describes the relationship between the robot's active configuration and the produced dynamic thermal response. We then use the model to guide the design of two new thermal servoing controllers (one model-based and one adaptive), and analyze their stability with Lyapunov theory. To validate our method, we report a detailed experimental study with a robotic manipulator conducting autonomous thermal servoing tasks. To the best of the authors' knowledge, this is the first time that temperature regulation has been formulated as a motion control problem for robots.

RODec 10, 2020
LaSeSOM: A Latent and Semantic Representation Framework for Soft Object Manipulation

Peng Zhou, Jihong Zhu, Shengzeng Huo et al.

Soft object manipulation has recently gained popularity within the robotics community due to its potential applications in many economically important areas. Although great progress has been recently achieved in these types of tasks, most state-of-the-art methods are case-specific; They can only be used to perform a single deformation task (e.g. bending), as their shape representation algorithms typically rely on "hard-coded" features. In this paper, we present LaSeSOM, a new feedback latent representation framework for semantic soft object manipulation. Our new method introduces internal latent representation layers between low-level geometric feature extraction and high-level semantic shape analysis; This allows the identification of each compressed semantic function and the formation of a valid shape classifier from different feature extraction levels. The proposed latent framework makes soft object representation more generic (independent from the object's geometry and its mechanical properties) and scalable (it can work with 1D/2D/3D tasks). Its high-level semantic layer enables to perform (quasi) shape planning tasks with soft objects, a valuable and underexplored capability in many soft manipulation tasks. To validate this new methodology, we report a detailed experimental study with robotic manipulators.

RONov 24, 2020
Path Planning with Automatic Seam Extraction over Point Cloud Models for Robotic Arc Welding

Peng Zhou, Rui Peng, Maggie Xu et al.

This paper presents a point cloud based robotic system for arc welding. Using hand gesture controls, the system scans partial point cloud views of workpiece and reconstructs them into a complete 3D model by a linear iterative closest point algorithm. Then, a bilateral filter is extended to denoise the workpiece model and preserve important geometrical information. To extract the welding seam from the model, a novel intensity-based algorithm is proposed that detects edge points and generates a smooth 6-DOF welding path. The methods are tested on multiple workpieces with different joint types and poses. Experimental results prove the robustness and efficiency of this robotic system on automatic path planning for welding applications.

RONov 3, 2020
Vision-Based Control for Robots by a Fully Spiking Neural System Relying on Cerebellar Predictive Learning

Omar Zahra, David Navarro-Alarcon, Silvia Tolu

The cerebellum plays a distinctive role within our motor control system to achieve fine and coordinated motions. While cerebellar lesions do not lead to a complete loss of motor functions, both action and perception are severally impacted. Hence, it is assumed that the cerebellum uses an internal forward model to provide anticipatory signals by learning from the error in sensory states. In some studies, it was demonstrated that the learning process relies on the joint-space error. However, this may not exist. This work proposes a novel fully spiking neural system that relies on a forward predictive learning by means of a cellular cerebellar model. The forward model is learnt thanks to the sensory feedback in task-space and it acts as a Smith predictor. The latter predicts sensory corrections in input to a differential mapping spiking neural network during a visual servoing task of a robot arm manipulator. In this paper, we promote the developed control system to achieve more accurate target reaching actions and reduce the motion execution time for the robotic reaching tasks thanks to the cerebellar predictive capabilities.

ROAug 25, 2020
A Robotic Line Scan System with Adaptive ROI for Inspection of Defects over Convex Free-form Specular Surfaces

Shengzeng Huo, David Navarro-Alarcon, David Chik

In this paper, we present a new robotic system to perform defect inspection tasks over free-form specular surfaces. The autonomous procedure is achieved by a six-DOF manipulator, equipped with a line scan camera and a high-intensity lighting system. Our method first uses the object's CAD mesh model to implement a K-means unsupervised learning algorithm that segments the object's surface into areas with similar curvature. Then, the scanning path is computed by using an adaptive algorithm that adjusts the camera's ROI to observe regions with irregular shapes properly. A novel iterative closest point-based projection registration method that robustly localizes the object in the robot's coordinate frame system is proposed to deal with the blind spot problem of specular objects captured by depth sensors. Finally, an image processing pipeline automatically detects surface defects in the captured high-resolution images. A detailed experimental study with a vision-guided robotic scanning system is reported to validate the proposed methodology.

ROAug 16, 2020
Adaptive Shape Servoing of Elastic Rods using Parameterized Regression Features and Auto-Tuning Motion Controls

Jiaming Qi, Guangtao Ran, Bohui Wang et al.

The robotic manipulation of deformable linear objects has shown great potential in a wide range of real-world applications. However, it presents many challenges due to the objects' complex nonlinearity and high-dimensional configuration. In this paper, we propose a new shape servoing framework to automatically manipulate elastic rods through visual feedback. Our new method uses parameterized regression features to compute a compact (low-dimensional) feature vector that quantifies the object's shape, thus, enabling to establish an explicit shape servo-loop. To automatically deform the rod into a desired shape, the proposed adaptive controller iteratively estimates the differential transformation between the robot's motion and the relative shape changes; This valuable capability allows to effectively manipulate objects with unknown mechanical models. An auto-tuning algorithm is introduced to adjust the robot's shaping motions in real-time based on optimal performance criteria. To validate the proposed framework, a detailed experimental study with vision-guided robotic manipulators is presented.

ROJul 4, 2020
Sensor-Based Control for Collaborative Robots: Fundamentals, Challenges and Opportunities

Andrea Cherubini, David Navarro-Alarcon

The objective of this paper is to present a systematic review of existing sensor-based control methodologies for applications that involve direct interaction between humans and robots, in the form of either physical collaboration or safe coexistence. To this end, we first introduce the basic formulation of the sensor-servo problem, then present the most common approaches: vision-based, touch-based, audio-based, and distance-based control. Afterwards, we discuss and formalize the methods that integrate heterogeneous sensors at the control level. The surveyed body of literature is classified according to the type of sensor, to the way multiple measurements are combined, and to the target objectives and applications. Finally, we discuss open problems, potential applications, and future research directions.

ROJun 16, 2020
Vision-based Manipulation of Deformable and Rigid Objects Using Subspace Projections of 2D Contours

Jihong Zhu, David Navarro-Alarcon, Robin Passama et al.

This paper proposes a unified vision-based manipulation framework using image contours of deformable/rigid objects. Instead of using human-defined cues, the robot automatically learns the features from processed vision data. Our method simultaneously generates -- from the same data -- both, visual features and the interaction matrix that relates them to the robot control inputs. Extraction of the feature vector and control commands is done online and adaptively, with little data for initialization. The method allows the robot to manipulate an object without knowing whether it is rigid or deformable. To validate our approach, we conduct numerical simulations and experiments with both deformable and rigid objects.

ROMay 21, 2020
Robotics Meets Cosmetic Dermatology: Development of a Novel Vision-Guided System for Skin Photo-Rejuvenation

Muhammad Muddassir, Domingo Gomez, Shujian Chen et al.

In this paper, we present a novel robotic system for skin photo-rejuvenation procedures, which can uniformly deliver the laser's energy over the skin of the face. The robotised procedure is performed by a manipulator whose end-effector is instrumented with a depth sensor, a thermal camera, and a cosmetic laser generator. To plan the heat stimulating trajectories for the laser, the system computes the surface model of the face and segments it into seven regions that are automatically filled with laser shots. We report experimental results with human subjects to validate the performance of the system. To the best of the author's knowledge, this is the first time that facial skin rejuvenation has been automated by robot manipulators.

ROMay 20, 2020
Differential Mapping Spiking Neural Network for Sensor-Based Robot Control

Omar Zahra, Silvia Tolu, David Navarro-Alarcon

In this work, a spiking neural network (SNN) is proposed for approximating differential sensorimotor maps of robotic systems. The computed model is used as a local Jacobian-like projection that relates changes in sensor space to changes in motor space. The SNN consists of an input (sensory) layer and an output (motor) layer connected through plastic synapses, with inter-inhibitory connections at the output layer. Spiking neurons are modeled as Izhikevich neurons with a synaptic learning rule based on spike-timing-dependent plasticity. Feedback data from proprioceptive and exteroceptive sensors are encoded and fed into the input layer through a motor babbling process. As the main challenge to building an efficient SNN is to tune its parameters, we present an intuitive tuning method that considerably reduces the number of neurons and the amount of data required for training. Our proposed architecture represents a biologically plausible neural controller that is capable of handling noisy sensor readings to guide robot movements in real-time. Experimental results are presented to validate the control methodology with a vision-guided robot.

ROApr 26, 2020
A Point Cloud-Based Method for Automatic Groove Detection and Trajectory Generation of Robotic Arc Welding Tasks

Rui Peng, David Navarro-Alarcon, Victor Wu et al.

In this paper, in order to pursue high-efficiency robotic arc welding tasks, we propose a method based on point cloud acquired by an RGB-D sensor. The method consists of two parts: welding groove detection and 3D welding trajectory generation. The actual welding scene could be displayed in 3D point cloud format. Focusing on the geometric feature of the welding groove, the detection algorithm is capable of adapting well to different welding workpieces with a V-type welding groove. Meanwhile, a 3D welding trajectory involving 6-DOF poses of the welding groove for robotic manipulator motion is generated. With an acceptable error in trajectory generation, the robotic manipulator could drive the welding torch to follow the trajectory and execute welding tasks. In this paper, details of the integrated robotic system are also presented. Experimental results prove application value of the presented welding robotic system.

ROApr 25, 2020
A Lyapunov-Stable Adaptive Method to Approximate Sensorimotor Models for Sensor-Based Control

David Navarro-Alarcon, Jiaming Qi, Jihong Zhu et al.

In this article, we present a new scheme that approximates unknown sensorimotor models of robots by using feedback signals only. The formulation of the uncalibrated sensor-based regulation problem is first formulated, then, we develop a computational method that distributes the model estimation problem amongst multiple adaptive units that specialise in a local sensorimotor map. Different from traditional estimation algorithms, the proposed method requires little data to train and constrain it (the number of required data points can be analytically determined) and has rigorous stability properties (the conditions to satisfy Lyapunov stability are derived). Numerical simulations and experimental results are presented to validate the proposed method.

IVFeb 26, 2020
Force-Ultrasound Fusion: Bringing Spine Robotic-US to the Next "Level"

Maria Tirindelli, Maria Victorova, Javier Esteban et al.

Spine injections are commonly performed in several clinical procedures. The localization of the target vertebral level (i.e. the position of a vertebra in a spine) is typically done by back palpation or under X-ray guidance, yielding either higher chances of procedure failure or exposure to ionizing radiation. Preliminary studies have been conducted in the literature, suggesting that ultrasound imaging may be a precise and safe alternative to X-ray for spine level detection. However, ultrasound data are noisy and complicated to interpret. In this study, a robotic-ultrasound approach for automatic vertebral level detection is introduced. The method relies on the fusion of ultrasound and force data, thus providing both "tactile" and visual feedback during the procedure, which results in higher performances in presence of data corruption. A robotic arm automatically scans the volunteer's back along the spine by using force-ultrasound data to locate vertebral levels. The occurrences of vertebral levels are visible on the force trace as peaks, which are enhanced by properly controlling the force applied by the robot on the patient back. Ultrasound data are processed with a Deep Learning method to extract a 1D signal modelling the probabilities of having a vertebra at each location along the spine. Processed force and ultrasound data are fused using a 1D Convolutional Network to compute the location of the vertebral levels. The method is compared to pure image and pure force-based methods for vertebral level counting, showing improved performance. In particular, the fusion method is able to correctly classify 100% of the vertebral levels in the test set, while pure image and pure force-based method could only classify 80% and 90% vertebrae, respectively. The potential of the proposed method is evaluated in an exemplary simulated clinical application.

ROMay 1, 2019
A Self-Organizing Network with Varying Density Structure for Characterizing Sensorimotor Transformations in Robotic Systems

Omar Zahra, David Navarro-Alarcon

In this work, we present the development of a neuro-inspired approach for characterizing sensorimotor relations in robotic systems. The proposed method has self-organizing and associative properties that enable it to autonomously obtain these relations without any prior knowledge of either the motor (e.g. mechanical structure) or perceptual (e.g. sensor calibration) models. Self-organizing topographic properties are used to build both sensory and motor maps, then the associative properties rule the stability and accuracy of the emerging connections between these maps. Compared to previous works, our method introduces a new varying density self-organizing map (VDSOM) that controls the concentration of nodes in regions with large transformation errors without affecting much the computational time. A distortion metric is measured to achieve a self-tuning sensorimotor model that adapts to changes in either motor or sensory models. The obtained sensorimotor maps prove to have less error than conventional self-organizing methods and potential for further development.

ROApr 13, 2019
On Model Adaptation for Sensorimotor Control of Robots

David Navarro-Alarcon, Andrea Cherubini, Xiang Li

In this article, we address the problem of computing adaptive sensorimotor models that can be used for guiding the motion of robotic systems with uncertain action-to-perception relations. The formulation of the uncalibrated sensor-based control problem is first presented, then, various computational methods for building adaptive sensorimotor models are derived and analysed. The proposed methodology is exemplified with two cases of study: (i) shape control of deformable objects with unknown properties, and (ii) soft manipulation of ultrasonic probes with uncalibrated sensors.