Marcelo Ang

5papers

762citations

Novelty47%

AI Score26

Ranked #165,470 of 205,806 authors (top 80%)#50,369 in CV (top 85%)

5 Papers

ROJun 6, 2022

Real2Sim or Sim2Real: Robotics Visual Insertion using Deep Reinforcement Learning and Real2Sim Policy Adaptation

Yiwen Chen, Xue Li, Sheng Guo et al.

Reinforcement learning has shown a wide usage in robotics tasks, such as insertion and grasping. However, without a practical sim2real strategy, the policy trained in simulation could fail on the real task. There are also wide researches in the sim2real strategies, but most of those methods rely on heavy image rendering, domain randomization training, or tuning. In this work, we solve the insertion task using a pure visual reinforcement learning solution with minimum infrastructure requirement. We also propose a novel sim2real strategy, Real2Sim, which provides a novel and easier solution in policy adaptation. We discuss the advantage of Real2Sim compared with Sim2Real.

AIMar 1, 2022

A Versatile Agent for Fast Learning from Human Instructors

Yiwen Chen, Zedong Zhang, Haofeng Liu et al.

In recent years, a myriad of superlative works on intelligent robotics policies have been done, thanks to advances in machine learning. However, inefficiency and lack of transfer ability hindered algorithms from pragmatic applications, especially in human-robot collaboration, when few-shot fast learning and high flexibility become a wherewithal. To surmount this obstacle, we refer to a "Policy Pool", containing pre-trained skills that can be easily accessed and reused. An agent is employed to govern the "Policy Pool" by unfolding requisite skills in a flexible sequence, contingent on task specific predilection. This predilection can be automatically interpreted from one or few human expert demonstrations. Under this hierarchical setting, our algorithm is able to pick up a sparse-reward, multi-stage knack with only one demonstration in a Mini-Grid environment, showing the potential for instantly mastering complex robotics skills from human instructors. Additionally, the innate quality of our algorithm also allows for lifelong learning, making it a versatile agent.

CVApr 1, 2021

Self-supervised Motion Learning from Static Images

Ziyuan Huang, Shiwei Zhang, Jianwen Jiang et al.

Motions are reflected in videos as the movement of pixels, and actions are essentially patterns of inconsistent motions between the foreground and the background. To well distinguish the actions, especially those with complicated spatio-temporal interactions, correctly locating the prominent motion areas is of crucial importance. However, most motion information in existing videos are difficult to label and training a model with good motion representations with supervision will thus require a large amount of human labour for annotation. In this paper, we address this problem by self-supervised learning. Specifically, we propose to learn Motion from Static Images (MoSI). The model learns to encode motion information by classifying pseudo motions generated by MoSI. We furthermore introduce a static mask in pseudo motions to create local motion patterns, which forces the model to additionally locate notable motion areas for the correct classification.We demonstrate that MoSI can discover regions with large motion even without fine-tuning on the downstream datasets. As a result, the learned motion representations boost the performance of tasks requiring understanding of complex scenes and motions, i.e., action recognition. Extensive experiments show the consistent and transferable improvements achieved by MoSI. Codes will be soon released.

CVApr 22, 2019

2D3D-MatchNet: Learning to Match Keypoints Across 2D Image and 3D Point Cloud

Mengdan Feng, Sixing Hu, Marcelo Ang et al.

Large-scale point cloud generated from 3D sensors is more accurate than its image-based counterpart. However, it is seldom used in visual pose estimation due to the difficulty in obtaining 2D-3D image to point cloud correspondences. In this paper, we propose the 2D3D-MatchNet - an end-to-end deep network architecture to jointly learn the descriptors for 2D and 3D keypoint from image and point cloud, respectively. As a result, we are able to directly match and establish 2D-3D correspondences from the query image and 3D point cloud reference map for visual pose estimation. We create our Oxford 2D-3D Patches dataset from the Oxford Robotcar dataset with the ground truth camera poses and 2D-3D image to point cloud correspondences for training and testing the deep network. Experimental results verify the feasibility of our approach.

RONov 2, 2013

Why robots? A survey on the roles and benefits of social robots in the therapy of children with autism

John-John Cabibihan, Hifza Javed, Marcelo Ang et al.

This paper reviews the use of socially interactive robots to assist in the therapy of children with autism. The extent to which the robots were successful in helping the children in their social, emotional, and communication deficits was investigated. Child-robot interactions were scrutinized with respect to the different target behaviors that are to be elicited from a child during therapy. These behaviors were thoroughly examined with respect to a childs development needs. Most importantly, experimental data from the surveyed works were extracted and analyzed in terms of the target behaviors and how each robot was used during a therapy session to achieve these behaviors. The study concludes by categorizing the different therapeutic roles that these robots were observed to play, and highlights the important design features that enable them to achieve high levels of effectiveness in autism therapy.