RO · Sep 22, 2021
Real Robot Challenge: A Robotics Competition in the Cloud
Stefan Bauer, Felix Widmaier, Manuel Wüthrich et al.
Dexterous manipulation remains an open problem in robotics. To coordinate efforts of the research community towards tackling this problem, we propose a shared benchmark. We designed and built robotic platforms that are hosted at MPI for Intelligent Systems and can be accessed remotely. Each platform consists of three robotic fingers that are capable of dexterous object manipulation. Users are able to control the platforms remotely by submitting code that is executed automatically, akin to a computational cluster. Using this setup, i) we host robotics competitions, where teams from anywhere in the world access our platforms to tackle challenging tasks, ii) we publish the datasets collected during these competitions (consisting of hundreds of robot hours), and iii) we give researchers access to these platforms for their own projects.
LG · Aug 16, 2021
On the Opportunities and Risks of Foundation Models
Rishi Bommasani, Drew A. Hudson, Ehsan Adeli et al.
AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks. We call these models foundation models to underscore their critically central yet incomplete character. This report provides a thorough account of the opportunities and risks of foundation models, ranging from their capabilities (e.g., language, vision, robotics, reasoning, human interaction) and technical principles (e.g., model architectures, training procedures, data, systems, security, evaluation, theory) to their applications (e.g., law, healthcare, education) and societal impact (e.g., inequity, misuse, economic and environmental impact, legal and ethical considerations). Though foundation models are based on standard deep learning and transfer learning, their scale results in new emergent capabilities, and their effectiveness across so many tasks incentivizes homogenization. Homogenization provides powerful leverage but demands caution, as the defects of the foundation model are inherited by all the adapted models downstream. Despite the impending widespread deployment of foundation models, we currently lack a clear understanding of how they work, when they fail, and what they are even capable of due to their emergent properties. To tackle these questions, we believe much of the critical research on foundation models will require deep interdisciplinary collaboration commensurate with their fundamentally sociotechnical nature.
RO · Jul 6, 2021
Learning Latent Actions to Control Assistive Robots
Dylan P. Losey, Hong Jun Jeon, Mengxi Li et al.
Assistive robot arms enable people with disabilities to conduct everyday tasks on their own. These arms are dexterous and high-dimensional; however, the interfaces people must use to control their robots are low-dimensional. Consider teleoperating a 7-DoF robot arm with a 2-DoF joystick. The robot is helping you eat dinner, and currently you want to cut a piece of tofu. Today's robots assume a pre-defined mapping between joystick inputs and robot actions: in one mode the joystick controls the robot's motion in the x-y plane, in another mode the joystick controls the robot's z-yaw motion, and so on. But this mapping misses out on the task you are trying to perform! Ideally, one joystick axis should control how the robot stabs the tofu and the other axis should control different cutting motions. Our insight is that we can achieve intuitive, user-friendly control of assistive robots by embedding the robot's high-dimensional actions into low-dimensional and human-controllable latent actions. We divide this process into three parts. First, we explore models for learning latent actions from offline task demonstrations, and formalize the properties that latent actions should satisfy. Next, we combine learned latent actions with autonomous robot assistance to help the user reach and maintain their high-level goals. Finally, we learn a personalized alignment model between joystick inputs and latent actions. We evaluate our resulting approach in four user studies where non-disabled participants reach marshmallows, cook apple pie, cut tofu, and assemble dessert. We then test our approach with two disabled adults who leverage assistive devices on a daily basis.
RO · Jan 27, 2021
Dexterous Manipulation Primitives for the Real Robot Challenge
Claire Chen, Krishnan Srinivasan, Jeffrey Zhang et al.
This report describes our approach for Phase 3 of the Real Robot Challenge. To solve cuboid manipulation tasks of varying difficulty, we decompose each task into the following primitives: moving the fingers to the cuboid to grasp it, turning it on the table to minimize orientation error, and re-positioning it to the goal position. We use model-based trajectory optimization and control to plan and execute these primitives. These grasping, turning, and re-positioning primitives are sequenced with a state-machine that determines which primitive to execute given the current object state and goal. Our method shows robust performance over multiple runs with randomized initial and goal positions. With this approach, our team placed second in the challenge, under the anonymous name "sombertortoise" on the leaderboard. Example runs of our method solving each of the four levels can be seen in this video (https://www.youtube.com/watch?v=I65Kwu9PGmg&list=PLt9QxrtaftrHGXcp4Oh8-s_OnQnBnLtei&index=1).
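As an illustration of how a state machine might sequence the grasping, turning, and re-positioning primitives described in this abstract, here is a minimal Python sketch. It is not the authors' implementation: the `ObjectState` fields, the tolerance values, and the transition order (grasp first, then turn, then re-position) are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Primitive(Enum):
    GRASP = auto()
    TURN = auto()
    REPOSITION = auto()
    DONE = auto()


@dataclass
class ObjectState:
    """Hypothetical summary of the cuboid state relative to the goal."""
    is_grasped: bool
    orientation_error: float  # radians, assumed non-negative
    position_error: float     # metres, assumed non-negative


def select_primitive(state: ObjectState,
                     orientation_tol: float = 0.1,
                     position_tol: float = 0.01) -> Primitive:
    """Pick the next primitive from the current object state.

    The tolerances and the priority order are illustrative assumptions.
    """
    if not state.is_grasped:
        return Primitive.GRASP
    if state.orientation_error > orientation_tol:
        return Primitive.TURN
    if state.position_error > position_tol:
        return Primitive.REPOSITION
    return Primitive.DONE
```

Calling `select_primitive` once per control step with a fresh state estimate would let such a controller fall back to an earlier primitive, for example re-grasping if the cuboid slips.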
LG · Oct 29, 2020
Recovery RL: Safe Reinforcement Learning with Learned Recovery Zones
Brijen Thananjeyan, Ashwin Balakrishna, Suraj Nair et al.
Safety remains a central obstacle preventing widespread use of RL in the real world: learning new tasks in uncertain environments requires extensive exploration, but safety requires limiting exploration. We propose Recovery RL, an algorithm which navigates this tradeoff by (1) leveraging offline data to learn about constraint-violating zones before policy learning and (2) separating the goals of improving task performance and constraint satisfaction across two policies: a task policy that only optimizes the task reward and a recovery policy that guides the agent to safety when constraint violation is likely. We evaluate Recovery RL on 6 simulation domains, including two contact-rich manipulation tasks and an image-based navigation task, and an image-based obstacle avoidance task on a physical robot. We compare Recovery RL to 5 prior safe RL methods which jointly optimize for task performance and safety via constrained optimization or reward shaping and find that Recovery RL outperforms the next best prior method across all domains. Results suggest that Recovery RL trades off constraint violations and task successes 2 to 20 times more efficiently in simulation domains and 3 times more efficiently in physical experiments. See https://tinyurl.com/rl-recovery for videos and supplementary material.
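The two-policy split described in this abstract can be sketched as a simple action-selection rule. The sketch below is a minimal illustration, not the paper's code: it assumes a learned risk estimate `risk_estimate(state, action)` and a fixed `risk_threshold` decide when "constraint violation is likely", and all function names and the toy 1-D environment in the test are hypothetical.

```python
from typing import Callable, Tuple

State = float   # toy 1-D state, for illustration only
Action = float


def select_action(state: State,
                  task_policy: Callable[[State], Action],
                  recovery_policy: Callable[[State], Action],
                  risk_estimate: Callable[[State, Action], float],
                  risk_threshold: float = 0.5) -> Tuple[Action, bool]:
    """Return the action to execute and whether recovery took over.

    The task policy proposes an action; if the (assumed) risk estimate
    for that proposal exceeds the threshold, the recovery policy's
    action is executed instead.
    """
    proposed = task_policy(state)
    if risk_estimate(state, proposed) > risk_threshold:
        return recovery_policy(state), True
    return proposed, False
```

Keeping the switch outside both policies means the task policy can be trained on task reward alone, which matches the separation of goals the abstract describes.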
LG · Oct 27, 2020
Learning to be Safe: Deep RL with a Safety Critic
Krishnan Srinivasan, Benjamin Eysenbach, Sehoon Ha et al.
Safety is an essential component for deploying reinforcement learning (RL) algorithms in real-world scenarios, and is critical during the learning process itself. A natural first approach toward safe RL is to manually specify constraints on the policy's behavior. However, just as learning has enabled progress in large-scale development of AI systems, learning safety specifications may also be necessary to ensure safety in messy open-world environments where manual safety specifications cannot scale. Akin to how humans learn incrementally starting in child-safe environments, we propose to learn how to be safe in one set of tasks and environments, and then use that learned intuition to constrain future behaviors when learning new, modified tasks. We empirically study this form of safety-constrained transfer learning in three challenging domains: simulated navigation, quadruped locomotion, and dexterous in-hand manipulation. In comparison to standard deep RL techniques and prior approaches to safe RL, we find that our method enables the learning of new tasks and in new environments with both substantially fewer safety incidents, such as falling or dropping an object, and faster, more stable learning. This suggests a path forward not only for safer RL systems, but also for more effective RL systems.
RO · Oct 24, 2019
Learning Hierarchical Control for Robust In-Hand Manipulation
Tingguang Li, Krishnan Srinivasan, Max Qing-Hu Meng et al.
Robotic in-hand manipulation has been a long-standing challenge due to the complexity of modelling hand and object in contact and of coordinating finger motion for complex manipulation sequences. To address these challenges, the majority of prior work has focused either on model-based, low-level controllers or on model-free deep reinforcement learning, each of which has its own limitations. We propose a hierarchical method that relies on traditional, model-based controllers at the low level and learned policies at the mid level. The low-level controllers can robustly execute different manipulation primitives (reposing, sliding, flipping). The mid-level policy orchestrates these primitives. We extensively evaluate our approach in simulation with a 3-fingered hand that controls three degrees of freedom of elongated objects. We show that our approach can move objects between almost all the possible poses in the workspace while keeping them firmly grasped. We also show that our approach is robust to inaccuracies in the object models and to observation noise. Finally, we show how our approach generalizes to objects of other shapes.
RO · Sep 20, 2019
Controlling Assistive Robots with Learned Latent Actions
Dylan P. Losey, Krishnan Srinivasan, Ajay Mandlekar et al.
Assistive robotic arms enable users with physical disabilities to perform everyday tasks without relying on a caregiver. Unfortunately, the very dexterity that makes these arms useful also makes them challenging to teleoperate: the robot has more degrees-of-freedom than the human can directly coordinate with a handheld joystick. Our insight is that we can make assistive robots easier for humans to control by leveraging latent actions. Latent actions provide a low-dimensional embedding of high-dimensional robot behavior: for example, one latent dimension might guide the assistive arm along a pouring motion. In this paper, we design a teleoperation algorithm for assistive robots that learns latent actions from task demonstrations. We formulate the controllability, consistency, and scaling properties that user-friendly latent actions should have, and evaluate how different low-dimensional embeddings capture these properties. Finally, we conduct two user studies on a robotic arm to compare our latent action approach to both state-of-the-art shared autonomy baselines and a teleoperation strategy currently used by assistive arms. Participants completed assistive eating and cooking tasks more efficiently when leveraging our latent actions, and also subjectively reported that latent actions made the task easier to perform. The video accompanying this paper can be found at: https://youtu.be/wjnhrzugBj4.
RO · Jul 28, 2019
Making Sense of Vision and Touch: Learning Multimodal Representations for Contact-Rich Tasks
Michelle A. Lee, Yuke Zhu, Peter Zachares et al.
Contact-rich manipulation tasks in unstructured environments often require both haptic and visual feedback. It is non-trivial to manually design a robot controller that combines these modalities which have very different characteristics. While deep reinforcement learning has shown success in learning control policies for high-dimensional inputs, these algorithms are generally intractable to deploy on real robots due to sample complexity. In this work, we use self-supervision to learn a compact and multimodal representation of our sensory inputs, which can then be used to improve the sample efficiency of our policy learning. Evaluating our method on a peg insertion task, we show that it generalizes over varying geometries, configurations, and clearances, while being robust to external perturbations. We also systematically study different self-supervised learning objectives and representation learning architectures. Results are presented in simulation and on a physical robot.
RO · Oct 24, 2018
Making Sense of Vision and Touch: Self-Supervised Learning of Multimodal Representations for Contact-Rich Tasks
Michelle A. Lee, Yuke Zhu, Krishnan Srinivasan et al.
Contact-rich manipulation tasks in unstructured environments often require both haptic and visual feedback. However, it is non-trivial to manually design a robot controller that combines modalities with very different characteristics. While deep reinforcement learning has shown success in learning control policies for high-dimensional inputs, these algorithms are generally intractable to deploy on real robots due to sample complexity. We use self-supervision to learn a compact and multimodal representation of our sensory inputs, which can then be used to improve the sample efficiency of our policy learning. We evaluate our method on a peg insertion task, generalizing over different geometry, configurations, and clearances, while being robust to external perturbations. Results for simulated and real robot experiments are presented.
CL · Jun 20, 2017
Graph-based Neural Multi-Document Summarization
Michihiro Yasunaga, Rui Zhang, Kshitijh Meelu et al.
We propose a neural multi-document summarization (MDS) system that incorporates sentence relation graphs. We employ a Graph Convolutional Network (GCN) on the relation graphs, with sentence embeddings obtained from Recurrent Neural Networks as input node features. Through multiple layer-wise propagation, the GCN generates high-level hidden sentence features for salience estimation. We then use a greedy heuristic to extract salient sentences while avoiding redundancy. In our experiments on DUC 2004, we consider three types of sentence relation graphs and demonstrate the advantage of combining sentence relations in graphs with the representation power of deep neural networks. Our model improves upon traditional graph-based extractive approaches and the vanilla GRU sequence model with no graph, and it achieves competitive results against other state-of-the-art multi-document summarization systems.
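The greedy extraction step mentioned in this abstract can be illustrated with a short Python sketch. It assumes salience scores are already available (in the paper they come from the GCN), and it swaps in word-set Jaccard overlap as a stand-in redundancy measure; the `max_words` budget and `max_similarity` threshold are hypothetical parameters, not values from the paper.

```python
from typing import List, Sequence


def jaccard(a: str, b: str) -> float:
    """Word-set overlap between two sentences (an assumed similarity)."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def greedy_select(sentences: Sequence[str],
                  salience: Sequence[float],
                  max_words: int = 100,
                  max_similarity: float = 0.5) -> List[str]:
    """Pick sentences in descending salience, skipping redundant ones."""
    order = sorted(range(len(sentences)),
                   key=lambda i: salience[i], reverse=True)
    summary: List[str] = []
    used_words = 0
    for i in order:
        sentence = sentences[i]
        n_words = len(sentence.split())
        if used_words + n_words > max_words:
            continue  # would exceed the length budget
        if any(jaccard(sentence, chosen) > max_similarity
               for chosen in summary):
            continue  # too similar to a sentence already selected
        summary.append(sentence)
        used_words += n_words
    return summary
```

With two near-duplicate sentences and one unrelated sentence, the sketch keeps the more salient duplicate and the unrelated sentence, which is the redundancy-avoiding behaviour the abstract describes.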