Justus Piater

RO
h-index35
32papers
419citations
Novelty38%
AI Score44

32 Papers

66.5ROMay 11Code
Scalable and Efficient Continual Learning from Demonstration via a Hypernetwork-generated Stable Dynamics Model

Sayantan Auddy, Jakob Hollenstein, Matteo Saveriano et al.

Robots capable of learning from demonstration (LfD) must exhibit stability while executing learned motion skills. To be effective in the real world, they should also remember multiple skills over time -- a capability lacking in current stable-LfD methods. We propose an approach to stable, continual LfD, and highlight the role of stability in improving continual learning. Our proposed hypernetwork generates the parameters of two neural networks: a trajectory learning dynamics model, and a trajectory-stabilizing Lyapunov function. These generated networks form a clock-augmented stable neural ODE solver (sNODE), a stable dynamics model that offers a superior stability-accuracy trade-off compared to the state-of-the-art. We further propose stochastic hypernetwork regularization with a single, uniformly-sampled task embedding, reducing the cumulative training time for $N$ tasks from O($N^2$) to O($N$) without degrading performance on real-world tasks. We introduce high-dimensional variants of the popular LASA dataset to assess scalability and extend a dataset of robotic LfD tasks to assess real-world performance. We empirically evaluate our approach on multiple LfD datasets of varying complexity, including sequences of 7--26 tasks, trajectories of 2--32 dimensions, and real-world tasks involving position and orientation. Our thorough evaluation on multiple LfD datasets demonstrates that our approach sequentially learns and retains multiple motion skills without retraining on past demonstrations, and outperforms other relevant baselines in terms of trajectory errors, continual learning scores, and stability metrics. Notably, we show that stability greatly enhances continual learning performance, particularly in size-efficient chunked hypernetworks. Our code is available at https://github.com/sayantanauddy/clfd-snode.

LGJun 8, 2022
Action Noise in Off-Policy Deep Reinforcement Learning: Impact on Exploration and Performance

Jakob Hollenstein, Sayantan Auddy, Matteo Saveriano et al.

Many Deep Reinforcement Learning (D-RL) algorithms rely on simple forms of exploration such as the additive action noise often used in continuous control domains. Typically, the scaling factor of this action noise is chosen as a hyper-parameter and is kept constant during training. In this paper, we focus on action noise in off-policy deep reinforcement learning for continuous control. We analyze how the learned policy is impacted by the noise type, noise scale, and impact scaling factor reduction schedule. We consider the two most prominent types of action noise, Gaussian and Ornstein-Uhlenbeck noise, and perform a vast experimental campaign by systematically varying the noise type and scale parameter, and by measuring variables of interest like the expected return of the policy and the state-space coverage during exploration. For the latter, we propose a novel state-space coverage measure $\operatorname{X}_{\mathcal{U}\text{rel}}$ that is more robust to estimation artifacts caused by points close to the state-space boundary than previously-proposed measures. Larger noise scales generally increase state-space coverage. However, we found that increasing the space coverage using a larger noise scale is often not beneficial. On the contrary, reducing the noise scale over the training process reduces the variance and generally improves the learning performance. We conclude that the best noise type and scale are environment dependent, and based on our observations derive heuristic rules for guiding the choice of the action noise as a starting point for further optimization.

ROAug 27, 2024
Continual Domain Randomization

Josip Josifovski, Sayantan Auddy, Mohammadhossein Malmir et al.

Domain Randomization (DR) is commonly used for sim2real transfer of reinforcement learning (RL) policies in robotics. Most DR approaches require a simulator with a fixed set of tunable parameters from the start of the training, from which the parameters are randomized simultaneously to train a robust model for use in the real world. However, the combined randomization of many parameters increases the task difficulty and might result in sub-optimal policies. To address this problem and to provide a more flexible training process, we propose Continual Domain Randomization (CDR) for RL that combines domain randomization with continual learning to enable sequential training in simulation on a subset of randomization parameters at a time. Starting from a model trained in a non-randomized simulation where the task is easier to solve, the model is trained on a sequence of randomizations, and continual learning is employed to remember the effects of previous randomizations. Our robotic reaching and grasping tasks experiments show that the model trained in this fashion learns effectively in simulation and performs robustly on the real robot while matching or outperforming baselines that employ combined randomization or sequential randomization without continual learning. Our code and videos are available at https://continual-dr.github.io/.

LGAug 1, 2022
Improving the Trainability of Deep Neural Networks through Layerwise Batch-Entropy Regularization

David Peer, Bart Keulen, Sebastian Stabinger et al.

Training deep neural networks is a very demanding task, especially challenging is how to adapt architectures to improve the performance of trained models. We can find that sometimes, shallow networks generalize better than deep networks, and the addition of more layers results in higher training and test errors. The deep residual learning framework addresses this degradation problem by adding skip connections to several neural network layers. It would at first seem counter-intuitive that such skip connections are needed to train deep networks successfully as the expressivity of a network would grow exponentially with depth. In this paper, we first analyze the flow of information through neural networks. We introduce and evaluate the batch-entropy which quantifies the flow of information through each layer of a neural network. We prove empirically and theoretically that a positive batch-entropy is required for gradient descent-based training approaches to optimize a given loss function successfully. Based on those insights, we introduce batch-entropy regularization to enable gradient descent-based training algorithms to optimize the flow of information through each hidden layer individually. With batch-entropy regularization, gradient descent optimizers can transform untrainable networks into trainable networks. We show empirically that we can therefore train a "vanilla" fully connected network and convolutional neural network -- no skip connections, batch normalization, dropout, or any other architectural tweak -- with 500 layers by simply adding the batch-entropy regularization term to the loss function. The effect of batch-entropy regularization is not only evaluated on vanilla neural networks, but also on residual networks, autoencoders, and also transformer models over a wide range of computer vision as well as natural language processing tasks.

RONov 4, 2023
Constrained Equation Learner Networks for Precision-Preserving Extrapolation of Robotic Skills

Hector Perez-Villeda, Justus Piater, Matteo Saveriano

In Programming by Demonstration, the robot learns novel skills from human demonstrations. After learning, the robot should be able not only to reproduce the skill, but also to generalize it to shifted domains without collecting new training data. Adaptation to similar domains has been investigated in the literature; however, an open problem is how to adapt learned skills to different conditions that are outside of the data distribution, and, more important, how to preserve the precision of the desired adaptations. This paper presents a novel supervised learning framework called Constrained Equation Learner Networks that addresses the trajectory adaptation problem in Programming by Demonstrations from a constrained regression perspective. While conventional approaches for constrained regression use one kind of basis function, e.g., Gaussian, we exploit Equation Learner Networks to learn a set of analytical expressions and use them as basis functions. These basis functions are learned from demonstration with the objective to minimize deviations from the training data while imposing constraints that represent the desired adaptations, like new initial or final points or maintaining the trajectory within given bounds. Our approach addresses three main difficulties in adapting robotic trajectories: 1) minimizing the distortion of the trajectory for new adaptations; 2) preserving the precision of the adaptations; and 3) dealing with the lack of intuition about the structure of basis functions. We validate our approach both in simulation and in real experiments in a set of robotic tasks that require adaptation due to changes in the environment, and we compare obtained results with two existing approaches. Performed experiments show that Constrained Equation Learner Networks outperform state of the art approaches by increasing generalization and adaptability of robotic skills.

RODec 31, 2023Code
Effect of Optimizer, Initializer, and Architecture of Hypernetworks on Continual Learning from Demonstration

Sayantan Auddy, Sebastian Bergner, Justus Piater

In continual learning from demonstration (CLfD), a robot learns a sequence of real-world motion skills continually from human demonstrations. Recently, hypernetworks have been successful in solving this problem. In this paper, we perform an exploratory study of the effects of different optimizers, initializers, and network architectures on the continual learning performance of hypernetworks for CLfD. Our results show that adaptive learning rate optimizers work well, but initializers specially designed for hypernetworks offer no advantages for CLfD. We also show that hypernetworks that are capable of stable trajectory predictions are robust to different network architectures. Our open-source code is available at https://github.com/sebastianbergner/ExploringCLFD.

LGNov 11, 2025
Dynamic Sparsity: Challenging Common Sparsity Assumptions for Learning World Models in Robotic Reinforcement Learning Benchmarks

Muthukumar Pandaram, Jakob Hollenstein, David Drexel et al.

The use of learned dynamics models, also known as world models, can improve the sample efficiency of reinforcement learning. Recent work suggests that the underlying causal graphs of such dynamics models are sparsely connected, with each of the future state variables depending only on a small subset of the current state variables, and that learning may therefore benefit from sparsity priors. Similarly, temporal sparsity, i.e. sparsely and abruptly changing local dynamics, has also been proposed as a useful inductive bias. In this work, we critically examine these assumptions by analyzing ground-truth dynamics from a set of robotic reinforcement learning environments in the MuJoCo Playground benchmark suite, aiming to determine whether the proposed notions of state and temporal sparsity actually tend to hold in typical reinforcement learning tasks. We study (i) whether the causal graphs of environment dynamics are sparse, (ii) whether such sparsity is state-dependent, and (iii) whether local system dynamics change sparsely. Our results indicate that global sparsity is rare, but instead the tasks show local, state-dependent sparsity in their dynamics and this sparsity exhibits distinct structures, appearing in temporally localized clusters (e.g., during contact events) and affecting specific subsets of state dimensions. These findings challenge common sparsity prior assumptions in dynamics learning, emphasizing the need for grounded inductive biases that reflect the state-dependent sparsity structure of real-world dynamics.

ROFeb 14, 2022Code
Continual Learning from Demonstration of Robotics Skills

Sayantan Auddy, Jakob Hollenstein, Matteo Saveriano et al.

Methods for teaching motion skills to robots focus on training for a single skill at a time. Robots capable of learning from demonstration can considerably benefit from the added ability to learn new movement skills without forgetting what was learned in the past. To this end, we propose an approach for continual learning from demonstration using hypernetworks and neural ordinary differential equation solvers. We empirically demonstrate the effectiveness of this approach in remembering long sequences of trajectory learning tasks without the need to store any data from past demonstrations. Our results show that hypernetworks outperform other state-of-the-art continual learning approaches for learning from demonstration. In our experiments, we use the popular LASA benchmark, and two new datasets of kinesthetic demonstrations collected with a real robot that we introduce in this paper called the HelloWorld and RoboTasks datasets. We evaluate our approach on a physical robot and demonstrate its effectiveness in learning real-world robotic tasks involving changing positions as well as orientations. We report both trajectory error metrics and continual learning metrics, and we propose two new continual learning metrics. Our code, along with the newly collected datasets, is available at https://github.com/sayantanauddy/clfd.

LGDec 18, 2023
Colored Noise in PPO: Improved Exploration and Performance through Correlated Action Sampling

Jakob Hollenstein, Georg Martius, Justus Piater

Proximal Policy Optimization (PPO), a popular on-policy deep reinforcement learning method, employs a stochastic policy for exploration. In this paper, we propose a colored noise-based stochastic policy variant of PPO. Previous research highlighted the importance of temporal correlation in action noise for effective exploration in off-policy reinforcement learning. Building on this, we investigate whether correlated noise can also enhance exploration in on-policy methods like PPO. We discovered that correlated noise for action selection improves learning performance and outperforms the currently popular uncorrelated white noise approach in on-policy methods. Unlike off-policy learning, where pink noise was found to be highly effective, we found that a colored noise, intermediate between white and pink, performed best for on-policy learning in PPO. We examined the impact of varying the amount of data collected for each update by modifying the number of parallel simulation environments for data collection and observed that with a larger number of parallel environments, more strongly correlated noise is beneficial. Due to the significant impact and ease of implementation, we recommend switching to correlated noise as the default noise source in PPO.

RODec 29, 2023
Unified Task and Motion Planning using Object-centric Abstractions of Motion Constraints

Alejandro Agostini, Justus Piater

In task and motion planning (TAMP), the ambiguity and underdetermination of abstract descriptions used by task planning methods make it difficult to characterize physical constraints needed to successfully execute a task. The usual approach is to overlook such constraints at task planning level and to implement expensive sub-symbolic geometric reasoning techniques that perform multiple calls on unfeasible actions, plan corrections, and re-planning until a feasible solution is found. We propose an alternative TAMP approach that unifies task and motion planning into a single heuristic search. Our approach is based on an object-centric abstraction of motion constraints that permits leveraging the computational efficiency of off-the-shelf AI heuristic search to yield physically feasible plans. These plans can be directly transformed into object and motion parameters for task execution without the need of intensive sub-symbolic geometric reasoning.

ROApr 3, 2024
Unsupervised Learning of Effective Actions in Robotics

Marko Zaric, Jakob Hollenstein, Justus Piater et al.

Learning actions that are relevant to decision-making and can be executed effectively is a key problem in autonomous robotics. Current state-of-the-art action representations in robotics lack proper effect-driven learning of the robot's actions. Although successful in solving manipulation tasks, deep learning methods also lack this ability, in addition to their high cost in terms of memory or training data. In this paper, we propose an unsupervised algorithm to discretize a continuous motion space and generate "action prototypes", each producing different effects in the environment. After an exploration phase, the algorithm automatically builds a representation of the effects and groups motions into action prototypes, where motions more likely to produce an effect are represented more than those that lead to negligible changes. We evaluate our method on a simulated stair-climbing reinforcement learning task, and the preliminary results show that our effect driven discretization outperforms uniformly and randomly sampled discretizations in convergence speed and maximum reward.

ROMar 17, 2021
Learning Descriptor of Constrained Task from Demonstration

Xiang Zhang, Matteo Saveriano, Justus Piater

Constrained objects, such as doors and drawers are often complex and share a similar structure in the human environment. A robot needs to interact accurately with constrained objects to safely and successfully complete a task. Learning from Demonstration offers an appropriate path to learn the object structure of the demonstration for unknown objects for unknown tasks. There is work that extracts the kinematic model from motion. However, the gap remains when the robot faces a new object with a similar model but different contexts, e.g. size, appearance, etc. In this paper, we propose a framework that integrates all the information needed to learn a constrained motion from a depth camera into a descriptor of the constrained task. The descriptor consists of object information, grasping point model, constrained model, and reference frame model. By associating constrained learning and reference frame with the constrained object, we demonstrate that the robot can learn the book opening model and parameter of the constraints from demonstration and generalize to novel books.

RODec 4, 2020
DeepSym: Deep Symbol Generation and Rule Learning from Unsupervised Continuous Robot Interaction for Planning

Alper Ahmetoglu, M. Yunus Seker, Justus Piater et al.

We propose a novel general method that finds action-grounded, discrete object and effect categories and builds probabilistic rules over them for non-trivial action planning. Our robot interacts with objects using an initial action repertoire that is assumed to be acquired earlier and observes the effects it can create in the environment. To form action-grounded object, effect, and relational categories, we employ a binary bottleneck layer in a predictive, deep encoder-decoder network that takes the image of the scene and the action applied as input, and generates the resulting effects in the scene in pixel coordinates. After learning, the binary latent vector represents action-driven object categories based on the interaction experience of the robot. To distill the knowledge represented by the neural network into rules useful for symbolic reasoning, a decision tree is trained to reproduce its decoder function. Probabilistic rules are extracted from the decision paths of the tree and are represented in the Probabilistic Planning Domain Definition Language (PPDDL), allowing off-the-shelf planners to operate on the knowledge extracted from the sensorimotor experience of the robot. The deployment of the proposed approach for a simulated robotic manipulator enabled the discovery of discrete representations of object properties such as `rollable' and `insertable'. In turn, the use of these representations as symbols allowed the generation of effective plans for achieving goals, such as building towers of the desired height, demonstrating the effectiveness of the approach for multi-step object manipulation. Finally, we demonstrate that the system is not only restricted to the robotics domain by assessing its applicability to the MNIST 8-puzzle domain in which learned symbols allow for the generation of plans that move the empty tile into any given position.

LGOct 29, 2020
How do Offline Measures for Exploration in Reinforcement Learning behave?

Jakob J. Hollenstein, Sayantan Auddy, Matteo Saveriano et al.

Sufficient exploration is paramount for the success of a reinforcement learning agent. Yet, exploration is rarely assessed in an algorithm-independent way. We compare the behavior of three data-based, offline exploration metrics described in the literature on intuitive simple distributions and highlight problems to be aware of when using them. We propose a fourth metric,uniform relative entropy, and implement it using either a k-nearest-neighbor or a nearest-neighbor-ratio estimator, highlighting that the implementation choices have a profound impact on these measures.

LGOct 24, 2020
Improving the Exploration of Deep Reinforcement Learning in Continuous Domains using Planning for Policy Search

Jakob J. Hollenstein, Erwan Renaudo, Matteo Saveriano et al.

Local policy search is performed by most Deep Reinforcement Learning (D-RL) methods, which increases the risk of getting trapped in a local minimum. Furthermore, the availability of a simulation model is not fully exploited in D-RL even in simulation-based training, which potentially decreases efficiency. To better exploit simulation models in policy search, we propose to integrate a kinodynamic planner in the exploration strategy and to learn a control policy in an offline fashion from the generated environment interactions. We call the resulting model-based reinforcement learning method PPS (Planning for Policy Search). We compare PPS with state-of-the-art D-RL methods in typical RL settings including underactuated systems. The comparison shows that PPS, guided by the kinodynamic planner, collects data from a wider region of the state space. This generates training data that helps PPS discover better policies.

ROJul 21, 2020
Reconfigurable Behavior Trees: Towards an Executive Framework Meeting High-level Decision Making and Control Layer Features

Pilar de la Cruz, Justus Piater, Matteo Saveriano

Behavior Trees constitute a widespread AI tool which has been successfully spun out in robotics. Their advantages include simplicity, modularity, and reusability of code. However, Behavior Trees remain a high-level decision making engine; control features cannot be easily integrated. This paper proposes the Reconfigurable Behavior Trees (RBTs), an extension of the traditional BTs that considers physical constraints from the robotic environment in the decision making process. We endow RBTs with continuous sensory information that permits the online monitoring of the task execution. The resulting stimulus-driven architecture is capable of dynamically handling changes in the executive context while keeping the execution time low. The proposed framework is evaluated on a set of robotic experiments. The results show that RBTs are a promising approach for robotic task representation, monitoring, and execution.

CVJan 29, 2020
Evaluating the Progress of Deep Learning for Visual Relational Concepts

Sebastian Stabinger, Peer David, Justus Piater et al.

Convolutional Neural Networks (CNNs) have become the state of the art method for image classification in the last ten years. Despite the fact that they achieve superhuman classification accuracy on many popular datasets, they often perform much worse on more abstract image classification tasks. We will show that these difficult tasks are linked to relational concepts from cognitive psychology and that despite progress over the last few years, such relational reasoning tasks still remain difficult for current neural network architectures. We will review deep learning research that is linked to relational concept learning, even if it was not originally presented from this angle. Reviewing the current literature, we will argue that some form of attention will be an important component of future systems to solve relational tasks. In addition, we will point out the shortcomings of currently used datasets, and we will recommend steps to make future datasets more relevant for testing systems on relational reasoning.

ROSep 12, 2018
Action Representations in Robotics: A Taxonomy and Systematic Classification

Philipp Zech, Erwan Renaudo, Simon Haller et al.

Understanding and defining the meaning of "action" is substantial for robotics research. This becomes utterly evident when aiming at equipping autonomous robots with robust manipulation skills for action execution. Unfortunately, to this day we still lack both a clear understanding of the concept of an action and a set of established criteria that ultimately characterize an action. In this survey we thus first review existing ideas and theories on the notion and meaning of action. Subsequently we discuss the role of action in robotics and attempt to give a seminal definition of action in accordance with its use in robotics research. Given this definition we then introduce a taxonomy for categorizing action representations in robotics along various dimensions. Finally, we provide a systematic literature survey on action representations in robotics where we categorize relevant literature along our taxonomy. After discussing the current state of the art we conclude with an outlook towards promising research directions.

ROMay 11, 2018
Learning Movement Assessment Primitives for Force Interaction Skills

Xiang Zhang, Athanasios S. Polydoros, Justus Piater

We present a novel, reusable and task-agnostic primitive for assessing the outcome of a force-interaction robotic skill, useful e.g.\ for applications such as quality control in industrial manufacturing. The proposed method is easily programmed by kinesthetic teaching, and the desired adaptability and reusability are achieved by machine learning models. The primitive records sensory data during both demonstrations and reproductions of a movement. Recordings include the end-effector's Cartesian pose and exerted wrench at each time step. The collected data are then used to train Gaussian Processes which create models of the wrench as a function of the robot's pose. The similarity between the wrench models of the demonstration and the movement's reproduction is derived by measuring their Hellinger distance. This comparison creates features that are fed as inputs to a Naive Bayes classifier which estimates the movement's probability of success. The evaluation is performed on two diverse robotic assembly tasks -- snap-fitting and screwing -- with a total of 5 use cases, 11 demonstrations, and more than 200 movement executions. The performance metrics prove the proposed method's capability of generalization to different demonstrations and movements.

AIJan 26, 2018
Symbol Emergence in Cognitive Developmental Systems: a Survey

Tadahiro Taniguchi, Emre Ugur, Matej Hoffmann et al.

Humans use signs, e.g., sentences in a spoken language, for communication and thought. Hence, symbol systems like language are crucial for our communication with other agents and adaptation to our real-world environment. The symbol systems we use in our human society adaptively and dynamically change over time. In the context of artificial intelligence (AI) and cognitive systems, the symbol grounding problem has been regarded as one of the central problems related to {\it symbols}. However, the symbol grounding problem was originally posed to connect symbolic AI and sensorimotor information and did not consider many interdisciplinary phenomena in human communication and dynamic symbol systems in our society, which semiotics considered. In this paper, we focus on the symbol emergence problem, addressing not only cognitive dynamics but also the dynamics of symbol systems in society, rather than the symbol grounding problem. We first introduce the notion of a symbol in semiotics from the humanities, to leave the very narrow idea of symbols in symbolic AI. Furthermore, over the years, it became more and more clear that symbol emergence has to be regarded as a multifaceted problem. Therefore, secondly, we review the history of the symbol emergence problem in different fields, including both biological and artificial systems, showing their mutual relations. We summarize the discussion and provide an integrative viewpoint and comprehensive overview of symbol emergence in cognitive systems. Additionally, we describe the challenges facing the creation of cognitive systems that can be part of symbol emergence systems.

ROSep 18, 2017
A novel Skill-based Programming Paradigm based on Autonomous Playing and Skill-centric Testing

Simon Hangl, Andreas Mennel, Justus Piater

We introduce a novel paradigm for robot pro- gramming with which we aim to make robot programming more accessible for unexperienced users. In order to do so we incorporate two major components in one single framework: autonomous skill acquisition by robotic playing and visual programming. Simple robot program skeletons solving a task for one specific situation, so-called basic behaviours, are provided by the user. The robot then learns how to solve the same task in many different situations by autonomous playing which reduces the barrier for unexperienced robot programmers. Programmers can use a mix of visual programming and kinesthetic teaching in order to provide these simple program skeletons. The robot program can be implemented interactively by programming parts with visual programming and kinesthetic teaching. We further integrate work on experience-based skill-centric robot software testing which enables the user to continuously test implemented skills without having to deal with the details of specific components.

ROJun 26, 2017
Skill Learning by Autonomous Robotic Playing using Active Learning and Creativity

Simon Hangl, Vedran Dunjko, Hans J. Briegel et al.

We treat the problem of autonomous acquisition of manipulation skills where problem-solving strategies are initially available only for a narrow range of situations. We propose to extend the range of solvable situations by autonomous playing with the object. By applying previously-trained skills and behaviours, the robot learns how to prepare situations for which a successful strategy is already known. The information gathered during autonomous play is additionally used to learn an environment model. This model is exploited for active learning and the creative generation of novel preparatory behaviours. We apply our approach on a wide range of different manipulation tasks, e.g. book grasping, grasping of objects of different sizes by selecting different grasping strategies, placement on shelves, and tower disassembly. We show that the creative behaviour generation mechanism enables the robot to solve previously-unsolvable tasks, e.g. tower disassembly. We use success statistics gained during real-world experiments to simulate the convergence behaviour of our system. Experiments show that active improves the learning speed by around 9 percent in the book grasping scenario.

ROMar 2, 2017
Autonomous Skill-centric Testing using Deep Learning

Simon Hangl, Sebastian Stabinger, Justus Piater

Software testing is an important tool to ensure software quality. This is a hard task in robotics due to dynamic environments and the expensive development and time-consuming execution of test cases. Most testing approaches use model-based and / or simulation-based testing to overcome these problems. We propose model-free skill-centric testing in which a robot autonomously executes skills in the real world and compares it to previous experiences. The skills are selected by maximising the expected information gain on the distribution of erroneous software functions. We use deep learning to model the sensor data observed during previous successful skill executions and to detect irregularities. Sensor data is connected to function call profiles such that certain misbehaviour can be related to specific functions. We evaluate our approach in simulation and in experiments with a KUKA LWR 4+ robot by purposefully introducing bugs to the software. We demonstrate that these bugs can be detected with high accuracy and without the need for the implementation of specific tests or task-specific models.

RONov 19, 2016
Active and Transfer Learning of Grasps by Kernel Adaptive MCMC

Philipp Zech, Hanchen Xiong, Justus Piater

Human ability of both versatile grasping of given objects and grasping of novel (as of yet unseen) objects is truly remarkable. This probably arises from the experience infants gather by actively playing around with diverse objects. Moreover, knowledge acquired during this process is reused during learning of how to grasp novel objects. We conjecture that this combined process of active and transfer learning boils down to a random search around an object, suitably biased by prior experience, to identify promising grasps. In this paper we present an active learning method for learning of grasps for given objects, and a transfer learning method for learning of grasps for novel objects. Our learning methods apply a kernel adaptive Metropolis-Hastings sampler that learns an approximation of the grasps' probability density of an object while drawing grasp proposals from it. The sampler employs simulated annealing to search for globally-optimal grasps. Our empirical results show promising applicability of our proposed learning schemes.

RONov 19, 2016
Active and Transfer Learning of Grasps by Sampling from Demonstration

Philipp Zech, Justus Piater

We guess humans start acquiring grasping skills as early as at the infant stage by virtue of two key processes. First, infants attempt to learn grasps for known objects by imitating humans. Secondly, knowledge acquired during this process is reused in learning to grasp novel objects. We argue that these processes of active and transfer learning boil down to a random search of grasps on an object, suitably biased by prior experience. In this paper we introduce active learning of grasps for known objects as well as transfer learning of grasps for novel objects grounded on kernel adaptive, mode-hopping Markov Chain Monte Carlo. Our experiments show promising applicability of our proposed learning methods.

RONov 19, 2016
Grasp Learning by Sampling from Demonstration

Philipp Zech, Justus Piater

Robotic grasping traditionally relies on object features or shape information for learning new or applying already learned grasps. We argue however that such a strong reliance on object geometric information renders grasping and grasp learning a difficult task in the event of cluttered environments with high uncertainty where reasonable object models are not available. This being so, in this paper we thus investigate the application of model-free stochastic optimization for grasp learning. For this, our proposed learning method requires just a handful of user-demonstrated grasps and an initial prior by a rough sketch of an object's grasp affordance density, yet no object geometric knowledge except for its pose. Our experiments show promising applicability of our proposed learning method.

CVJul 28, 2016
25 years of CNNs: Can we compare to human abstraction capabilities?

Sebastian Stabinger, Antonio Rodríguez-Sánchez, Justus Piater

We try to determine the progress made by convolutional neural networks over the past 25 years in classifying images into abstractc lasses. For this purpose we compare the performance of LeNet to that of GoogLeNet at classifying randomly generated images which are differentiated by an abstract property (e.g., one class contains two objects of the same size, the other class two objects of different sizes). Our results show that there is still work to do in order to solve vision problems humans are able to solve without much difficulty.

CVJun 17, 2016
Learning Abstract Classes using Deep Learning

Sebastian Stabinger, Antonio Rodriguez-Sanchez, Justus Piater

Humans are generally good at learning abstract concepts about objects and scenes (e.g.\ spatial orientation, relative sizes, etc.). Over the last years convolutional neural networks have achieved almost human performance in recognizing concrete classes (i.e.\ specific object categories). This paper tests the performance of a current CNN (GoogLeNet) on the task of differentiating between abstract classes which are trivially differentiable for humans. We trained and tested the CNN on the two abstract classes of horizontal and vertical orientation and determined how well the network is able to transfer the learned classes to other, previously unseen objects.

ROMar 2, 2016
Robotic Playing for Hierarchical Complex Skill Learning

Simon Hangl, Emre Ugur, Sandor Szedmak et al.

In complex manipulation scenarios (e.g. tasks requiring complex interaction of two hands or in-hand manipulation), generalization is a hard problem. Current methods still either require a substantial amount of (supervised) training data and / or strong assumptions on both the environment and the task. In this paradigm, controllers solving these tasks tend to be complex. We propose a paradigm of maintaining simpler controllers solving the task in a small number of specific situations. In order to generalize to novel situations, the robot transforms the environment from novel situations into a situation where the solution of the task is already known. Our solution to this problem is to play with objects and use previously trained skills (basis skills). These skills can either be used for estimating or for changing the current state of the environment and are organized in skill hierarchies. The approach is evaluated in complex pick-and-place scenarios that involve complex manipulation. We further show that these skills can be learned by autonomous playing.