LGOct 24, 2022
Understanding the Evolution of Linear Regions in Deep Reinforcement LearningSetareh Cohan, Nam Hee Kim, David Rolnick et al.
Policies produced by deep reinforcement learning are typically characterised by their learning curves, but they remain poorly understood in many other respects. ReLU-based policies result in a partitioning of the input space into piecewise linear regions. We seek to understand how observed region counts and their densities evolve during deep reinforcement learning using empirical results that span a range of continuous control tasks and policy network dimensions. Intuitively, we may expect that during training, the region density increases in the areas that are frequently visited by the policy, thereby affording fine-grained control. We use recent theoretical and empirical results for the linear regions induced by neural networks in supervised learning settings for grounding and comparison of our results. Empirically, we find that the region density increases only moderately throughout training, as measured along fixed trajectories coming from the final policy. However, the trajectories themselves also increase in length during training, and thus the region densities decrease as seen from the perspective of the current trajectory. Our findings suggest that the complexity of deep reinforcement learning policies does not principally emerge from a significant growth in the complexity of functions observed on-and-around trajectories of the policy.
LGJul 31, 2023
Don't be so negative! Score-based Generative Modeling with Oracle-assisted GuidanceSaeid Naderiparizi, Xiaoxuan Liang, Setareh Cohan et al.
Score-based diffusion models are a powerful class of generative models, widely utilized across diverse domains. Despite significant advancements in large-scale tasks such as text-to-image generation, their application to constrained domains has received considerably less attention. This work addresses model learning in a setting where, in addition to the training dataset, there further exists side-information in the form of an oracle that can label samples as being outside the support of the true data generating distribution. Specifically we develop a new denoising diffusion probabilistic modeling methodology, Gen-neG, that leverages this additional side-information. Gen-neG builds on classifier guidance in diffusion models to guide the generation process towards the positive support region indicated by the oracle. We empirically establish the utility of Gen-neG in applications including collision avoidance in self-driving simulators and safety-guarded human motion generation.
CVMay 17, 2024
Flexible Motion In-betweening with Diffusion ModelsSetareh Cohan, Guy Tevet, Daniele Reda et al.
Motion in-betweening, a fundamental task in character animation, consists of generating motion sequences that plausibly interpolate user-provided keyframe constraints. It has long been recognized as a labor-intensive and challenging process. We investigate the potential of diffusion models in generating diverse human motions guided by keyframes. Unlike previous inbetweening methods, we propose a simple unified model capable of generating precise and diverse motions that conform to a flexible range of user-specified spatial constraints, as well as text conditioning. To this end, we propose Conditional Motion Diffusion In-betweening (CondMDI) which allows for arbitrary dense-or-sparse keyframe placement and partial keyframe constraints while generating high-quality motions that are diverse and coherent with the given keyframes. We evaluate the performance of CondMDI on the text-conditioned HumanML3D dataset and demonstrate the versatility and efficacy of diffusion models for keyframe in-betweening. We further explore the use of guidance and imputation-based approaches for inference-time keyframing and compare CondMDI against these methods.
ROFeb 26, 2025
Diffusion-based Planning with Learned Viability FiltersNicholas Ioannidis, Daniele Reda, Setareh Cohan et al.
Diffusion models can be used as a motion planner by sampling from a distribution of possible futures. However, the samples may not satisfy hard constraints that exist only implicitly in the training data, e.g., avoiding falls or not colliding with a wall. We propose learned viability filters that efficiently predict the future success of any given plan, i.e., diffusion sample, and thereby enforce an implicit future-success constraint. Multiple viability filters can also be composed together. We demonstrate the approach on detailed footstep planning for challenging 3D human locomotion tasks, showing the effectiveness of viability filters in performing online planning and control for box-climbing, step-over walls, and obstacle avoidance. We further show that using viability filters is significantly faster than guidance-based diffusion prediction.