LG, Feb 29, 2024
Dr. Strategy: Model-Based Generalist Agents with Strategic Dreaming
Hany Hamed, Subin Kim, Dongyeong Kim et al.
Model-based reinforcement learning (MBRL) has been a primary approach to improving sample efficiency as well as to building generalist agents. However, there has not been much effort toward enhancing the strategy of dreaming itself. It therefore remains an open question whether and how an agent can "dream better" in a more structured and strategic way. In this paper, inspired by the observation from cognitive science that humans use a spatial divide-and-conquer strategy in planning, we propose a new MBRL agent, called Dr. Strategy, which is equipped with a novel Dreaming Strategy. The proposed agent realizes a divide-and-conquer-like strategy in dreaming. This is achieved by learning a set of latent landmarks and then using them to learn a landmark-conditioned highway policy. With the highway policy, the agent can first learn in the dream to move to a landmark, and from there it tackles the exploration and achievement task in a more focused way. In experiments, we show that the proposed model outperforms prior pixel-based MBRL methods in various visually complex and partially observable navigation tasks.
LG, Mar 25, 2025
Extendable Planning via Multiscale Diffusion
Chang Chen, Hany Hamed, Doojin Baek et al.
Long-horizon planning is crucial in complex environments, but diffusion-based planners like Diffuser are limited by the trajectory lengths observed during training. This creates a dilemma: long trajectories are needed for effective planning, yet they degrade model performance. In this paper, we introduce this extendable long-horizon planning challenge and propose a two-phase solution. First, Progressive Trajectory Extension incrementally constructs longer trajectories through multi-round compositional stitching. Second, the Hierarchical Multiscale Diffuser enables efficient training and inference over long horizons by reasoning across temporal scales. To avoid the need for multiple separate models, we propose Adaptive Plan Pondering and the Recursive HM-Diffuser, which unify hierarchical planning within a single model. Experiments show our approach yields strong performance gains, advancing scalable and efficient decision-making over long horizons.
RO, Feb 12, 2022
Optimization-based Trajectory Tracking Approach for Multi-rotor Aerial Vehicles in Unknown Environments
Geesara Kulathunga, Hany Hamed, Dmitry Devitt et al.
The goal of this paper is to develop a continuous optimization-based refinement of the reference trajectory to "push it out" of the obstacle-occupied space in the global phase for Multi-rotor Aerial Vehicles in unknown environments. Our proposed approach comprises two planners: a global planner and a local planner. The global planner refines the initial reference trajectory when the trajectory passes through or near an obstacle and lets the local planner calculate a near-optimal control policy. The global planner comprises two convex programming approaches: the first refines the reference trajectory, and the second recovers the reference trajectory if the first approach fails to refine it. The global planner mainly focuses on real-time performance and obstacle avoidance, whereas the proposed formulation of the constrained nonlinear model predictive control-based local planner ensures safety, dynamic feasibility, and reference-trajectory tracking accuracy for low-speed maneuvers, provided that the local and global planners have mean computation times of 0.06 s (15 Hz) and 0.05 s (20 Hz), respectively, on an NVIDIA Jetson Xavier NX computer. The results of our experiments confirmed that, in cluttered environments, the proposed approach outperformed three other approaches: sampling-based pathfinding followed by trajectory generation, a local planner, and graph-based pathfinding followed by trajectory generation.
RO, Apr 6, 2020
Learning Stabilizing Control Policies for a Tensegrity Hopper with Augmented Random Search
Vladislav Kurenkov, Hany Hamed, Sergei Savin
In this paper, we consider the tensegrity hopper, a novel tensegrity-based robot capable of moving by hopping. The paper focuses on the design of stabilizing control policies, which are obtained with the Augmented Random Search method. In particular, we search for control policies that allow the hopper to maintain vertical stability after performing a single jump. We demonstrate that the hopper can maintain a vertical configuration under different initial conditions and with changing control frequencies. In particular, lowering the control frequency from 1000 Hz in training to 500 Hz in execution did not affect the success rate of the balancing task.
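
The abstract above names Augmented Random Search but gives no algorithmic detail. As a rough illustration of the general idea only, here is a minimal sketch of a basic ARS-style update (symmetric random perturbations, top-b direction selection, step scaled by the standard deviation of the collected rewards). It is not the authors' implementation: the function name `ars_basic`, all hyperparameter values, and the quadratic toy reward standing in for a hopper rollout are assumptions made for this example.

```python
import numpy as np


def ars_basic(evaluate, dim, n_iters=300, n_dirs=8, top_b=4,
              step=0.02, noise=0.03, seed=0):
    """Basic ARS-style search over a flat parameter vector (illustrative).

    `evaluate` maps a parameter vector to a scalar reward. All names and
    default values here are hypothetical, chosen for the toy problem below.
    """
    rng = np.random.default_rng(seed)
    theta = np.zeros(dim)
    for _ in range(n_iters):
        # Sample random directions and evaluate symmetric perturbations.
        deltas = rng.standard_normal((n_dirs, dim))
        r_plus = np.array([evaluate(theta + noise * d) for d in deltas])
        r_minus = np.array([evaluate(theta - noise * d) for d in deltas])

        # Keep the top-b directions ranked by max(r_plus, r_minus).
        best = np.argsort(-np.maximum(r_plus, r_minus))[:top_b]

        # Scale the step by the std of the rewards that are actually used.
        sigma_r = np.concatenate([r_plus[best], r_minus[best]]).std()
        if sigma_r < 1e-12:
            continue  # all rewards identical: no usable search signal

        diff = (r_plus[best] - r_minus[best])[:, None]
        theta = theta + step / (top_b * sigma_r) * (diff * deltas[best]).sum(axis=0)
    return theta


# Toy stand-in for a rollout: reward peaks when theta equals `target`.
target = np.array([1.0, -2.0, 0.5])
theta = ars_basic(lambda p: -np.sum((p - target) ** 2), dim=3)
print(np.linalg.norm(theta - target))  # distance to the optimum after search
```

In a control setting, `evaluate` would instead run one episode with a policy parameterized by the vector and return the episode reward; the toy reward is used here only so the sketch runs without a simulator.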