AI · Feb 3
GFlowPO: Generative Flow Network as a Language Model Prompt Optimizer
Junmo Cho, Suhan Kim, Sangjune An et al.
Finding effective prompts for language models (LMs) is critical yet notoriously difficult: the prompt space is combinatorially large, and rewards are sparse because target-LM evaluation is expensive. Moreover, existing RL-based prompt optimizers often rely on on-policy updates and a meta-prompt sampled from a fixed distribution, leading to poor sample efficiency. We propose GFlowPO, a probabilistic prompt optimization framework that casts prompt search as a posterior inference problem over latent prompts regularized by a meta-prompted reference-LM prior. In the first step, we fine-tune a lightweight prompt-LM with an off-policy Generative Flow Network (GFlowNet) objective, using a replay-based training policy that reuses past prompt evaluations to enable sample-efficient exploration. In the second step, we introduce Dynamic Memory Update (DMU), a training-free mechanism that updates the meta-prompt by injecting both (i) diverse prompts from a replay buffer and (ii) top-performing prompts from a small priority queue, thereby progressively concentrating the search process on high-reward regions. Across few-shot text classification, instruction induction, and question answering benchmarks, GFlowPO consistently outperforms recent discrete prompt optimization baselines.
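The two memory sources named for DMU (a replay buffer for diversity and a small priority queue for top performers) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the class `PromptMemory`, its methods, the meta-prompt layout, and the example rewards are hypothetical and are not taken from the paper.

```python
import heapq
import random
from dataclasses import dataclass, field


@dataclass
class PromptMemory:
    """Hypothetical memory: a replay buffer plus a small top-k priority queue."""
    top_k: int = 3
    replay: list = field(default_factory=list)  # every (reward, prompt) evaluated so far
    _heap: list = field(default_factory=list)   # min-heap holding the current top-k

    def add(self, prompt: str, reward: float) -> None:
        """Store an evaluated prompt and keep only the top_k best in the heap."""
        self.replay.append((reward, prompt))
        if len(self._heap) < self.top_k:
            heapq.heappush(self._heap, (reward, prompt))
        elif reward > self._heap[0][0]:
            # Evict the current worst of the top-k in favor of the new prompt.
            heapq.heapreplace(self._heap, (reward, prompt))

    def top(self) -> list:
        """Return the top-k prompts, best first."""
        return [p for _, p in sorted(self._heap, reverse=True)]

    def build_meta_prompt(self, header: str, n_diverse: int, rng: random.Random) -> str:
        """Assemble a meta-prompt from random replay samples and the top-k prompts."""
        diverse = rng.sample(self.replay, min(n_diverse, len(self.replay)))
        lines = [header, "Diverse examples:"]
        lines += [f"- {p}" for _, p in diverse]
        lines.append("Top-performing examples:")
        lines += [f"- {p}" for p in self.top()]
        return "\n".join(lines)


# Made-up prompts and rewards, purely for demonstration.
memory = PromptMemory(top_k=2)
for prompt, reward in [
    ("Classify the sentiment.", 0.61),
    ("Answer briefly.", 0.35),
    ("Label the review as positive or negative.", 0.82),
]:
    memory.add(prompt, reward)

meta_prompt = memory.build_meta_prompt(
    "Write a better instruction.", n_diverse=2, rng=random.Random(0)
)
```

Uniform sampling from the replay buffer is only one possible reading of "diverse prompts"; the abstract does not specify how diversity is enforced.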
RO · Aug 5, 2025
Aerobatic maneuvers in insect-scale flapping-wing aerial robots via deep-learned robust tube model predictive control
Yi-Hsuan Hsiao, Andrea Tagliabue, Owen Matteson et al.
Aerial insects exhibit highly agile maneuvers such as sharp braking, saccades, and body flips under disturbance. In contrast, insect-scale aerial robots are limited to tracking non-aggressive trajectories with small body acceleration. This performance gap stems from a combination of low robot inertia, fast dynamics, uncertainty in flapping-wing aerodynamics, and high susceptibility to environmental disturbance. Executing highly dynamic maneuvers requires generating aggressive flight trajectories that push against the hardware limits, together with a high-rate feedback controller that accounts for model and environmental uncertainty. Here, by designing a deep-learned robust tube model predictive controller, we showcase insect-like flight agility and robustness in a 750-milligram flapping-wing robot. Our model predictive controller can track aggressive flight trajectories under disturbance. To achieve a high feedback rate in a compute-constrained real-time system, we design imitation learning methods to train a two-layer, fully connected neural network, which resembles the insect flight control architecture consisting of the central nervous system and motor neurons. Our robot demonstrates insect-like saccade movements with lateral speed and acceleration of 197 centimeters per second and 11.7 meters per second squared, representing 447% and 255% improvements over prior results. The robot can also perform saccade maneuvers under 160 centimeters per second wind disturbance and large command-to-force mapping errors. Furthermore, it performs 10 consecutive body flips in 11 seconds, the most challenging maneuver among sub-gram flyers. These results represent a milestone in achieving insect-scale flight agility and inspire future investigations on sensing and compute autonomy.
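The idea of distilling an expensive controller into a small network by imitation can be sketched in a few lines of NumPy. This is a generic behavior-cloning illustration under loud assumptions: the "expert" is a made-up linear feedback law standing in for the tube MPC, and the state and action dimensions, hidden width, learning rate, and training data are all hypothetical rather than taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in expert: a fixed linear feedback law u = -K x (NOT the paper's tube MPC).
K = np.array([[1.0, 0.5, 0.0, 0.2],
              [0.0, 0.3, 1.2, 0.1]])


def expert(x):
    return -x @ K.T


# Hypothetical dataset of states and the expert's actions for them.
states = rng.normal(size=(512, 4))
actions = expert(states)

# Two fully connected layers: state -> hidden (tanh) -> action.
hidden = 16
W1 = rng.normal(scale=0.5, size=(4, hidden))
b1 = np.zeros(hidden)
W2 = rng.normal(scale=0.5, size=(hidden, 2))
b2 = np.zeros(2)


def policy(x):
    """Return the predicted action and the hidden activations."""
    h = np.tanh(x @ W1 + b1)
    return h @ W2 + b2, h


def mse(x, u):
    pred, _ = policy(x)
    return float(np.mean((pred - u) ** 2))


initial_loss = mse(states, actions)

lr = 0.05
for _ in range(500):
    pred, h = policy(states)
    err = (pred - actions) / len(states)   # gradient of the mean half-squared error
    grad_W2 = h.T @ err
    grad_b2 = err.sum(axis=0)
    dh = (err @ W2.T) * (1.0 - h ** 2)     # backpropagate through tanh
    grad_W1 = states.T @ dh
    grad_b1 = dh.sum(axis=0)
    # In-place updates so policy() sees the new weights.
    W2 -= lr * grad_W2
    b2 -= lr * grad_b2
    W1 -= lr * grad_W1
    b1 -= lr * grad_b1

final_loss = mse(states, actions)
```

Once trained, evaluating such a network costs two small matrix products per control step, which is the kind of saving that makes a high feedback rate plausible on constrained hardware.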
RO · Mar 4, 2020
Touch the Wind: Simultaneous Airflow, Drag and Interaction Sensing on a Multirotor
Andrea Tagliabue, Aleix Paris, Suhan Kim et al.
Disturbance estimation for Micro Aerial Vehicles (MAVs) is crucial for robustness and safety. In this paper, we use novel, bio-inspired airflow sensors to measure the airflow acting on an MAV, and we fuse this information in an Unscented Kalman Filter (UKF) to simultaneously estimate the three-dimensional wind vector, the drag force, and other interaction forces (e.g., due to collisions or interaction with a human) acting on the robot. To this end, we present and compare a fully model-based strategy and a deep learning-based strategy. The model-based approach considers the dynamics of the MAV and the airflow sensors and their interaction with the wind, while the deep learning-based strategy uses a Long Short-Term Memory (LSTM) neural network to obtain an estimate of the relative airflow, which is then fused in the proposed filter. We validate our methods in hardware experiments, showing that we can accurately estimate relative airflow of up to 4 m/s and that we can differentiate drag from interaction forces.
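The core operation inside any UKF is the unscented transform, which pushes a mean and covariance through a function using a small set of sigma points. The sketch below shows that building block only, not the paper's filter. The six-dimensional state layout (wind followed by vehicle velocity), the sign convention for relative airflow, and all numbers are assumptions chosen for illustration.

```python
import numpy as np


def unscented_transform(mean, cov, f, kappa=0.0):
    """Propagate (mean, cov) through f using 2n+1 symmetric sigma points."""
    n = mean.size
    # Columns of S satisfy S @ S.T == (n + kappa) * cov.
    S = np.linalg.cholesky((n + kappa) * cov)
    points = [mean]
    points += [mean + S[:, i] for i in range(n)]
    points += [mean - S[:, i] for i in range(n)]
    weights = np.full(2 * n + 1, 1.0 / (2.0 * (n + kappa)))
    weights[0] = kappa / (n + kappa)
    ys = np.array([f(p) for p in points])
    y_mean = weights @ ys
    diff = ys - y_mean
    y_cov = (weights[:, None] * diff).T @ diff
    return y_mean, y_cov


def relative_airflow(x):
    """Assumed convention: relative airflow = wind - vehicle velocity."""
    return x[:3] - x[3:]


# Hypothetical state: wind (m/s) followed by vehicle velocity (m/s).
mean = np.array([2.0, 0.0, 0.0, 0.5, 0.0, 0.0])
cov = np.diag([0.4, 0.4, 0.4, 0.05, 0.05, 0.05])

rel_mean, rel_cov = unscented_transform(mean, cov, relative_airflow)
```

For a linear map such as this one, the transform reproduces the exact propagated mean and covariance; its value in a filter comes from applying the same procedure to nonlinear dynamics and measurement models.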