ROSep 23, 2024
Learning Diverse Robot Striking Motions with Diffusion Models and Kinematically Constrained Gradient GuidanceKin Man Lee, Sean Ye, Qingyu Xiao et al.
Advances in robot learning have enabled robots to generate skills for a variety of tasks. Yet, robot learning is typically sample inefficient, struggles to learn from data sources exhibiting varied behaviors, and does not naturally incorporate constraints. These properties are critical for fast, agile tasks such as playing table tennis. Modern techniques for learning from demonstration improve sample efficiency and scale to diverse data, but are rarely evaluated on agile tasks. In the case of reinforcement learning, achieving good performance requires training on high-fidelity simulators. To overcome these limitations, we develop a novel diffusion modeling approach that is offline, constraint-guided, and expressive of diverse agile behaviors. The key to our approach is a kinematic constraint gradient guidance (KCGG) technique that computes gradients through both the forward kinematics of the robot arm and the diffusion model to direct the sampling process. KCGG minimizes the cost of violating constraints while simultaneously keeping the sampled trajectory in-distribution of the training data. We demonstrate the effectiveness of our approach for time-critical robotic tasks by evaluating KCGG in two challenging domains: simulated air hockey and real table tennis. In simulated air hockey, we achieved a 25.4% increase in block rate, while in table tennis, we saw a 17.3% increase in success rate compared to imitation learning baselines.
ROJan 30, 2024
Multi-Camera Asynchronous Ball Localization and Trajectory Prediction with Factor Graphs and Human PosesQingyu Xiao, Zulfiqar Zaidi, Matthew Gombolay
The rapid and precise localization and prediction of a ball are critical for developing agile robots in ball sports, particularly in sports like tennis characterized by high-speed ball movements and powerful spins. The Magnus effect induced by spin adds complexity to trajectory prediction during flight and bounce dynamics upon contact with the ground. In this study, we introduce an innovative approach that combines a multi-camera system with factor graphs for real-time and asynchronous 3D tennis ball localization. Additionally, we estimate hidden states like velocity and spin for trajectory prediction. Furthermore, to enhance spin inference early in the ball's flight, where limited observations are available, we integrate human pose data using a temporal convolutional network (TCN) to compute spin priors within the factor graph. This refinement provides more accurate spin priors at the beginning of the factor graph, leading to improved early-stage hidden state inference for prediction. Our result shows the trained TCN can predict the spin priors with RMSE of 5.27 Hz. Integrating TCN into the factor graph reduces the prediction error of landing positions by over 63.6% compared to a baseline method that utilized an adaptive extended Kalman filter.
LGOct 10, 2025
Constraints-of-Thought: A Framework for Constrained Reasoning in Language-Model-Guided SearchKamel Alrashedy, Vriksha Srihari, Zulfiqar Zaidi et al.
While researchers have made significant progress in enabling large language models (LLMs) to perform multi-step planning, LLMs struggle to ensure that those plans align with high-level user intent and satisfy symbolic constraints, especially in complex, multi-step domains. Existing reasoning approaches such as Chain-of-Thought (CoT), Tree-of-Thought (ToT), and verifier-augmented methods, expand the search space but often yield infeasible actions or hallucinated steps. To overcome these limitations, we propose Constraints-of-Thought (Const-o-T), a framework that provides a structured prior that enables Monte Carlo Tree Search (MCTS) focus search on semantically meaningful paths. Each reasoning step is represented as an (intent, constraint) pair, which serves both to compress the search space and enforce validity. Unlike prior methods that merely generate reasoning traces or validate outputs post hoc, Const-o-T uses (intent, constraint)pairs to actively focus the search toward feasible and meaningful plans. We integrate Const-o-T into MCTS using a structured representation of intent-constraint pairs constraints prune infeasible branches and guide exploration toward semantically valid actions, improving planning efficiency and verifiable decision-making. We demonstrate across three domains Risk game, CAD code generation, and arithmetic reasoning that our approach outperforms baselines, yielding higher accuracy and stronger structural alignment. Our contribution is to demonstrate that Const-of-T offers a generalizable foundation for constraint-guided reasoning, enabling more efficient, constraint-aligned, and domain-adaptable planning with LLMs.