CVDec 3, 2025
KeyPointDiffuser: Unsupervised 3D Keypoint Learning via Latent Diffusion ModelsRhys Newbury, Juyan Zhang, Tin Tran et al.
Understanding and representing the structure of 3D objects in an unsupervised manner remains a core challenge in computer vision and graphics. Most existing unsupervised keypoint methods are not designed for unconditional generative settings, restricting their use in modern 3D generative pipelines; our formulation explicitly bridges this gap. We present an unsupervised framework for learning spatially structured 3D keypoints from point cloud data. These keypoints serve as a compact and interpretable representation that conditions an Elucidated Diffusion Model (EDM) to reconstruct the full shape. The learned keypoints exhibit repeatable spatial structure across object instances and support smooth interpolation in keypoint space, indicating that they capture geometric variation. Our method achieves strong performance across diverse object categories, yielding a 6 percentage-point improvement in keypoint consistency compared to prior approaches.
LGAug 3, 2025
Why Heuristic Weighting Works: A Theoretical Analysis of Denoising Score MatchingJuyan Zhang, Rhys Newbury, Xinyang Zhang et al.
Score matching enables the estimation of the gradient of a data distribution, a key component in denoising diffusion models used to recover clean data from corrupted inputs. In prior work, a heuristic weighting function has been used for the denoising score matching loss without formal justification. In this work, we demonstrate that heteroskedasticity is an inherent property of the denoising score matching objective. This insight leads to a principled derivation of optimal weighting functions for generalized, arbitrary-order denoising score matching losses, without requiring assumptions about the noise distribution. Among these, the first-order formulation is especially relevant to diffusion models. We show that the widely used heuristical weighting function arises as a first-order Taylor approximation to the trace of the expected optimal weighting. We further provide theoretical and empirical comparisons, revealing that the heuristical weighting, despite its simplicity, can achieve lower variance than the optimal weighting with respect to parameter gradients, which can facilitate more stable and efficient training.
ROFeb 25, 2022
Visibility Maximization Controller for Robotic ManipulationKerry He, Rhys Newbury, Tin Tran et al.
Occlusions caused by a robot's own body is a common problem for closed-loop control methods employed in eye-to-hand camera setups. We propose an optimization-based reactive controller that minimizes self-occlusions while achieving a desired goal pose. The approach allows coordinated control between the robot's base, arm and head by encoding the line-of-sight visibility to the target as a soft constraint along with other task-related constraints, and solving for feasible joint and base velocities. The generalizability of the approach is demonstrated in simulated and real-world experiments, on robots with fixed or mobile bases, with moving or fixed objects, and multiple objects. The experiments revealed a trade-off between occlusion rates and other task metrics. While a planning-based baseline achieved lower occlusion rates than the proposed controller, it came at the expense of highly inefficient paths and a significant drop in the task success. On the other hand, the proposed controller is shown to improve visibility to the line target object(s) without sacrificing too much from the task success and efficiency. Videos and code can be found at: rhys-newbury.github.io/projects/vmc/.
ROAug 29, 2021
An Experimental Validation and Comparison of Reaching Motion Models for Unconstrained Handovers: Towards Generating Humanlike Motions for Human-Robot HandoversWesley P. Chan, Tin Tran, Sara Sheikholeslami et al.
The Minimum Jerk motion model has long been cited in literature for human point-to-point reaching motions in single-person tasks. While it has been demonstrated that applying minimum-jerk-like trajectories to robot reaching motions in the joint action task of human-robot handovers allows a robot giver to be perceived as more careful, safe, and skilled, it has not been verified whether human reaching motions in handovers follow the Minimum Jerk model. To experimentally test and verify motion models for human reaches in handovers, we examined human reaching motions in unconstrained handovers (where the person is allowed to move their whole body) and fitted against 1) the Minimum Jerk model, 2) its variation, the Decoupled Minimum Jerk model, and 3) the recently proposed Elliptical (Conic) model. Results showed that Conic model fits unconstrained human handover reaching motions best. Furthermore, we discovered that unlike constrained, single-person reaching motions, which have been found to be elliptical, there is a split between elliptical and hyperbolic conic types. We expect our results will help guide generation of more humanlike reaching motions for human-robot handover tasks.
ROApr 7, 2021
Demonstrating Cloth Folding to Robots: Design and Evaluation of a 2D and a 3D User InterfaceBenjamin Waymouth, Akansel Cosgun, Rhys Newbury et al.
An appropriate user interface to collect human demonstration data for deformable object manipulation has been mostly overlooked in the literature. We present an interaction design for demonstrating cloth folding to robots. Users choose pick and place points on the cloth and can preview a visualization of a simulated cloth before real-robot execution. Two interfaces are proposed: A 2D display-and-mouse interface where points are placed by clicking on an image of the cloth, and a 3D Augmented Reality interface where the chosen points are placed by hand gestures. We conduct a user study with 18 participants, in which each user completed two sequential folds to achieve a cloth goal shape. Results show that while both interfaces were acceptable, the 3D interface was found to be more suitable for understanding the task, and the 2D interface suitable for repetition. Results also found that fold previews improve three key metrics: task efficiency, the ability to predict the final shape of the cloth and overall user satisfaction.