70.2 · RO · May 12
Premover: Fast Vision-Language-Action Control by Acting Before Instructions Are Complete
Joonha Park, Jiseung Jeong, Taesik Gong
Vision-Language-Action (VLA) policies are typically evaluated as if the user had finished typing or speaking before the robot begins acting. In real deployment, however, users take several seconds to enter a request, leaving the policy idle for a substantial fraction of the interaction. We introduce Premover, a lightweight module that converts this idle window into useful precomputation. Premover keeps the VLA backbone frozen and attaches two small projection heads, one for image patches, one for language tokens, that map an intermediate layer of the backbone into a shared space. The resulting focus map is supervised by simulator-rendered target-object segmentation masks and applied as a per-patch reweighting of the next step's image tokens. A single scalar readiness threshold, trained jointly from streaming prefixes, decides when the policy should begin acting. On the LIBERO benchmark suite, Premover reduces mean wall-clock time from 34.0 to 29.4 seconds, a 13.6% reduction, while matching the full-prompt baseline's success rate (95.1% vs. 95.0%); naive premoving, by contrast, collapses to 66.4%.
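The abstract does not say how the two projections are combined into a focus map, so the following is only a minimal NumPy sketch under stated assumptions: cosine similarity between projected patches and projected tokens, a max over tokens, and a sigmoid squashing. The names `focus_map`, `reweight_patches`, `is_ready`, the `scale` parameter, and the use of the peak focus value as a readiness score are all hypothetical and are not taken from the paper.

```python
import numpy as np

def focus_map(patch_feats, token_feats, W_img, W_txt, scale=5.0):
    """Per-patch focus weights in (0, 1).

    patch_feats: (P, D) intermediate-layer features for image patches.
    token_feats: (T, D) intermediate-layer features for the prompt prefix.
    W_img, W_txt: (D, K) projection heads into a shared K-dim space.
    ASSUMPTION: cosine similarity + max over tokens + sigmoid.
    """
    z_img = patch_feats @ W_img
    z_txt = token_feats @ W_txt
    # Normalise rows so the dot product below is a cosine similarity.
    z_img = z_img / (np.linalg.norm(z_img, axis=1, keepdims=True) + 1e-8)
    z_txt = z_txt / (np.linalg.norm(z_txt, axis=1, keepdims=True) + 1e-8)
    sim = z_img @ z_txt.T                 # (P, T)
    score = sim.max(axis=1)               # best-matching token per patch
    return 1.0 / (1.0 + np.exp(-scale * score))

def reweight_patches(patch_tokens, focus):
    """Scale each image token by its focus weight (per-patch reweighting)."""
    return patch_tokens * focus[:, None]

def is_ready(focus, tau):
    """HYPOTHETICAL readiness rule: act once the peak focus reaches tau."""
    return bool(focus.max() >= tau)
```

In a streaming setting, one would recompute `focus_map` each time the prompt prefix grows and start acting once `is_ready` returns `True`; how readiness is actually scored and trained in Premover is not described in the abstract.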
CO · Nov 12, 2021
Sampling from high-dimensional, multimodal distributions using automatically tuned, tempered Hamiltonian Monte Carlo
Joonha Park
Hamiltonian Monte Carlo (HMC) is widely used for sampling from high-dimensional target distributions with densities known up to proportionality. While HMC exhibits favorable scaling properties in high dimensions, it struggles with strongly multimodal distributions. Tempering methods are commonly used to address multimodality, but they can be difficult to tune, especially in high-dimensional settings. In this study, we propose a method that combines tempering with HMC to enable efficient sampling from high-dimensional, strongly multimodal distributions. Our approach simulates the dynamics of a time-varying Hamiltonian in which the temperature increases and then decreases over time. In the first phase, the simulated trajectory gradually explores low-density regions farther from the mode; in the second phase, it is guided back toward a local mode. We develop efficient tuning strategies based on a time-scale transformation under which the Hamiltonian becomes approximately stationary. This leads to a tempered Hamiltonian Monte Carlo (THMC) algorithm with automatic tuning. We demonstrate numerically that our method scales more effectively with dimension than adaptive parallel tempering and tempered sequential Monte Carlo. Finally, we apply THMC to sample from strongly multimodal posterior distributions arising in Bayesian inference.
CO · Jul 15, 2019
Markov chain Monte Carlo algorithms with sequential proposals
Joonha Park, Yves F. Atchadé
We explore a general framework in Markov chain Monte Carlo (MCMC) sampling where sequential proposals are tried as candidates for the next state of the Markov chain. This sequential-proposal framework can be applied to various existing MCMC methods, including Metropolis-Hastings algorithms using random proposals and methods that use deterministic proposals such as Hamiltonian Monte Carlo (HMC) or the bouncy particle sampler. Under certain circumstances, sequential-proposal MCMC methods construct the same Markov chains as those constructed by the delayed rejection method. In the context of HMC, the sequential-proposal approach has been proposed as extra chance generalized hybrid Monte Carlo (XCGHMC). We develop two novel methods in which the trajectories leading to proposals in HMC are automatically tuned to avoid doubling back, as in the No-U-Turn sampler (NUTS). The numerical efficiency of these new methods compares favorably with that of NUTS. We additionally show that the sequential-proposal bouncy particle sampler enables the constructed Markov chain to pass through regions of low target density and thus facilitates better mixing of the chain when the target density is multimodal.
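To make the idea of trying sequential proposals concrete, here is a minimal sketch for the simplest case: a one-dimensional random-walk Metropolis sampler with a symmetric Gaussian proposal. It assumes one particular form of the scheme, in which a single uniform draw is shared across all proposals in an iteration and each new proposal is generated from the previously rejected one. The function name, its parameters, and the choice of `max_proposals` are illustrative and not taken from the paper.

```python
import numpy as np

def sequential_proposal_rwm(log_target, x0, n_iter, max_proposals=5,
                            step=1.0, rng=None):
    """Random-walk Metropolis with up to `max_proposals` sequential proposals.

    ASSUMPTIONS (illustrative): symmetric Gaussian proposals, so the
    proposal densities cancel and the acceptance test reduces to
    u < pi(y_n) / pi(x), with one uniform u shared by all proposals.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = float(x0)
    chain = np.empty(n_iter)
    for i in range(n_iter):
        log_u = np.log(rng.uniform())      # one uniform per iteration
        y = x
        for _ in range(max_proposals):
            y = y + step * rng.normal()    # next proposal starts from the last
            if log_u < log_target(y) - log_target(x):
                x = y                      # accept the first acceptable proposal
                break
        # If every proposal was rejected, the chain stays at x.
        chain[i] = x
    return chain
```

For example, `sequential_proposal_rwm(lambda z: -0.5 * z * z, 0.0, 20000)` targets a standard normal distribution. With `max_proposals=1` the sketch reduces to ordinary random-walk Metropolis.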