ROApr 28
Variable Elimination in Hybrid Factor Graphs for Discrete-Continuous Inference & EstimationVarun Agrawal, Frank Dellaert
Many problems in robotics involve both continuous and discrete components, and modeling them together for estimation tasks has been a long standing and difficult problem. Hybrid Factor Graphs give us a mathematical framework to model these types of problems, however existing approaches for solving them are based on approximations. In this work, we propose a new framework for hybrid factor graphs along with a novel variable elimination algorithm to produce a hybrid Bayes network, which can be used for exact Maximum A Posteriori estimation and marginalization over both sets of variables. Our approach first develops a novel hybrid Gaussian factor which can connect to both discrete and continuous variables, and a hybrid conditional which can represent multiple continuous hypotheses conditioned on the discrete variables. Using these representations, we derive the process of hybrid variable elimination under the Conditional Linear Gaussian scheme, giving us exact posteriors as a hybrid Bayes network. To bound the number of discrete hypotheses, we use a tree-structured representation of the factors coupled with a simple pruning and probabilistic assignment scheme, which allows for tractable inference. We demonstrate the applicability of our framework on a large scale SLAM dataset and a real world pose graph optimization problem, both with ambiguous measurements which require discrete choices to be made for the most likely measurements. Our demonstrated results showcase the accuracy, generality, and simplicity of our hybrid factor graph framework.
ROMay 21
Four Simple Proprioceptive Estimators for Legged RobotsFrank Dellaert, Chiyun Noh, Varun Agrawal et al.
Legged robots carry an IMU, but the inertial solution drifts because consumer-grade IMUs are noisy. However, the feet create intermittent contacts with the environment that can be used to mitigate that drift. This report develops a sequence of increasingly expressive legged robot state estimators that leverage this. In all cases, the floating-base state comprises attitude, position, velocity, and IMU biases. To model foot contacts, we start from the contact-aided invariant EKF of Hartley et al., albeit at a reduced contact update rate. This is then augmented by replacing the measurement update by a small factor graph. Finally, we turn the same factors into a fixed-lag smoother with contact-episode footholds, with and without an evolving IMU bias. To facilitate reproducibility and further research in proprioceptive legged odometry, all four variants are available in GTSAM (Dellaert et. al), and we additionally provide a ROS2-compatible implementation.
CVNov 20, 2021
VideoPose: Estimating 6D object pose from videosApoorva Beedu, Zhile Ren, Varun Agrawal et al.
We introduce a simple yet effective algorithm that uses convolutional neural networks to directly estimate object poses from videos. Our approach leverages the temporal information from a video sequence, and is computationally efficient and robust to support robotic and AR domains. Our proposed network takes a pre-trained 2D object detector as input, and aggregates visual features through a recurrent neural network to make predictions at each frame. Experimental evaluation on the YCB-Video dataset show that our approach is on par with the state-of-the-art algorithms. Further, with a speed of 30 fps, it is also more efficient than the state-of-the-art, and therefore applicable to a variety of applications that require real-time object pose estimation.
ROMar 24, 2021
iMHS: An Incremental Multi-Hypothesis SmootherFan Jiang, Varun Agrawal, Russell Buchanan et al.
State estimation of multi-modal hybrid systems is an important problem with many applications in the field robotics. However, incorporating discrete modes in the estimation process is hampered by a potentially combinatorial growth in computation. In this paper we present a novel incremental multi-hypothesis smoother based on eliminating a hybrid factor graph into a multi-hypothesis Bayes tree, which represents possible discrete state sequence hypotheses. Following iSAM, we enable incremental inference by conditioning the past on the future but we add to that the capability of maintaining multiple discrete mode histories, exploiting the temporal structure of the problem to obtain a simplified representation that unifies the multiple hypothesis tree with the Bayes tree. In the results section we demonstrate the generality of the algorithm with examples in three problem domains: lane change detection (1D), aircraft maneuver detection (2D), and contact detection in legged robots (3D).
ROMar 22, 2021
Continuous-time State & Dynamics Estimation using a Pseudo-Spectral ParameterizationVarun Agrawal, Frank Dellaert
We present a novel continuous time trajectory representation based on a Chebyshev polynomial basis, which when governed by known dynamics models, allows for full trajectory and robot dynamics estimation, particularly useful for high-performance robotics applications such as unmanned aerial vehicles. We show that we can gracefully incorporate model dynamics to our trajectory representation, within a factor-graph based framework, and leverage ideas from pseudo-spectral optimal control to parameterize the state and the control trajectories as interpolating polynomials. This allows us to perform efficient optimization at specifically chosen points derived from the theory, while recovering full trajectory estimates. Through simulated experiments we demonstrate the applicability of our representation for accurate flight dynamics estimation for multirotor aerial vehicles. The representation framework is general and can thus be applied to a multitude of high-performance applications beyond multirotor platforms.
CVSep 11, 2018
Unbiasing Semantic Segmentation For Robot Perception using Synthetic Data Feature TransferJonathan C Balloch, Varun Agrawal, Irfan Essa et al.
Robot perception systems need to perform reliable image segmentation in real-time on noisy, raw perception data. State-of-the-art segmentation approaches use large CNN models and carefully constructed datasets; however, these models focus on accuracy at the cost of real-time inference. Furthermore, the standard semantic segmentation datasets are not large enough for training CNNs without augmentation and are not representative of noisy, uncurated robot perception data. We propose improving the performance of real-time segmentation frameworks on robot perception data by transferring features learned from synthetic segmentation data. We show that pretraining real-time segmentation architectures with synthetic segmentation data instead of ImageNet improves fine-tuning performance by reducing the bias learned in pretraining and closing the \textit{transfer gap} as a result. Our experiments show that our real-time robot perception models pretrained on synthetic data outperform those pretrained on ImageNet for every scale of fine-tuning data examined. Moreover, the degree to which synthetic pretraining outperforms ImageNet pretraining increases as the availability of robot data decreases, making our approach attractive for robotics domains where dataset collection is hard and/or expensive.
CVJun 9, 2017
TextureGAN: Controlling Deep Image Synthesis with Texture PatchesWenqi Xian, Patsorn Sangkloy, Varun Agrawal et al.
In this paper, we investigate deep image synthesis guided by sketch, color, and texture. Previous image synthesis methods can be controlled by sketch and color strokes but we are the first to examine texture control. We allow a user to place a texture patch on a sketch at arbitrary locations and scales to control the desired output texture. Our generative network learns to synthesize objects consistent with these texture suggestions. To achieve this, we develop a local texture loss in addition to adversarial and content loss to train the generative network. We conduct experiments using sketches generated from real images and textures sampled from a separate texture database and results show that our proposed algorithm is able to generate plausible images that are faithful to user controls. Ablation studies show that our proposed pipeline can generate more realistic images than adapting existing methods directly.