ROApr 3, 2022Code
Proactive Anomaly Detection for Robot Navigation with Multi-Sensor FusionTianchen Ji, Arun Narenthiran Sivakumar, Girish Chowdhary et al.
Despite the rapid advancement of navigation algorithms, mobile robots often produce anomalous behaviors that can lead to navigation failures. The ability to detect such anomalous behaviors is a key component in modern robots to achieve high-levels of autonomy. Reactive anomaly detection methods identify anomalous task executions based on the current robot state and thus lack the ability to alert the robot before an actual failure occurs. Such an alert delay is undesirable due to the potential damage to both the robot and the surrounding objects. We propose a proactive anomaly detection network (PAAD) for robot navigation in unstructured and uncertain environments. PAAD predicts the probability of future failure based on the planned motions from the predictive controller and the current observation from the perception module. Multi-sensor signals are fused effectively to provide robust anomaly detection in the presence of sensor occlusion as seen in field environments. Our experiments on field robot data demonstrates superior failure identification performance than previous methods, and that our model can capture anomalous behaviors in real-time while maintaining a low false detection rate in cluttered fields. Code, dataset, and video are available at https://github.com/tianchenji/PAAD
LGDec 7, 2022Code
Dynamic Graph Node Classification via Time AugmentationJiarui Sun, Mengting Gu, Chin-Chia Michael Yeh et al.
Node classification for graph-structured data aims to classify nodes whose labels are unknown. While studies on static graphs are prevalent, few studies have focused on dynamic graph node classification. Node classification on dynamic graphs is challenging for two reasons. First, the model needs to capture both structural and temporal information, particularly on dynamic graphs with a long history and require large receptive fields. Second, model scalability becomes a significant concern as the size of the dynamic graph increases. To address these problems, we propose the Time Augmented Dynamic Graph Neural Network (TADGNN) framework. TADGNN consists of two modules: 1) a time augmentation module that captures the temporal evolution of nodes across time structurally, creating a time-augmented spatio-temporal graph, and 2) an information propagation module that learns the dynamic representations for each node across time using the constructed time-augmented graph. We perform node classification experiments on four dynamic graph benchmarks. Experimental results demonstrate that TADGNN framework outperforms several static and dynamic state-of-the-art (SOTA) GNN models while demonstrating superior scalability. We also conduct theoretical and empirical analyses to validate the efficiency of the proposed method. Our code is available at https://sites.google.com/view/tadgnn.
SYJan 16, 2018
Active Sampling-based Binary Verification of Dynamical SystemsJohn F. Quindlen, Ufuk Topcu, Girish Chowdhary et al.
Nonlinear, adaptive, or otherwise complex control techniques are increasingly relied upon to ensure the safety of systems operating in uncertain environments. However, the nonlinearity of the resulting closed-loop system complicates verification that the system does in fact satisfy those requirements at all possible operating conditions. While analytical proof-based techniques and finite abstractions can be used to provably verify the closed-loop system's response at different operating conditions, they often produce conservative approximations due to restrictive assumptions and are difficult to construct in many applications. In contrast, popular statistical verification techniques relax the restrictions and instead rely upon simulations to construct statistical or probabilistic guarantees. This work presents a data-driven statistical verification procedure that instead constructs statistical learning models from simulated training data to separate the set of possible perturbations into "safe" and "unsafe" subsets. Binary evaluations of closed-loop system requirement satisfaction at various realizations of the uncertainties are obtained through temporal logic robustness metrics, which are then used to construct predictive models of requirement satisfaction over the full set of possible uncertainties. As the accuracy of these predictive statistical models is inherently coupled to the quality of the training data, an active learning algorithm selects additional sample points in order to maximize the expected change in the data-driven model and thus, indirectly, minimize the prediction error. Various case studies demonstrate the closed-loop verification procedure and highlight improvements in prediction error over both existing analytical and statistical verification techniques.
ROMar 22, 2022
WayFAST: Navigation with Predictive Traversability in the FieldMateus Valverde Gasparino, Arun Narenthiran Sivakumar, Yixiao Liu et al.
We present a self-supervised approach for learning to predict traversable paths for wheeled mobile robots that require good traction to navigate. Our algorithm, termed WayFAST (Waypoint Free Autonomous Systems for Traversability), uses RGB and depth data, along with navigation experience, to autonomously generate traversable paths in outdoor unstructured environments. Our key inspiration is that traction can be estimated for rolling robots using kinodynamic models. Using traction estimates provided by an online receding horizon estimator, we are able to train a traversability prediction neural network in a self-supervised manner, without requiring heuristics utilized by previous methods. We demonstrate the effectiveness of WayFAST through extensive field testing in varying environments, ranging from sandy dry beaches to forest canopies and snow covered grass fields. Our results clearly demonstrate that WayFAST can learn to avoid geometric obstacles as well as untraversable terrain, such as snow, which would be difficult to avoid with sensors that provide only geometric data, such as LiDAR. Furthermore, we show that our training pipeline based on online traction estimates is more data-efficient than other heuristic-based methods.
SYOct 1, 2017
Active Sampling for Constrained Simulation-based Verification of Uncertain Nonlinear SystemsJohn F. Quindlen, Ufuk Topcu, Girish Chowdhary et al.
Increasingly demanding performance requirements for dynamical systems motivates the adoption of nonlinear and adaptive control techniques. One challenge is the nonlinearity of the resulting closed-loop system complicates verification that the system does satisfy the requirements at all possible operating conditions. This paper presents a data-driven procedure for efficient simulation-based, statistical verification without the reliance upon exhaustive simulations. In contrast to previous work, this approach introduces a method for online estimation of prediction accuracy without the use of external validation sets. This work also develops a novel active sampling algorithm that iteratively selects additional training points in order to maximize the accuracy of the predictions while still limited to a sample budget. Three case studies demonstrate the utility of the new approach and the results show up to a 50% improvement over state-of-the-art techniques.
SYOct 1, 2017
Closed-Loop Statistical Verification of Stochastic Nonlinear Systems Subject to Parametric UncertaintiesJohn F. Quindlen, Ufuk Topcu, Girish Chowdhary et al.
This paper proposes a statistical verification framework using Gaussian processes (GPs) for simulation-based verification of stochastic nonlinear systems with parametric uncertainties. Given a small number of stochastic simulations, the proposed framework constructs a GP regression model and predicts the system's performance over the entire set of possible uncertainties. Included in the framework is a new metric to estimate the confidence in those predictions based on the variance of the GP's cumulative distribution function. This variance-based metric forms the basis of active sampling algorithms that aim to minimize prediction error through careful selection of simulations. In three case studies, the new active sampling algorithms demonstrate up to a 35% improvement in prediction error over other approaches and are able to correctly identify regions with low prediction confidence through the variance metric.
CVNov 21, 2022
Last-Mile Embodied Visual NavigationJustin Wasserman, Karmesh Yadav, Girish Chowdhary et al.
Realistic long-horizon tasks like image-goal navigation involve exploratory and exploitative phases. Assigned with an image of the goal, an embodied agent must explore to discover the goal, i.e., search efficiently using learned priors. Once the goal is discovered, the agent must accurately calibrate the last-mile of navigation to the goal. As with any robust system, switches between exploratory goal discovery and exploitative last-mile navigation enable better recovery from errors. Following these intuitive guide rails, we propose SLING to improve the performance of existing image-goal navigation systems. Entirely complementing prior methods, we focus on last-mile navigation and leverage the underlying geometric structure of the problem with neural descriptors. With simple but effective switches, we can easily connect SLING with heuristic, reinforcement learning, and neural modular policies. On a standardized image-goal navigation benchmark (Hahn et al. 2021), we improve performance across policies, scenes, and episode complexity, raising the state-of-the-art from 45% to 55% success rate. Beyond photorealistic simulation, we conduct real-robot experiments in three physical scenes and find these improvements to transfer well to real environments.
CLAug 25, 2022
Multimedia Generative Script Learning for Task PlanningQingyun Wang, Manling Li, Hou Pong Chan et al.
Goal-oriented generative script learning aims to generate subsequent steps to reach a particular goal, which is an essential task to assist robots or humans in performing stereotypical activities. An important aspect of this process is the ability to capture historical states visually, which provides detailed information that is not covered by text and will guide subsequent steps. Therefore, we propose a new task, Multimedia Generative Script Learning, to generate subsequent steps by tracking historical states in both text and vision modalities, as well as presenting the first benchmark containing 5,652 tasks and 79,089 multimedia steps. This task is challenging in three aspects: the multimedia challenge of capturing the visual states in images, the induction challenge of performing unseen tasks, and the diversity challenge of covering different information in individual steps. We propose to encode visual state changes through a selective multimedia encoder to address the multimedia challenge, transfer knowledge from previously observed tasks using a retrieval-augmented decoder to overcome the induction challenge, and further present distinct information at each step by optimizing a diversity-oriented contrastive learning objective. We define metrics to evaluate both generation and inductive quality. Experiment results demonstrate that our approach significantly outperforms strong baselines.
SYFeb 21, 2019
Hybrid Direct-Indirect Adaptive Control of Nonlinear System with Unmatched UncertaintyGirish Joshi, Girish Chowdhary
In this paper, we present a hybrid direct-indirect model reference adaptive controller (MRAC), to address a class of problems with matched and unmatched uncertainties. In the proposed architecture, the unmatched uncertainty is estimated online through a companion observer model. Upon convergence of the observer, the unmatched uncertainty estimate is remodeled into a state dependent linear form to augment the nominal system dynamics. Meanwhile, a direct adaptive controller designed for a switching system cancels the effect of matched uncertainty in the system and achieves reference model tracking. We demonstrate that the proposed hybrid controller can handle a broad class of nonlinear systems with both matched and unmatched uncertainties
LGApr 22, 2023
Unmatched uncertainty mitigation through neural network supported model predictive controlMateus V. Gasparino, Prabhat K. Mishra, Girish Chowdhary
This paper presents a deep learning based model predictive control (MPC) algorithm for systems with unmatched and bounded state-action dependent uncertainties of unknown structure. We utilize a deep neural network (DNN) as an oracle in the underlying optimization problem of learning based MPC (LBMPC) to estimate unmatched uncertainties. Generally, non-parametric oracles such as DNN are considered difficult to employ with LBMPC due to the technical difficulties associated with estimation of their coefficients in real time. We employ a dual-timescale adaptation mechanism, where the weights of the last layer of the neural network are updated in real time while the inner layers are trained on a slower timescale using the training data collected online and selectively stored in a buffer. Our results are validated through a numerical experiment on the compression system model of jet engine. These results indicate that the proposed approach is implementable in real time and carries the theoretical guarantees of LBMPC.
ROJan 5, 2023
Reinforcement Learning-Based Air Traffic DeconflictionDenis Osipychev, Dragos Margineantu, Girish Chowdhary
Remain Well Clear, keeping the aircraft away from hazards by the appropriate separation distance, is an essential technology for the safe operation of uncrewed aerial vehicles in congested airspace. This work focuses on automating the horizontal separation of two aircraft and presents the obstacle avoidance problem as a 2D surrogate optimization task. By our design, the surrogate task is made more conservative to guarantee the execution of the solution in the primary domain. Using Reinforcement Learning (RL), we optimize the avoidance policy and model the dynamics, interactions, and decision-making. By recursively sampling the resulting policy and the surrogate transitions, the system translates the avoidance policy into a complete avoidance trajectory. Then, the solver publishes the trajectory as a set of waypoints for the airplane to follow using the Robot Operating System (ROS) interface. The proposed system generates a quick and achievable avoidance trajectory that satisfies the safety requirements. Evaluation of our system is completed in a high-fidelity simulation and full-scale airplane demonstration. Moreover, the paper concludes an enormous integration effort that has enabled a real-life demonstration of the RL-based system.
CVNov 6, 2023
Exploitation-Guided Exploration for Semantic Embodied NavigationJustin Wasserman, Girish Chowdhary, Abhinav Gupta et al.
In the recent progress in embodied navigation and sim-to-robot transfer, modular policies have emerged as a de facto framework. However, there is more to compositionality beyond the decomposition of the learning load into modular components. In this work, we investigate a principled way to syntactically combine these components. Particularly, we propose Exploitation-Guided Exploration (XGX) where separate modules for exploration and exploitation come together in a novel and intuitive manner. We configure the exploitation module to take over in the deterministic final steps of navigation i.e. when the goal becomes visible. Crucially, an exploitation module teacher-forces the exploration module and continues driving an overridden policy optimization. XGX, with effective decomposition and novel guidance, improves the state-of-the-art performance on the challenging object navigation task from 70% to 73%. Along with better accuracy, through targeted analysis, we show that XGX is also more efficient at goal-conditioned exploration. Finally, we show sim-to-real transfer to robot hardware and XGX performs over two-fold better than the best baseline from simulation benchmarking. Project page: xgxvisnav.github.io
LGSep 26, 2023
Revealing the Power of Masked Autoencoders in Traffic ForecastingJiarui Sun, Yujie Fan, Chin-Chia Michael Yeh et al.
Traffic forecasting, crucial for urban planning, requires accurate predictions of spatial-temporal traffic patterns across urban areas. Existing research mainly focuses on designing complex models that capture spatial-temporal dependencies among variables explicitly. However, this field faces challenges related to data scarcity and model stability, which results in limited performance improvement. To address these issues, we propose Spatial-Temporal Masked AutoEncoders (STMAE), a plug-and-play framework designed to enhance existing spatial-temporal models on traffic prediction. STMAE consists of two learning stages. In the pretraining stage, an encoder processes partially visible traffic data produced by a dual-masking strategy, including biased random walk-based spatial masking and patch-based temporal masking. Subsequently, two decoders aim to reconstruct the masked counterparts from both spatial and temporal perspectives. The fine-tuning stage retains the pretrained encoder and integrates it with decoders from existing backbones to improve forecasting accuracy. Our results on traffic benchmarks show that STMAE can largely enhance the forecasting capabilities of various spatial-temporal models.
CVSep 2, 2024Code
MOOSS: Mask-Enhanced Temporal Contrastive Learning for Smooth State Evolution in Visual Reinforcement LearningJiarui Sun, M. Ugur Akcal, Wei Zhang et al.
In visual Reinforcement Learning (RL), learning from pixel-based observations poses significant challenges on sample efficiency, primarily due to the complexity of extracting informative state representations from high-dimensional data. Previous methods such as contrastive-based approaches have made strides in improving sample efficiency but fall short in modeling the nuanced evolution of states. To address this, we introduce MOOSS, a novel framework that leverages a temporal contrastive objective with the help of graph-based spatial-temporal masking to explicitly model state evolution in visual RL. Specifically, we propose a self-supervised dual-component strategy that integrates (1) a graph construction of pixel-based observations for spatial-temporal masking, coupled with (2) a multi-level contrastive learning mechanism that enriches state representations by emphasizing temporal continuity and change of states. MOOSS advances the understanding of state dynamics by disrupting and learning from spatial-temporal correlations, which facilitates policy learning. Our comprehensive evaluation on multiple continuous and discrete control benchmarks shows that MOOSS outperforms previous state-of-the-art visual RL methods in terms of sample efficiency, demonstrating the effectiveness of our method. Our code is released at https://github.com/jsun57/MOOSS.
31.2MAMar 16
S2Act: Simple Spiking ActorUgur Akcal, Seung Hyun Kim, Mikihisa Yuasa et al.
Spiking neural networks (SNNs) and biologically-inspired learning mechanisms are attractive in mobile robotics, where the size and performance of onboard neural network policies are constrained by power and computational budgets. Existing SNN approaches, such as population coding, reward modulation, and hybrid artificial neural network (ANN)-SNN architectures, have shown promising results; however, they face challenges in complex, highly stochastic environments due to SNN sensitivity to hyperparameters and inconsistent gradient signals. To address these challenges, we propose simple spiking actor (S2Act), a computationally lightweight framework that deploys an RL policy using an SNN in three steps: (1) architect an actor-critic model based on an approximated network of rate-based spiking neurons, (2) train the network with gradients using compatible activation functions, and (3) transfer the trained weights into physical parameters of rate-based leaky integrate-and-fire (LIF) neurons for inference and deployment. By globally shaping LIF neuron parameters such that their rate-based responses approximate ReLU activations, S2Act effectively mitigates the vanishing gradient problem, while pre-constraining LIF response curves reduces reliance on complex SNN-specific hyperparameter tuning. We demonstrate our method in two multi-agent stochastic environments (capture-the-flag and parking) that capture the complexity of multi-robot interactions, and deploy our trained policies on physical TurtleBot platforms using Intel's Loihi neuromorphic hardware. Our experimental results show that S2Act outperforms relevant baselines in task performance and real-time inference in nearly all considered scenarios, highlighting its potential for rapid prototyping and efficient real-world deployment of SNN-based RL policies.
79.4ROMar 24
CATNAV: Cached Vision-Language Traversability for Efficient Zero-Shot Robot NavigationAditya Potnis, Francisco Affonso, Shreya Gummadi et al.
Navigating unstructured environments requires assessing traversal risk relative to a robot's physical capabilities, a challenge that varies across embodiments. We present CATNAV, a cost-aware traversability navigation framework that leverages multimodal LLMs for zero-shot, embodiment-aware costmap generation without task-specific training. We introduce a visuosemantic caching mechanism that detects scene novelty and reuses prior risk assessments for semantically similar frames, reducing online VLM queries by 85.7%. Furthermore, we introduce a VLM-based trajectory selection module that evaluates proposals through visual reasoning to choose the safest path given behavioral constraints. We evaluate CATNAV on a quadruped robot across indoor and outdoor unstructured environments, comparing against state-of-the-art vision-language-action baselines. Across five navigation tasks, CATNAV achieves 10 percentage point higher average goal-reaching rate and 33% fewer behavioral constraint violations.
56.7ROMar 22
HyReach: Vision-Guided Hybrid Manipulator Reaching in Unseen Cluttered EnvironmentsShivani Kamtikar, Kendall Koe, Justin Wasserman et al.
As robotic systems increasingly operate in unstructured, cluttered, and previously unseen environments, there is a growing need for manipulators that combine compliance, adaptability, and precise control. This work presents a real-time hybrid rigid-soft continuum manipulator system designed for robust open-world object reaching in such challenging environments. The system integrates vision-based perception and 3D scene reconstruction with shape-aware motion planning to generate safe trajectories. A learning-based controller drives the hybrid arm to arbitrary target poses, leveraging the flexibility of the soft segment while maintaining the precision of the rigid segment. The system operates without environment-specific retraining, enabling direct generalization to new scenes. Extensive real-world experiments demonstrate consistent reaching performance with errors below 2 cm across diverse cluttered setups, highlighting the potential of hybrid manipulators for adaptive and reliable operation in unstructured environments.
ROJul 23, 2024
PlantTrack: Task-Driven Plant Keypoint Tracking with Zero-Shot Sim2Real TransferSamhita Marri, Arun N. Sivakumar, Naveen K. Uppalapati et al.
Tracking plant features is crucial for various agricultural tasks like phenotyping, pruning, or harvesting, but the unstructured, cluttered, and deformable nature of plant environments makes it a challenging task. In this context, the recent advancements in foundational models show promise in addressing this challenge. In our work, we propose PlantTrack where we utilize DINOv2 which provides high-dimensional features, and train a keypoint heatmap predictor network to identify the locations of semantic features such as fruits and leaves which are then used as prompts for point tracking across video frames using TAPIR. We show that with as few as 20 synthetic images for training the keypoint predictor, we achieve zero-shot Sim2Real transfer, enabling effective tracking of plant features in real environments.
CVMay 21, 2023Code
CoMusion: Towards Consistent Stochastic Human Motion Prediction via Motion DiffusionJiarui Sun, Girish Chowdhary
Stochastic Human Motion Prediction (HMP) aims to predict multiple possible future human pose sequences from observed ones. Most prior works learn motion distributions through encoding-decoding in the latent space, which does not preserve motion's spatial-temporal structure. While effective, these methods often require complex, multi-stage training and yield predictions that are inconsistent with the provided history and can be physically unrealistic. To address these issues, we propose CoMusion, a single-stage, end-to-end diffusion-based stochastic HMP framework. CoMusion is inspired from the insight that a smooth future pose initialization improves prediction performance, a strategy not previously utilized in stochastic models but evidenced in deterministic works. To generate such initialization, CoMusion's motion predictor starts with a Transformer-based network for initial reconstruction of corrupted motion. Then, a graph convolutional network (GCN) is employed to refine the prediction considering past observations in the discrete cosine transformation (DCT) space. Our method, facilitated by the Transformer-GCN module design and a proposed variance scheduler, excels in predicting accurate, realistic, and consistent motions, while maintaining appropriate diversity. Experimental results on benchmark datasets demonstrate that CoMusion surpasses prior methods across metrics, while demonstrating superior generation quality. Our Code is released at https://github.com/jsun57/CoMusion/ .
ROSep 17, 2020Code
Elastica: A compliant mechanics environment for soft robotic controlNoel Naughton, Jiarui Sun, Arman Tekinalp et al.
Soft robots are notoriously hard to control. This is partly due to the scarcity of models able to capture their complex continuum mechanics, resulting in a lack of control methodologies that take full advantage of body compliance. Currently available simulation methods are either too computational demanding or overly simplistic in their physical assumptions, leading to a paucity of available simulation resources for developing such control schemes. To address this, we introduce Elastica, a free, open-source simulation environment for soft, slender rods that can bend, twist, shear and stretch. We demonstrate how Elastica can be coupled with five state-of-the-art reinforcement learning algorithms to successfully control a soft, compliant robotic arm and complete increasingly challenging tasks.
84.8ROMay 8
Failing Forward: Adaptive Failure-Informed Learning for Vision-Language-Action ModelsMeng Zheng, Samhita Marri, Anwesa Choudhuri et al.
Vision-language-action (VLA) models provide a promising paradigm for scalable robotic manipulation, yet their reliance on success-only behavioral cloning leaves them brittle; lacking corrective training signals, minor execution errors rapidly compound into unrecoverable, out-of-distribution failures. To address this limitation, we propose Adaptive Failure-Informed Learning (AFIL), an end-to-end framework that leverages failure trajectories as adaptive negative guidance for diffusion- and flow-based VLA policies. AFIL uses a pretrained VLA to generate failure rollouts online, avoiding the need for handcrafted failure-mode design or human-in-the-loop recovery. It then jointly trains Dual Action Generators (DAGs) for successful and failed behaviors while sharing a common vision-language backbone, enabling efficient failure-aware policy learning with limited parameter overhead. During sampling, the failure generator adaptively steers action generation away from failure-prone regions and toward more reliable success modes, with guidance strength determined by the per-diffusion-step distance between success and failure distributions. Experiments across in-domain and out-of-domain robotic manipulation tasks, covering both short- and long-horizon settings, show that AFIL consistently improves task success rates and robustness over existing VLA baselines, demonstrating its effectiveness, efficiency, and generality.
SDFeb 15, 2024
DeepSRGM -- Sequence Classification and Ranking in Indian Classical Music with Deep LearningSathwik Tejaswi Madhusudhan, Girish Chowdhary
A vital aspect of Indian Classical Music (ICM) is Raga, which serves as a melodic framework for compositions and improvisations alike. Raga Recognition is an important music information retrieval task in ICM as it can aid numerous downstream applications ranging from music recommendations to organizing huge music collections. In this work, we propose a deep learning based approach to Raga recognition. Our approach employs efficient pre possessing and learns temporal sequences in music data using Long Short Term Memory based Recurrent Neural Networks (LSTM-RNN). We train and test the network on smaller sequences sampled from the original audio while the final inference is performed on the audio as a whole. Our method achieves an accuracy of 88.1% and 97 % during inference on the Comp Music Carnatic dataset and its 10 Raga subset respectively making it the state-of-the-art for the Raga recognition task. Our approach also enables sequence ranking which aids us in retrieving melodic patterns from a given music data base that are closely related to the presented query sequence.
IVMay 22, 2024
Hyperspectral Image Reconstruction for Predicting Chick Embryo Mortality Towards Advancing Egg and Hatchery IndustryMd. Toukir Ahmed, Md Wadud Ahmed, Ocean Monjur et al.
As the demand for food surges and the agricultural sector undergoes a transformative shift towards sustainability and efficiency, the need for precise and proactive measures to ensure the health and welfare of livestock becomes paramount. In the context of the broader agricultural landscape outlined, the application of Hyperspectral Imaging (HSI) takes on profound significance. HSI has emerged as a cutting-edge, non-destructive technique for fast and accurate egg quality analysis, including the detection of chick embryo mortality. However, the high cost and operational complexity compared to conventional RGB imaging are significant bottlenecks in the widespread adoption of HSI technology. To overcome these hurdles and unlock the full potential of HSI, a promising solution is hyperspectral image reconstruction from standard RGB images. This study aims to reconstruct hyperspectral images from RGB images for non-destructive early prediction of chick embryo mortality. Firstly, the performance of different image reconstruction algorithms, such as HRNET, MST++, Restormer, and EDSR were compared to reconstruct the hyperspectral images of the eggs in the early incubation period. Later, the reconstructed spectra were used to differentiate live from dead chick-producing eggs using the XGBoost and Random Forest classification methods. Among the reconstruction methods, HRNET showed impressive reconstruction performance with MRAE of 0.0955, RMSE of 0.0159, and PSNR of 36.79 dB. This study motivated that harnessing imaging technology integrated with smart sensors and data analytics has the potential to improve automation, enhance biosecurity, and optimize resource management towards sustainable agriculture 4.0.
RONov 6, 2024
Fed-EC: Bandwidth-Efficient Clustering-Based Federated Learning For Autonomous Visual Robot NavigationShreya Gummadi, Mateus V. Gasparino, Deepak Vasisht et al.
Centralized learning requires data to be aggregated at a central server, which poses significant challenges in terms of data privacy and bandwidth consumption. Federated learning presents a compelling alternative, however, vanilla federated learning methods deployed in robotics aim to learn a single global model across robots that works ideally for all. But in practice one model may not be well suited for robots deployed in various environments. This paper proposes Federated-EmbedCluster (Fed-EC), a clustering-based federated learning framework that is deployed with vision based autonomous robot navigation in diverse outdoor environments. The framework addresses the key federated learning challenge of deteriorating model performance of a single global model due to the presence of non-IID data across real-world robots. Extensive real-world experiments validate that Fed-EC reduces the communication size by 23x for each robot while matching the performance of centralized learning for goal-oriented navigation and outperforms local learning. Fed-EC can transfer previously learnt models to new robots that join the cluster.
CVApr 4, 2024
OW-VISCapTor: Abstractors for Open-World Video Instance Segmentation and CaptioningAnwesa Choudhuri, Girish Chowdhary, Alexander G. Schwing
We propose the new task 'open-world video instance segmentation and captioning'. It requires to detect, segment, track and describe with rich captions never before seen objects. This challenging task can be addressed by developing "abstractors" which connect a vision model and a language foundation model. Concretely, we connect a multi-scale visual feature extractor and a large language model (LLM) by developing an object abstractor and an object-to-text abstractor. The object abstractor, consisting of a prompt encoder and transformer blocks, introduces spatially-diverse open-world object queries to discover never before seen objects in videos. An inter-query contrastive loss further encourages the diversity of object queries. The object-to-text abstractor is augmented with masked cross-attention and acts as a bridge between the object queries and a frozen LLM to generate rich and descriptive object-centric captions for each detected object. Our generalized approach surpasses the baseline that jointly addresses the tasks of open-world video instance segmentation and dense video object captioning by 13% on never before seen objects, and by 10% on object-centric captions.
ROApr 26, 2024
Lessons from Deploying CropFollow++: Under-Canopy Agricultural Navigation with KeypointsArun N. Sivakumar, Mateus V. Gasparino, Michael McGuire et al.
We present a vision-based navigation system for under-canopy agricultural robots using semantic keypoints. Autonomous under-canopy navigation is challenging due to the tight spacing between the crop rows ($\sim 0.75$ m), degradation in RTK-GPS accuracy due to multipath error, and noise in LiDAR measurements from the excessive clutter. Our system, CropFollow++, introduces modular and interpretable perception architecture with a learned semantic keypoint representation. We deployed CropFollow++ in multiple under-canopy cover crop planting robots on a large scale (25 km in total) in various field conditions and we discuss the key lessons learned from this.
ROFeb 4, 2024
Gazebo Plants: Simulating Plant-Robot Interaction with Cosserat RodsJunchen Deng, Samhita Marri, Jonathan Klein et al.
Robotic harvesting has the potential to positively impact agricultural productivity, reduce costs, improve food quality, enhance sustainability, and to address labor shortage. In the rapidly advancing field of agricultural robotics, the necessity of training robots in a virtual environment has become essential. Generating training data to automatize the underlying computer vision tasks such as image segmentation, object detection and classification, also heavily relies on such virtual environments as synthetic data is often required to overcome the shortage and lack of variety of real data sets. However, physics engines commonly employed within the robotics community, such as ODE, Simbody, Bullet, and DART, primarily support motion and collision interaction of rigid bodies. This inherent limitation hinders experimentation and progress in handling non-rigid objects such as plants and crops. In this contribution, we present a plugin for the Gazebo simulation platform based on Cosserat rods to model plant motion. It enables the simulation of plants and their interaction with the environment. We demonstrate that, using our plugin, users can conduct harvesting simulations in Gazebo by simulating a robotic arm picking fruits and achieve results comparable to real-world experiments.
ROSep 8, 2025
Learning to Walk with Less: a Dyna-Style Approach to Quadrupedal LocomotionFrancisco Affonso, Felipe Andrade G. Tommaselli, Juliano Negri et al.
Traditional RL-based locomotion controllers often suffer from low data efficiency, requiring extensive interaction to achieve robust performance. We present a model-based reinforcement learning (MBRL) framework that improves sample efficiency for quadrupedal locomotion by appending synthetic data to the end of standard rollouts in PPO-based controllers, following the Dyna-Style paradigm. A predictive model, trained alongside the policy, generates short-horizon synthetic transitions that are gradually integrated using a scheduling strategy based on the policy update iterations. Through an ablation study, we identified a strong correlation between sample efficiency and rollout length, which guided the design of our experiments. We validated our approach in simulation on the Unitree Go1 robot and showed that replacing part of the simulated steps with synthetic ones not only mimics extended rollouts but also improves policy return and reduces variance. Finally, we demonstrate that this improvement transfers to the ability to track a wide range of locomotion commands using fewer simulated steps.
LGJan 31, 2025
Physics-Informed Neural Network based Damage Identification for Truss Railroad BridgesAlthaf Shajihan, Kirill Mechitov, Girish Chowdhary et al.
Railroad bridges are a crucial component of the U.S. freight rail system, which moves over 40 percent of the nation's freight and plays a critical role in the economy. However, aging bridge infrastructure and increasing train traffic pose significant safety hazards and risk service disruptions. The U.S. rail network includes over 100,000 railroad bridges, averaging one every 1.4 miles of track, with steel bridges comprising over 50% of the network's total bridge length. Early identification and assessment of damage in these bridges remain challenging tasks. This study proposes a physics-informed neural network (PINN) based approach for damage identification in steel truss railroad bridges. The proposed approach employs an unsupervised learning approach, eliminating the need for large datasets typically required by supervised methods. The approach utilizes train wheel load data and bridge response during train crossing events as inputs for damage identification. The PINN model explicitly incorporates the governing differential equations of the linear time-varying (LTV) bridge-train system. Herein, this model employs a recurrent neural network (RNN) based architecture incorporating a custom Runge-Kutta (RK) integrator cell, designed for gradient-based learning. The proposed approach updates the bridge finite element model while also quantifying damage severity and localizing the affected structural members. A case study on the Calumet Bridge in Chicago, Illinois, with simulated damage scenarios, is used to demonstrate the model's effectiveness in identifying damage while maintaining low false-positive rates. Furthermore, the damage identification pipeline is designed to seamlessly integrate prior knowledge from inspections and drone surveys, also enabling context-aware updating and assessment of bridge's condition.
SYNov 21, 2025
Algorithmic design and implementation considerations of deep MPCPrabhat K. Mishra, Mateus V. Gasparino, Girish Chowdhary
Deep Model Predictive Control (Deep MPC) is an evolving field that integrates model predictive control and deep learning. This manuscript is focused on a particular approach, which employs deep neural network in the loop with MPC. This class of approaches distributes control authority between a neural network and an MPC controller, in such a way that the neural network learns the model uncertainties while the MPC handles constraints. The approach is appealing because training data collected while the system is in operation can be used to fine-tune the neural network, and MPC prevents unsafe behavior during those learning transients. This manuscript explains implementation challenges of Deep MPC, algorithmic way to distribute control authority and argues that a poor choice in distributing control authority may lead to poor performance. A reason of poor performance is explained through a numerical experiment on a four-wheeled skid-steer dynamics.
ROAug 26, 2025
ZeST: an LLM-based Zero-Shot Traversability Navigation for Unknown EnvironmentsShreya Gummadi, Mateus V. Gasparino, Gianluca Capezzuto et al.
The advancement of robotics and autonomous navigation systems hinges on the ability to accurately predict terrain traversability. Traditional methods for generating datasets to train these prediction models often involve putting robots into potentially hazardous environments, posing risks to equipment and safety. To solve this problem, we present ZeST, a novel approach leveraging visual reasoning capabilities of Large Language Models (LLMs) to create a traversability map in real-time without exposing robots to danger. Our approach not only performs zero-shot traversability and mitigates the risks associated with real-world data collection but also accelerates the development of advanced navigation systems, offering a cost-effective and scalable solution. To support our findings, we present navigation results, in both controlled indoor and unstructured outdoor environments. As shown in the experiments, our method provides safer navigation when compared to other state-of-the-art methods, constantly reaching the final goal.
LGMar 13, 2025
Towards Efficient Large Scale Spatial-Temporal Time Series Forecasting via Improved Inverted TransformersJiarui Sun, Chin-Chia Michael Yeh, Yujie Fan et al.
Time series forecasting at scale presents significant challenges for modern prediction systems, particularly when dealing with large sets of synchronized series, such as in a global payment network. In such systems, three key challenges must be overcome for accurate and scalable predictions: 1) emergence of new entities, 2) disappearance of existing entities, and 3) the large number of entities present in the data. The recently proposed Inverted Transformer (iTransformer) architecture has shown promising results by effectively handling variable entities. However, its practical application in large-scale settings is limited by quadratic time and space complexity ($O(N^2)$) with respect to the number of entities $N$. In this paper, we introduce EiFormer, an improved inverted transformer architecture that maintains the adaptive capabilities of iTransformer while reducing computational complexity to linear scale ($O(N)$). Our key innovation lies in restructuring the attention mechanism to eliminate redundant computations without sacrificing model expressiveness. Additionally, we incorporate a random projection mechanism that not only enhances efficiency but also improves prediction accuracy through better feature representation. Extensive experiments on the public LargeST benchmark dataset and a proprietary large-scale time series dataset demonstrate that EiFormer significantly outperforms existing methods in both computational efficiency and forecasting accuracy. Our approach enables practical deployment of transformer-based forecasting in industrial applications where handling time series at scale is essential.
RONov 21, 2024
MetaCropFollow: Few-Shot Adaptation with Meta-Learning for Under-Canopy NavigationThomas Woehrle, Arun N. Sivakumar, Naveen Uppalapati et al.
Autonomous under-canopy navigation faces additional challenges compared to over-canopy settings - for example the tight spacing between the crop rows, degraded GPS accuracy and excessive clutter. Keypoint-based visual navigation has been shown to perform well in these conditions, however the differences between agricultural environments in terms of lighting, season, soil and crop type mean that a domain shift will likely be encountered at some point of the robot deployment. In this paper, we explore the use of Meta-Learning to overcome this domain shift using a minimal amount of data. We train a base-learner that can quickly adapt to new conditions, enabling more robust navigation in low-data regimes.
ROOct 16, 2024
AdaCropFollow: Self-Supervised Online Adaptation for Visual Under-Canopy NavigationArun N. Sivakumar, Federico Magistri, Mateus V. Gasparino et al.
Under-canopy agricultural robots can enable various applications like precise monitoring, spraying, weeding, and plant manipulation tasks throughout the growing season. Autonomous navigation under the canopy is challenging due to the degradation in accuracy of RTK-GPS and the large variability in the visual appearance of the scene over time. In prior work, we developed a supervised learning-based perception system with semantic keypoint representation and deployed this in various field conditions. A large number of failures of this system can be attributed to the inability of the perception model to adapt to the domain shift encountered during deployment. In this paper, we propose a self-supervised online adaptation method for adapting the semantic keypoint representation using a visual foundational model, geometric prior, and pseudo labeling. Our preliminary experiments show that with minimal data and fine-tuning of parameters, the keypoint prediction model trained with labels on the source domain can be adapted in a self-supervised manner to various challenging target domains onboard the robot computer using our method. This can enable fully autonomous row-following capability in under-canopy robots across fields and crops without requiring human intervention.
CVMay 8, 2023
Towards Accurate Human Motion Prediction via Iterative RefinementJiarui Sun, Girish Chowdhary
Human motion prediction aims to forecast an upcoming pose sequence given a past human motion trajectory. To address the problem, in this work we propose FreqMRN, a human motion prediction framework that takes into account both the kinematic structure of the human body and the temporal smoothness nature of motion. Specifically, FreqMRN first generates a fixed-size motion history summary using a motion attention module, which helps avoid inaccurate motion predictions due to excessively long motion inputs. Then, supervised by the proposed spatial-temporal-aware, velocity-aware and global-smoothness-aware losses, FreqMRN iteratively refines the predicted motion though the proposed motion refinement module, which converts motion representations back and forth between pose space and frequency space. We evaluate FreqMRN on several standard benchmark datasets, including Human3.6M, AMASS and 3DPW. Experimental results demonstrate that FreqMRN outperforms previous methods by large margins for both short-term and long-term predictions, while demonstrating superior robustness.
ROFeb 10, 2022
Visual Servoing for Pose Control of Soft Continuum Arm in a Structured EnvironmentShivani Kamtikar, Samhita Marri, Benjamin Walt et al.
For soft continuum arms, visual servoing is a popular control strategy that relies on visual feedback to close the control loop. However, robust visual servoing is challenging as it requires reliable feature extraction from the image, accurate control models and sensors to perceive the shape of the arm, both of which can be hard to implement in a soft robot. This letter circumvents these challenges by presenting a deep neural network-based method to perform smooth and robust 3D positioning tasks on a soft arm by visual servoing using a camera mounted at the distal end of the arm. A convolutional neural network is trained to predict the actuations required to achieve the desired pose in a structured environment. Integrated and modular approaches for estimating the actuations from the image are proposed and are experimentally compared. A proportional control law is implemented to reduce the error between the desired and current image as seen by the camera. The model together with the proportional feedback control makes the described approach robust to several variations such as new targets, lighting, loads, and diminution of the soft arm. Furthermore, the model lends itself to be transferred to a new environment with minimal effort.
SYAug 4, 2021
Stochastic Deep Model Reference Adaptive ControlGirish Joshi, Girish Chowdhary
In this paper, we present a Stochastic Deep Neural Network-based Model Reference Adaptive Control. Building on our work "Deep Model Reference Adaptive Control", we extend the controller capability by using Bayesian deep neural networks (DNN) to represent uncertainties and model non-linearities. Stochastic Deep Model Reference Adaptive Control uses a Lyapunov-based method to adapt the output-layer weights of the DNN model in real-time, while a data-driven supervised learning algorithm is used to update the inner-layers parameters. This asynchronous network update ensures boundedness and guaranteed tracking performance with a learning-based real-time feedback controller. A Bayesian approach to DNN learning helped avoid over-fitting the data and provide confidence intervals over the predictions. The controller's stochastic nature also ensured "Induced Persistency of excitation," leading to convergence of the overall system signal.
ROJul 6, 2021
Learned Visual Navigation for Under-Canopy Agricultural RobotsArun Narenthiran Sivakumar, Sahil Modi, Mateus Valverde Gasparino et al.
We describe a system for visually guided autonomous navigation of under-canopy farm robots. Low-cost under-canopy robots can drive between crop rows under the plant canopy and accomplish tasks that are infeasible for over-the-canopy drones or larger agricultural equipment. However, autonomously navigating them under the canopy presents a number of challenges: unreliable GPS and LiDAR, high cost of sensing, challenging farm terrain, clutter due to leaves and weeds, and large variability in appearance over the season and across crop types. We address these challenges by building a modular system that leverages machine learning for robust and generalizable perception from monocular RGB images from low-cost cameras, and model predictive control for accurate control in challenging terrain. Our system, CropFollow, is able to autonomously drive 485 meters per intervention on average, outperforming a state-of-the-art LiDAR based system (286 meters per intervention) in extensive field testing spanning over 25 km.
ROJun 28, 2021
Multi-Sensor Fusion based Robust Row Following for Compact Agricultural RobotsAndres Eduardo Baquero Velasquez, Vitor Akihiro Hisano Higuti, Mateus Valverde Gasparino et al.
This paper presents a state-of-the-art LiDAR based autonomous navigation system for under-canopy agricultural robots. Under-canopy agricultural navigation has been a challenging problem because GNSS and other positioning sensors are prone to significant errors due to attentuation and multi-path caused by crop leaves and stems. Reactive navigation by detecting crop rows using LiDAR measurements is a better alternative to GPS but suffers from challenges due to occlusion from leaves under the canopy. Our system addresses this challenge by fusing IMU and LiDAR measurements using an Extended Kalman Filter framework on low-cost hardwware. In addition, a local goal generator is introduced to provide locally optimal reference trajectories to the onboard controller. Our system is validated extensively in real-world field environments over a distance of 50.88~km on multiple robots in different field conditions across different locations. We report state-of-the-art distance between intervention results, showing that our system is able to safely navigate without interventions for 386.9~m on average in fields without significant gaps in the crop rows, 56.1~m in production fields and 47.5~m in fields with gaps (space of 1~m without plants in both sides of the row).
LGJun 5, 2021
Forced Variational Integrator Networks for Prediction and Control of Mechanical SystemsAaron Havens, Girish Chowdhary
As deep learning becomes more prevalent for prediction and control of real physical systems, it is important that these overparameterized models are consistent with physically plausible dynamics. This elicits a problem with how much inductive bias to impose on the model through known physical parameters and principles to reduce complexity of the learning problem to give us more reliable predictions. Recent work employs discrete variational integrators parameterized as a neural network architecture to learn conservative Lagrangian systems. The learned model captures and enforces global energy preserving properties of the system from very few trajectories. However, most real systems are inherently non-conservative and, in practice, we would also like to apply actuation. In this paper we extend this paradigm to account for general forcing (e.g. control input and damping) via discrete d'Alembert's principle which may ultimately be used for control applications. We show that this forced variational integrator networks (FVIN) architecture allows us to accurately account for energy dissipation and external forcing while still capturing the true underlying energy-based passive dynamics. We show that in application this can result in highly-data efficient model-based control and can predict on real non-conservative systems.
ROMay 21, 2021
High Throughput Soybean Pod-Counting with In-Field Robotic Data Collection and Machine-Vision Based Data AnalysisMichael McGuire, Chinmay Soman, Brian Diers et al.
We report promising results for high-throughput on-field soybean pod count with small mobile robots and machine-vision algorithms. Our results show that the machine-vision based soybean pod counts are strongly correlated with soybean yield. While pod counts has a strong correlation with soybean yield, pod counting is extremely labor intensive, and has been difficult to automate. Our results establish that an autonomous robot equipped with vision sensors can autonomously collect soybean data at maturity. Machine-vision algorithms can be used to estimate pod-counts across a large diversity panel planted across experimental units (EUs, or plots) in a high-throughput, automated manner. We report a correlation of 0.67 between our automated pod counts and soybean yield. The data was collected in an experiment consisting of 1463 single-row plots maintained by the University of Illinois soybean breeding program during the 2020 growing season. We also report a correlation of 0.88 between automated pod counts and manual pod counts over a smaller data set of 16 plots.
LGMay 10, 2021
Adaptive Policy Transfer in Reinforcement LearningGirish Joshi, Girish Chowdhary
Efficient and robust policy transfer remains a key challenge for reinforcement learning to become viable for real-wold robotics. Policy transfer through warm initialization, imitation, or interacting over a large set of agents with randomized instances, have been commonly applied to solve a variety of Reinforcement Learning tasks. However, this seems far from how skill transfer happens in the biological world: Humans and animals are able to quickly adapt the learned behaviors between similar tasks and learn new skills when presented with new situations. Here we seek to answer the question: Will learning to combine adaptation and exploration lead to a more efficient transfer of policies between domains? We introduce a principled mechanism that can "Adapt-to-Learn", that is adapt the source policy to learn to solve a target task with significant transition differences and uncertainties. We show that the presented method learns to seamlessly combine learning from adaptation and exploration and leads to a robust policy transfer algorithm with significantly reduced sample complexity in transferring skills between related tasks.
ROMar 22, 2021
Multi-agent Aerial Monitoring of Moving Convoys using Elliptical OrbitsAseem Vivek Borkar, Girish Chowdhary
We propose a novel scheme for surveillance of a dynamic ground convoy moving along a non-linear trajectory, by aerial agents that maintain a uniformly spaced formation on a time-varying elliptical orbit encompassing the convoy. Elliptical orbits are used as they are more economical than circular orbits for circumnavigating the group of targets in the moving convoy. The proposed scheme includes an algorithm for computing feasible elliptical orbits, a vector guidance law for agent motion along the desired orbit, and a cooperative strategy to control the speeds of the aerial agents in order to quickly achieve and maintain the desired formation. It achieves mission objectives while accounting for linear and angular speed constraints on the aerial agents. The scheme is validated through simulations and actual experiments with a convoy of ground robots and a team of quadrotors as the aerial agents, in a motion capture environment.
ROMar 21, 2021
High Precision Control of Tracked Field Robots in the Presence of Unknown Traction CoefficientsErkan Kayacan, Sierra N. Young, Joshua M. Peschel et al.
Accurate steering through crop rows that avoids crop damage is one of the most important tasks for agricultural robots utilized in various field operations, such as monitoring, mechanical weeding, or spraying. In practice, varying soil conditions can result in off-track navigation due to unknown traction coefficients so that it can cause crop damage. To address this problem, this paper presents the development, application, and experimental results of a real-time receding horizon estimation and control (RHEC) framework applied to a fully autonomous mobile robotic platform to increase its steering accuracy. Recent advances in cheap and fast microprocessors, as well as advances in solution methods for nonlinear optimization problems, have made nonlinear receding horizon control (RHC) and receding horizon estimation (RHE) methods suitable for field robots that require high frequency (milliseconds) updates. A real-time RHEC framework is developed and applied to a fully autonomous mobile robotic platform designed by the authors for in-field phenotyping applications in Sorghum fields. Nonlinear RHE is used to estimate constrained states and parameters, and nonlinear RHC is designed based on an adaptive system model which contains time-varying parameters. The capabilities of the real-time RHEC framework are verified experimentally, and the results show an accurate tracking performance on a bumpy and wet soil field. The mean values of the Euclidean error and required computation time of the RHEC framework are respectively equal to $0.0423$ m and $0.88$ milliseconds.
ROMar 21, 2021
Tracking error learning control for precise mobile robot path tracking in outdoor environmentErkan Kayacan, Girish Chowdhary
This paper presents a Tracking-Error Learning Control (TELC) algorithm for precise mobile robot path tracking in off-road terrain. In traditional tracking error-based control approaches, feedback and feedforward controllers are designed based on the nominal model which cannot capture the uncertainties, disturbances and changing working conditions so that they cannot ensure precise path tracking performance in the outdoor environment. In TELC algorithm, the feedforward control actions are updated by using the tracking error dynamics and the plant-model mismatch problem is thus discarded. Therefore, the feedforward controller gradually eliminates the feedback controller from the control of the system once the mobile robot has been on-track. In addition to the proof of the stability, it is proven that the cost functions do not have local minima so that the coefficients in TELC algorithm guarantee that the global minimum is reached. The experimental results show that the TELC algorithm results in better path tracking performance than the traditional tracking error-based control method. The mobile robot controlled by TELC algorithm can track a target path precisely with less than $10$ cm error in off-road terrain.
ROMar 21, 2021
High precision control and deep learning-based corn stand counting algorithms for agricultural robotZhongzhong Zhang, Erkan Kayacan, Benjamin Thompson et al.
This paper presents high precision control and deep learning-based corn stand counting algorithms for a low-cost, ultra-compact 3D printed and autonomous field robot for agricultural operations. Currently, plant traits, such as emergence rate, biomass, vigor, and stand counting, are measured manually. This is highly labor-intensive and prone to errors. The robot, termed TerraSentia, is designed to automate the measurement of plant traits for efficient phenotyping as an alternative to manual measurements. In this paper, we formulate a Nonlinear Moving Horizon Estimator (NMHE) that identifies key terrain parameters using onboard robot sensors and a learning-based Nonlinear Model Predictive Control (NMPC) that ensures high precision path tracking in the presence of unknown wheel-terrain interaction. Moreover, we develop a machine vision algorithm designed to enable an ultra-compact ground robot to count corn stands by driving through the fields autonomously. The algorithm leverages a deep network to detect corn plants in images, and a visual tracking model to re-identify detected objects at different time steps. We collected data from 53 corn plots in various fields for corn plants around 14 days after emergence (stage V3 - V4). The robot predictions have agreed well with the ground truth with $C_{robot}=1.02 \times C_{human}-0.86$ and a correlation coefficient $R=0.96$. The mean relative error given by the algorithm is $-3.78\%$, and the standard deviation is $6.76\%$. These results indicate a first and significant step towards autonomous robot-based real-time phenotyping using low-cost, ultra-compact ground robots for corn and potentially other crops.
RODec 15, 2020
Multi-Modal Anomaly Detection for Unstructured and Uncertain EnvironmentsTianchen Ji, Sri Theja Vuppala, Girish Chowdhary et al.
To achieve high-levels of autonomy, modern robots require the ability to detect and recover from anomalies and failures with minimal human supervision. Multi-modal sensor signals could provide more information for such anomaly detection tasks; however, the fusion of high-dimensional and heterogeneous sensor modalities remains a challenging problem. We propose a deep learning neural network: supervised variational autoencoder (SVAE), for failure identification in unstructured and uncertain environments. Our model leverages the representational power of VAE to extract robust features from high-dimensional inputs for supervised learning tasks. The training objective unifies the generative model and the discriminative model, thus making the learning a one-stage procedure. Our experiments on real field robot data demonstrate superior failure identification performance than baseline methods, and that our model learns interpretable representations. Videos of our results are available on our website: https://sites.google.com/illinois.edu/supervised-vae .
RONov 4, 2020
Asynchronous Deep Model Reference Adaptive ControlGirish Joshi, Jasvir Virdi, Girish Chowdhary
In this paper, we present Asynchronous implementation of Deep Neural Network-based Model Reference Adaptive Control (DMRAC). We evaluate this new neuro-adaptive control architecture through flight tests on a small quadcopter. We demonstrate that a single DMRAC controller can handle significant nonlinearities due to severe system faults and deliberate wind disturbances while executing high-bandwidth attitude control. We also show that the architecture has long-term learning abilities across different flight regimes, and can generalize to fly different flight trajectories than those on which it was trained. These results demonstrating the efficacy of this architecture for high bandwidth closed-loop attitude control of unstable and nonlinear robots operating in adverse situations. To achieve these results, we designed a software+communication architecture to ensure online real-time inference of the deep network on a high-bandwidth computation-limited platform. We expect that this architecture will benefit other deep learning in the closed-loop experiments on robots.
SYApr 13, 2020
Energy Shaping Control of a CyberOctopus Soft ArmHeng-Sheng Chang, Udit Halder, Chia-Hsien Shih et al.
This paper entails application of the energy shaping methodology to control a flexible, elastic Cosserat rod model. Recent interest in such continuum models stems from applications in soft robotics, and from the growing recognition of the role of mechanics and embodiment in biological control strategies: octopuses are often regarded as iconic examples of this interplay. Here, the dynamics of the Cosserat rod, modeling a single octopus arm, are treated as a Hamiltonian system and the internal muscle actuators are modeled as distributed forces and couples. The proposed energy shaping control design procedure involves two steps: (1) a potential energy is designed such that its minimizer is the desired equilibrium configuration; (2) an energy shaping control law is implemented to reach the desired equilibrium. By interpreting the controlled Hamiltonian as a Lyapunov function, asymptotic stability of the equilibrium configuration is deduced. The energy shaping control law is shown to require only the deformations of the equilibrium configuration. A forward-backward algorithm is proposed to compute these deformations in an online iterative manner. The overall control design methodology is implemented and demonstrated in a dynamic simulation environment. Results of several bio-inspired numerical experiments involving the control of octopus arms are reported.
LGSep 18, 2019
Deep Model Reference Adaptive ControlGirish Joshi, Girish Chowdhary
We present a new neuroadaptive architecture: Deep Neural Network based Model Reference Adaptive Control (DMRAC). Our architecture utilizes the power of deep neural network representations for modeling significant nonlinearities while marrying it with the boundedness guarantees that characterize MRAC based controllers. We demonstrate through simulations and analysis that DMRAC can subsume previously studied learning based MRAC methods, such as concurrent learning and GP-MRAC. This makes DMRAC a highly powerful architecture for high-performance control of nonlinear systems with long-term learning properties.