Lantao Liu

h-index 24

37 papers

460 citations

Novelty 52%

AI Score 56

Ranked #6,752 of 194,257 authors (top 3%); #134 in RO (top 2%)

37 Papers

3.6 · RO · Jul 8, 2023 · Code

GP-guided MPPI for Efficient Navigation in Complex Unknown Cluttered Environments

Ihab S. Mohamed, Mahmoud Ali, Lantao Liu

Robotic navigation in unknown, cluttered environments with limited sensing capabilities poses significant challenges in robotics. Local trajectory optimization methods, such as Model Predictive Path Integral (MPPI), are a promising solution to this challenge. However, global guidance is required to ensure effective navigation, especially when encountering challenging environmental conditions or navigating beyond the planning horizon. This study presents the GP-MPPI, an online learning-based control strategy that integrates MPPI with a local perception model based on Sparse Gaussian Process (SGP). The key idea is to leverage the learning capability of SGP to construct a variance (uncertainty) surface, which enables the robot to learn about the navigable space surrounding it, identify a set of suggested subgoals, and ultimately recommend the optimal subgoal that minimizes a predefined cost function to the local MPPI planner. Afterward, MPPI computes the optimal control sequence that satisfies the robot and collision avoidance constraints. Such an approach eliminates the necessity of a global map of the environment or an offline training process. We validate the efficiency and robustness of our proposed control strategy through both simulated and real-world experiments of 2D autonomous navigation tasks in complex unknown environments, demonstrating its superiority in guiding the robot safely towards its desired goal while avoiding obstacles and escaping entrapment in local minima. The GPU implementation of GP-MPPI, including the supplementary video, is available at https://github.com/IhabMohamed/GP-MPPI.
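
The MPPI step named in this abstract can be illustrated with a minimal sketch of the generic MPPI-style update, in which sampled control perturbations are averaged with weights proportional to exp(-cost / lambda). This is not the GP-MPPI implementation from the linked repository; the function name `mppi_update`, the temperature `lam`, and the scalar controls are assumptions made for illustration.

```python
import math


def mppi_update(nominal, perturbations, costs, lam=1.0):
    """Generic MPPI-style update (illustrative sketch, not GP-MPPI itself).

    nominal:       list of T nominal controls (scalars for simplicity)
    perturbations: list of K perturbation sequences, each of length T
    costs:         list of K rollout costs, one per perturbed sequence
    lam:           temperature; smaller values favor the lowest-cost rollouts
    """
    c_min = min(costs)  # subtract the minimum cost for numerical stability
    weights = [math.exp(-(c - c_min) / lam) for c in costs]
    total = sum(weights)
    weights = [w / total for w in weights]

    updated = []
    for t in range(len(nominal)):
        # Cost-weighted average of the sampled perturbations at time step t.
        delta = sum(w * eps[t] for w, eps in zip(weights, perturbations))
        updated.append(nominal[t] + delta)
    return updated, weights
```

With equal costs the perturbations are averaged uniformly; when one rollout is far cheaper than the rest, the update moves almost entirely toward that rollout.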

8.4 · CV · Mar 5, 2023

IDA: Informed Domain Adaptive Semantic Segmentation

Zheng Chen, Zhengming Ding, Jason M. Gregory et al.

Mixup-based data augmentation has been validated to be a critical stage in the self-training framework for unsupervised domain adaptive semantic segmentation (UDA-SS), which aims to transfer knowledge from a well-annotated (source) domain to an unlabeled (target) domain. Existing self-training methods usually adopt the popular region-based mixup techniques with a random sampling strategy, which unfortunately ignores the dynamic evolution of different semantics across various domains as training proceeds. To improve the UDA-SS performance, we propose an Informed Domain Adaptation (IDA) model, a self-training framework that mixes the data based on class-level segmentation performance, which aims to emphasize small-region semantics during mixup. In our IDA model, the class-level performance is tracked by an expected confidence score (ECS). We then use a dynamic schedule to determine the mixing ratio for data in different domains. Extensive experimental results reveal that our proposed method is able to outperform the state-of-the-art UDA-SS method by a margin of 1.1 mIoU in the adaptation of GTA-V to Cityscapes and of 0.9 mIoU in the adaptation of SYNTHIA to Cityscapes.

2.8 · CV · Mar 5, 2023

SePaint: Semantic Map Inpainting via Multinomial Diffusion

Zheng Chen, Deepak Duggirala, David Crandall et al.

Prediction beyond partial observations is crucial for robots to navigate in unknown environments because it can provide extra information regarding the surroundings beyond the current sensing range or resolution. In this work, we consider the inpainting of semantic Bird's-Eye-View maps. We propose SePaint, an inpainting model for semantic data based on generative multinomial diffusion. To maintain semantic consistency, we need to condition the prediction for the missing regions on the known regions. We propose a novel and efficient condition strategy, Look-Back Condition (LB-Con), which performs one-step look-back operations during the reverse diffusion process. By doing so, we are able to strengthen the harmonization between unknown and known parts, leading to better completion performance. We have conducted extensive experiments on different datasets, showing our proposed model outperforms commonly used interpolation methods in various robotic applications.

4.0 · RO · Oct 17, 2022

Decision-Making Among Bounded Rational Agents

Junhong Xu, Durgakant Pushp, Kai Yin et al.

When robots share the same workspace with other intelligent agents (e.g., other robots or humans), they must be able to reason about the behaviors of their neighboring agents while accomplishing the designated tasks. In practice, agents frequently do not exhibit absolutely rational behavior due to their limited computational resources. Thus, predicting the optimal agent behaviors is both impractical (because it demands prohibitive computational resources) and undesirable (because the prediction may be wrong). Motivated by this observation, we remove the assumption of perfectly rational agents and propose incorporating the concept of bounded rationality from an information-theoretic view into the game-theoretic framework. This allows the robots to reason about other agents' sub-optimal behaviors and act accordingly under their computational constraints. Specifically, bounded rationality directly models the agent's information processing ability, which is represented as the KL-divergence between nominal and optimized stochastic policies, and the solution to the bounded-optimal policy can be obtained by an efficient importance sampling approach. Using both simulated and real-world experiments in multi-robot navigation tasks, we demonstrate that the resulting framework allows the robots to reason about different levels of rational behaviors of other agents and compute a reasonable strategy under their computational constraints.
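
The KL-divergence formulation in this abstract has a well-known single-agent form: maximizing expected utility minus (1/beta) times the KL-divergence from a nominal policy p0 yields p*(a) proportional to p0(a) * exp(beta * U(a)). The sketch below estimates the mean action under p* by self-normalized importance sampling from p0. It is a single-agent illustration only, not the paper's game-theoretic solver; `bounded_rational_action`, `sample_nominal`, `utility`, and `beta` are hypothetical names.

```python
import math
import random


def bounded_rational_action(sample_nominal, utility, beta, n_samples=2000, seed=0):
    """Estimate the mean action under p*(a) ∝ p0(a) * exp(beta * U(a)).

    sample_nominal: function(rng) -> float, draws an action from the nominal policy p0
    utility:        function(action) -> float
    beta:           rationality level; 0 recovers p0, large values approach argmax U
    """
    rng = random.Random(seed)
    actions = [sample_nominal(rng) for _ in range(n_samples)]
    utils = [utility(a) for a in actions]

    u_max = max(utils)  # shift utilities so the largest exponent is zero
    weights = [math.exp(beta * (u - u_max)) for u in utils]
    total = sum(weights)

    # Self-normalized importance-sampling estimate of E_{p*}[a].
    return sum(w * a for w, a in zip(weights, actions)) / total
```

For example, with p0 a standard normal and U(a) = -(a - 2)^2, raising `beta` from 0 to 1 pulls the estimated mean action from roughly 0 toward the utility peak at 2.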

1.5 · CV · Jun 26, 2023

Pseudo-Trilateral Adversarial Training for Domain Adaptive Traversability Prediction

Zheng Chen, Durgakant Pushp, Jason M. Gregory et al.

Traversability prediction is a fundamental perception capability for autonomous navigation. Deep neural networks (DNNs) have been widely used to predict traversability during the last decade. The performance of DNNs is significantly boosted by exploiting a large amount of data. However, the diversity of data in different domains imposes significant gaps in the prediction performance. In this work, we make efforts to reduce the gaps by proposing a novel pseudo-trilateral adversarial model that adopts a coarse-to-fine alignment (CALI) to perform unsupervised domain adaptation (UDA). Our aim is to transfer the perception model with high data efficiency, eliminate the prohibitively expensive data labeling, and improve the generalization capability during the adaptation from easy-to-access source domains to various challenging target domains. Existing UDA methods usually adopt a bilateral zero-sum game structure. We prove that our CALI model, a pseudo-trilateral game structure, is advantageous over existing bilateral game structures. This proposed work bridges theoretical analyses and algorithm designs, leading to an efficient UDA model with easy and stable training. We further develop a variant of CALI, Informed CALI (ICALI), which is inspired by the recent success of mixup data augmentation techniques and mixes informative regions based on the results of CALI. This mixture step provides an explicit bridging between the two domains and exposes underperforming classes more during training. We show the superiority of our proposed models over multiple baselines in several challenging domain adaptation setups. To further validate the effectiveness of our proposed models, we then combine our perception model with a visual planner to build a navigation system and show the high reliability of our model in complex natural environments.

6.7 · RO · May 12

Adaptive Smooth Tchebycheff Attention for Multi-Objective Policy Optimization

Alejandro Murillo-Gonzalez, Mahmoud Ali, Lantao Liu

Multi-objective reinforcement learning in robotic domains requires balancing complex, non-convex trade-offs between conflicting objectives. While linear scalarization methods provide stability, they are theoretically incapable of recovering solutions within non-convex regions of the Pareto front. Conversely, static non-linear scalarizations (e.g., Tchebycheff) can theoretically access these regions but often suffer from severe gradient variance and optimization instability in deep RL. In this work, we propose an Adaptive Smooth Tchebycheff framework that resolves this tension by dynamically modulating the curvature of the optimization landscape. We introduce a novel conflict-driven controller that regulates the optimization smoothness based on real-time gradient interference. This allows the agent to anneal toward precise, non-convex scalarization when objectives align, while elastically reverting to stable, smooth approximations when destructive gradient conflicts emerge. We validate our approach on a challenging robotic stealth visual search task -- a proxy for monitoring of protected/fragile ecosystems -- where an agent must balance search, exposure/interference minimization and exploration speed. Extensive ablations confirm that our conflict-aware adaptation enables the robust discovery of Pareto-optimal policies in non-convex regions inaccessible to linear baselines and unstable for static non-linear methods. Website: https://alejandromllo.github.io/research/pasta/
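
The contrast between static and smooth Tchebycheff scalarization can be sketched with the standard log-sum-exp smoothing of the max operator. The smoothing parameter `mu` is held fixed here, whereas the abstract describes adapting the smoothness online from gradient conflict, which this sketch does not attempt. The function names, `mu`, and the minimization convention are assumptions.

```python
import math


def tchebycheff(losses, weights, ideal):
    """Static Tchebycheff scalarization: worst weighted gap to the ideal point."""
    return max(w * (f - z) for f, w, z in zip(losses, weights, ideal))


def smooth_tchebycheff(losses, weights, ideal, mu):
    """Log-sum-exp smoothing of the Tchebycheff max.

    Approaches tchebycheff(...) as mu -> 0 and exceeds it by at most
    mu * log(number of objectives).
    """
    terms = [w * (f - z) / mu for f, w, z in zip(losses, weights, ideal)]
    t_max = max(terms)  # shift for numerical stability
    return mu * (t_max + math.log(sum(math.exp(t - t_max) for t in terms)))
```

A small `mu` gives a sharp objective dominated by the worst objective, while a larger `mu` spreads gradient across all objectives at the cost of a looser approximation.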

7.1 · RO · May 12

Learning What Matters: Adaptive Information-Theoretic Objectives for Robot Exploration

Youwei Yu, Jionghao Wang, Zhengming Yu et al.

Designing learnable information-theoretic objectives for robot exploration remains challenging. Such objectives aim to guide exploration toward data that reduces uncertainty in model parameters, yet it is often unclear what information the collected data can actually reveal. Although reinforcement learning (RL) can optimize a given objective, constructing objectives that reflect parametric learnability is difficult in high-dimensional robotic systems. Many parameter directions are weakly observable or unidentifiable, and even when identifiable directions are selected, omitted directions can still influence exploration and distort information measures. To address this challenge, we propose Quasi-Optimal Experimental Design (QOED), an adaptive information objective grounded in optimal experimental design. QOED (i) performs eigenspace analysis of the Fisher information matrix to identify an observable subspace and select identifiable parameter directions, and (ii) modifies the exploration objective to emphasize these directions while suppressing nuisance effects from non-critical parameters. Under bounded nuisance influence and limited coupling between critical and nuisance directions, QOED provides a constant-factor approximation to the ideal information objective that explores all parameters. We evaluate QOED on simulated and real-world navigation and manipulation tasks, where identifiable-direction selection and nuisance suppression yield performance improvements of 35.23% and 21.98%, respectively. When integrated as an exploration objective in model-based policy optimization, QOED further improves policy performance over established RL baselines.

5.2 · RO · Mar 14

LPV-MPC for Lateral Control in Full-Scale Autonomous Racing

Hassan Jardali, Ihab S. Mohamed, Durgakant Pushp et al.

Autonomous racing has attracted significant attention recently, presenting challenges in selecting an optimal controller that operates within the onboard system's computational limits and meets operational constraints such as limited track time and high costs. This paper introduces a Linear Parameter-Varying Model Predictive Controller (LPV-MPC) for lateral control. Implemented on an IAC AV-24, the controller achieved stable performance at speeds exceeding 160 mph (71.5 m/s). We detail the controller design, the methodology for extracting model parameters, and key system-level and implementation considerations. Additionally, we report results from our final race run, providing a comprehensive analysis of both vehicle dynamics and controller performance. A Python implementation of the framework is available at: https://tinyurl.com/LPV-MPC-acados

8.4 · CV · Jul 23, 2025 · Code

AFRDA: Attentive Feature Refinement for Domain Adaptive Semantic Segmentation

Md. Al-Masrur Khan, Durgakant Pushp, Lantao Liu

In Unsupervised Domain Adaptive Semantic Segmentation (UDA-SS), a model is trained on labeled source domain data (e.g., synthetic images) and adapted to an unlabeled target domain (e.g., real-world images) without access to target annotations. Existing UDA-SS methods often struggle to balance fine-grained local details with global contextual information, leading to segmentation errors in complex regions. To address this, we introduce the Adaptive Feature Refinement (AFR) module, which enhances segmentation accuracy by refining high-resolution features using semantic priors from low-resolution logits. AFR also integrates high-frequency components, which capture fine-grained structures and provide crucial boundary information, improving object delineation. Additionally, AFR adaptively balances local and global information through uncertainty-driven attention, reducing misclassifications. Its lightweight design allows seamless integration into HRDA-based UDA methods, leading to state-of-the-art segmentation performance. Our approach improves existing UDA-SS methods by 1.05% mIoU on GTA V --> Cityscapes and 1.04% mIoU on Synthia --> Cityscapes. The implementation of our framework is available at: https://github.com/Masrur02/AFRDA

9.5 · RO · Apr 25, 2025 · Code

Action Flow Matching for Continual Robot Learning

Alejandro Murillo-Gonzalez, Lantao Liu

Continual learning in robotics seeks systems that can constantly adapt to changing environments and tasks, mirroring human adaptability. A key challenge is refining dynamics models, essential for planning and control, while addressing issues such as safe adaptation, catastrophic forgetting, outlier management, data efficiency, and balancing exploration with exploitation -- all within task and onboard resource constraints. Towards this goal, we introduce a generative framework leveraging flow matching for online robot dynamics model alignment. Rather than executing actions based on a misaligned model, our approach refines planned actions to better match with those the robot would take if its model was well aligned. We find that by transforming the actions themselves rather than exploring with a misaligned model -- as is traditionally done -- the robot collects informative data more efficiently, thereby accelerating learning. Moreover, we validate that the method can handle an evolving and possibly imperfect model while reducing, if desired, the dependency on replay buffers or legacy model snapshots. We validate our approach using two platforms: an unmanned ground vehicle and a quadrotor. The results highlight the method's adaptability and efficiency, with a record 34.2% higher task success rate, demonstrating its potential towards enabling continual robot learning. Code: https://github.com/AlejandroMllo/action_flow_matching.

9.6 · CV · Dec 30, 2023

PlanarNeRF: Online Learning of Planar Primitives with Neural Radiance Fields

Zheng Chen, Qingan Yan, Huangying Zhan et al.

Identifying spatially complete planar primitives from visual data is a crucial task in computer vision. Prior methods are largely restricted to either 2D segment recovery or simplifying 3D structures, even with extensive plane annotations. We present PlanarNeRF, a novel framework capable of detecting dense 3D planes through online learning. Drawing upon the neural field representation, PlanarNeRF brings three major contributions. First, it enhances 3D plane detection with concurrent appearance and geometry knowledge. Second, a lightweight plane fitting module is proposed to estimate plane parameters. Third, a novel global memory bank structure with an update mechanism is introduced, ensuring consistent cross-frame correspondence. The flexible architecture of PlanarNeRF allows it to function in both 2D-supervised and self-supervised solutions, in each of which it can effectively learn from sparse training signals, significantly improving training efficiency. Through extensive experiments, we demonstrate the effectiveness of PlanarNeRF in various scenarios and remarkable improvement over existing works.

7.8 · RO · Aug 8, 2025

Learning Causal Structure Distributions for Robust Planning

Alejandro Murillo-Gonzalez, Junhong Xu, Lantao Liu

Structural causal models describe how the components of a robotic system interact. They provide both structural and functional information about the relationships that are present in the system. The structural information outlines the variables among which there is interaction. The functional information describes how such interactions work, via equations or learned models. In this paper we find that learning the functional relationships while accounting for the uncertainty about the structural information leads to more robust dynamics models, which improve downstream planning while using significantly lower computational resources. This is in contrast with common model-learning methods that ignore the causal structure and fail to leverage the sparsity of interactions in robotic systems. We achieve this by estimating a causal structure distribution that is used to sample causal graphs that inform the latent-space representations in an encoder-multidecoder probabilistic model. We show that our model can be used to learn the dynamics of a robot, which together with a sampling-based planner can be used to perform new tasks in novel environments, provided an objective function for the new requirement is available. We validate our method using manipulators and mobile robots in both simulation and the real world. Additionally, we validate the learned dynamics' adaptability and increased robustness to corrupted inputs and changes in the environment, which is highly desirable in challenging real-world robotics scenarios. Video: https://youtu.be/X6k5t7OOnNc.

5.7 · RO · May 26, 2025

Situationally-Aware Dynamics Learning

Alejandro Murillo-Gonzalez, Lantao Liu

Autonomous robots operating in complex, unstructured environments face significant challenges due to latent, unobserved factors that obscure their understanding of both their internal state and the external world. Addressing this challenge would enable robots to develop a more profound grasp of their operational context. To tackle this, we propose a novel framework for online learning of hidden state representations, with which the robots can adapt in real-time to uncertain and dynamic conditions that would otherwise be ambiguous and result in suboptimal or erroneous behaviors. Our approach is formalized as a Generalized Hidden Parameter Markov Decision Process, which explicitly models the influence of unobserved parameters on both transition dynamics and reward structures. Our core innovation lies in learning online the joint distribution of state transitions, which serves as an expressive representation of latent ego- and environmental-factors. This probabilistic approach supports the identification and adaptation to different operational situations, improving robustness and safety. Through a multivariate extension of Bayesian Online Changepoint Detection, our method segments changes in the underlying data generating process governing the robot's dynamics. The robot's transition model is then informed with a symbolic representation of the current situation derived from the joint distribution of latest state transitions, enabling adaptive and context-aware decision-making. To showcase the real-world effectiveness, we validate our approach in the challenging task of unstructured terrain navigation, where unmodeled and unmeasured terrain characteristics can significantly impact the robot's motion. Extensive experiments in both simulation and real world reveal significant improvements in data efficiency, policy performance, and the emergence of safer, adaptive navigation strategies.

5.3 · RO · Nov 16, 2021

Kernel-based diffusion approximated Markov decision processes for autonomous navigation and control on unstructured terrains

Junhong Xu, Kai Yin, Zheng Chen et al.

We propose a diffusion approximation method to the continuous-state Markov Decision Processes (MDPs) that can be utilized to address autonomous navigation and control in unstructured off-road environments. In contrast to most decision-theoretic planning frameworks that assume fully known state transition models, we design a method that eliminates such a strong assumption that is often extremely difficult to engineer in reality. We first take the second-order Taylor expansion of the value function. The Bellman optimality equation is then approximated by a partial differential equation, which only relies on the first and second moments of the transition model. By combining the kernel representation of the value function, we design an efficient policy iteration algorithm whose policy evaluation step can be represented as a linear system of equations characterized by a finite set of supporting states. We first validate the proposed method through extensive simulations in 2D obstacle avoidance and 2.5D terrain navigation problems. The results show that the proposed approach leads to a much superior performance over several baselines. We then develop a system that integrates our decision-making framework with onboard perception and conduct real-world experiments in both cluttered indoor and unstructured outdoor environments. The results from the physical systems further demonstrate the applicability of our method in challenging real-world environments.
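
The two steps described in this abstract (a second-order Taylor expansion of the value function, then a PDE that depends only on the first two moments of the transition model) can be written out as follows. This is a sketch in assumed notation, not the paper's own: the discount factor \gamma, reward R(s,a), and moment symbols \mu_a(s) and M_a(s) are introduced here for illustration.

```latex
% Second-order Taylor expansion of the value function around state s
V(s') \approx V(s) + \nabla V(s)^{\top}(s' - s)
      + \tfrac{1}{2}(s' - s)^{\top}\nabla^{2}V(s)\,(s' - s)

% Expectation over s' \sim T(\cdot \mid s, a), with first and second moments
% \mu_a(s) = \mathbb{E}[s' - s], \quad M_a(s) = \mathbb{E}[(s' - s)(s' - s)^{\top}]
\mathbb{E}[V(s')] \approx V(s) + \mu_a(s)^{\top}\nabla V(s)
      + \tfrac{1}{2}\operatorname{tr}\!\big(M_a(s)\,\nabla^{2}V(s)\big)

% Substituting into the Bellman optimality equation
% V(s) = \max_a \{ R(s,a) + \gamma\,\mathbb{E}[V(s')] \} gives the PDE
0 = \max_a \Big\{ R(s,a) - (1-\gamma)\,V(s) + \gamma\,\mu_a(s)^{\top}\nabla V(s)
      + \tfrac{\gamma}{2}\operatorname{tr}\!\big(M_a(s)\,\nabla^{2}V(s)\big) \Big\}
```

Only \mu_a(s) and M_a(s) appear in the final equation, which is why the full transition model is not needed.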

2.3 · SY · Nov 5, 2021

Artificial Neural Network-Based Voltage Control of DC/DC Converter for DC Microgrid Applications

Hussain Sarwar Khan, Ihab S. Mohamed, Kimmo Kauhaniemi et al.

The rapid growth of renewable energy technology has enabled the microgrid (MG) concept to be widely accepted in power systems. Due to the advantages of the DC distribution system, such as easy integration of energy storage and lower system losses, the DC MG attracts significant attention nowadays. Linear controllers such as PI or PID are mature and extensively used by the power electronics industry, but their performance is not optimal when system parameters change. In this study, an artificial neural network (ANN) based voltage control strategy is proposed for the DC-DC boost converter. Model predictive control (MPC) is used as an expert, which provides the data to train the proposed ANN. Once the ANN is finely tuned, it is used directly to control the step-up DC converter. The main advantage of the ANN is that the neural network system identification decreases the inaccuracy of the system model even with inaccurate parameters and has less computational burden compared to MPC due to its parallel structure. To validate the performance of the proposed ANN, extensive MATLAB/Simulink simulations are carried out. The simulation results show that the ANN-based control strategy performs better than the PI controller under different loading conditions. The accuracy of the trained ANN model is about 97%, which makes it suitable for DC microgrid applications.

20.3 · RO · Nov 2, 2021 · Code

Pareto Monte Carlo Tree Search for Multi-Objective Informative Planning

Weizhe Chen, Lantao Liu

In many environmental monitoring scenarios, the sampling robot needs to simultaneously explore the environment and exploit features of interest with limited time. We present an anytime multi-objective informative planning method called Pareto Monte Carlo tree search which allows the robot to handle potentially competing objectives such as exploration versus exploitation. The method produces optimized decision solutions for the robot based on its knowledge (estimation) of the environment state, leading to better adaptation to environmental dynamics. We provide algorithmic analysis on the critical tree node selection step and show that the number of times choosing sub-optimal nodes is logarithmically bounded and the search result converges to the optimal choices at a polynomial rate.
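
Handling competing objectives in a multi-objective tree search relies on comparing vector-valued rewards by Pareto dominance rather than by a single scalar. The sketch below shows only that comparison and the resulting non-dominated set. It is not the paper's node selection rule, and the function names and the maximization convention are assumptions.

```python
def dominates(a, b):
    """True if reward vector a Pareto-dominates b (all objectives maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points):
    """Return the points that no other point dominates, in input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (exploration, exploitation) reward pairs for six candidate actions.
candidates = [(1, 5), (2, 4), (3, 3), (2, 2), (0, 6), (1, 4)]
front = pareto_front(candidates)  # keeps (1, 5), (2, 4), (3, 3), (0, 6)
```

Here (2, 2) is dropped because (3, 3) is better on both objectives, and (1, 4) is dropped because (1, 5) matches it on the first objective and beats it on the second.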

3.0 · RO · Nov 2, 2021

Informative Planning in the Presence of Outliers

Weizhe Chen, Lantao Liu

Informative planning seeks a sequence of actions that guide the robot to collect the most informative data to build a large-scale environmental model or learn a dynamical system. Existing work in informative planning mainly focuses on proposing new planners and applying them to various robotic applications such as environmental monitoring, autonomous exploration, and system identification. The informative planners optimize an objective given by a probabilistic model, e.g., Gaussian process regression (GPR). In practice, the ubiquitous sensing outliers can easily affect the model, resulting in a misleading objective. A straightforward solution is to filter out the outliers in the sensing data stream using an off-the-shelf outlier detector. However, informative samples are also scarce by definition so they might be falsely filtered out. In this paper, we propose a method to enable the robot to re-visit the locations where outliers were sampled besides optimizing the informative planning objective. The robot can collect more samples in the vicinity of outliers and update the outlier detector to reduce the number of false alarms. We achieve this by designing a new objective for the Pareto Monte Carlo tree search (MCTS). We demonstrate that the proposed framework performs better than applying an outlier detector naively.

3.0 · RO · Oct 29, 2021

NSS-VAEs: Generative Scene Decomposition for Visual Navigable Space Construction

Zheng Chen, Lantao Liu

Detecting navigable space is the first and also a critical step for successful robot navigation. In this work, we treat the visual navigable space segmentation as a scene decomposition problem and propose a new network, NSS-VAEs (Navigable Space Segmentation Variational AutoEncoders), a representation-learning-based framework to enable robots to learn the navigable space segmentation in an unsupervised manner. Different from prevalent segmentation techniques which heavily rely on supervised learning strategies and typically demand immense pixel-level annotated images, the proposed framework leverages a generative model - Variational Auto-Encoder (VAE) - to learn a probabilistic polyline representation that compactly outlines the desired navigable space boundary. Uniquely, our method also assesses the prediction uncertainty related to the unstructuredness of the scenes, which is important for robot navigation in unstructured environments. Through extensive experiments, we have validated that our proposed method can achieve remarkably high accuracy (>90%) even without a single label. We also show that the prediction of NSS-VAEs can be further improved using few labels with results significantly outperforming the SOTA fully supervised-learning-based method.

3.0 · RO · Oct 29, 2021

Efficient Map Prediction via Low-Rank Matrix Completion

Zheng Chen, Shi Bai, Lantao Liu

In many autonomous mapping tasks, the maps cannot be accurately constructed due to various reasons such as sparse, noisy, and partial sensor measurements. We propose a novel map prediction method built upon the recent success of Low-Rank Matrix Completion. The proposed map prediction is able to achieve both map interpolation and extrapolation on raw poor-quality maps with missing or noisy observations. We validate with extensive simulated experiments that the approach can achieve real-time computation for large maps, and the performance is superior to the state-of-the-art map prediction approach - Bayesian Hilbert Mapping in terms of mapping accuracy and computation time. Then we demonstrate that with the proposed real-time map prediction framework, the coverage convergence rate (per action step) for a set of representative coverage planning methods commonly used for environmental modeling and monitoring tasks can be significantly improved.

3.0 · RO · Oct 29, 2021

Multi-Objective Autonomous Exploration on Real-Time Continuous Occupancy Maps

Zheng Chen, Weizhe Chen, Shi Bai et al.

Autonomous exploration in unknown environments using mobile robots is the pillar of many robotic applications. Existing exploration frameworks either select the nearest geometric frontier or the nearest information-theoretic frontier. However, just because a frontier itself is informative does not necessarily mean that the robot will be in an informative area after reaching that frontier. To fill this gap, we propose to use a multi-objective variant of Monte-Carlo tree search that provides a non-myopic Pareto optimal action sequence leading the robot to the frontier that uncovers the greatest extent of unknown area. We also adopt the Bayesian Hilbert Map (BHM) for continuous occupancy mapping and make it more applicable to real-time tasks.

1.4 · CV · Oct 29, 2021

Polyline Generative Navigable Space Segmentation for Autonomous Visual Navigation

Zheng Chen, Zhengming Ding, David Crandall et al.

Detecting navigable space is a fundamental capability for mobile robots navigating in unknown or unmapped environments. In this work, we treat visual navigable space segmentation as a scene decomposition problem and propose Polyline Segmentation Variational autoencoder Network (PSV-Net), a representation learning-based framework for learning the navigable space segmentation in a self-supervised manner. Current segmentation techniques heavily rely on fully-supervised learning strategies which demand a large amount of pixel-level annotated images. In this work, we propose a framework leveraging a Variational AutoEncoder (VAE) and an AutoEncoder (AE) to learn a polyline representation that compactly outlines the desired navigable space boundary. Through extensive experiments, we validate that the proposed PSV-Net can learn the visual navigable space with no or few labels, producing an accuracy comparable to fully-supervised state-of-the-art methods that use all available labels. In addition, we show that integrating the proposed navigable space segmentation model with a visual planner can achieve efficient mapless navigation in real environments.

2.3 · SY · Oct 15, 2021 · Code

An Artificial Neural Network-Based Model Predictive Control for Three-phase Flying Capacitor Multi-Level Inverter

Abualkasim Bakeer, Ihab S. Mohamed, Parisa Boodaghi Malidarreh et al.

Model predictive control (MPC) has been used widely in power electronics due to its simple concept, fast dynamic response, and good reference tracking. However, it suffers from parametric uncertainties, since it directly relies on the mathematical model of the system to predict the optimal switching states to be used at the next sampling time. As a result, uncertain parameters lead to an ill-designed MPC. Thus, this paper offers a model-free control strategy based on artificial neural networks (ANNs) for mitigating the effects of parameter mismatch while having little negative impact on the inverter's performance. This method includes two related stages. First, MPC is used as an expert to control the studied converter in order to provide a dataset, while, in the second stage, the obtained dataset is utilized to train the proposed ANN. The case study herein is based on a four-level three-cell flying capacitor inverter. In this study, MATLAB/Simulink is used to simulate the performance of the proposed method, taking into account various operating conditions. Afterward, the simulation results are reported in comparison with the conventional MPC scheme, demonstrating the superior performance of the proposed control strategy in terms of robustness against parameter mismatch and low total harmonic distortion (THD), especially when changes occur in the system parameters. Furthermore, the experimental validation of the proposed method is provided based on the Hardware-in-the-Loop (HIL) simulation using the C2000™ microcontroller LaunchPadXL TMS320F28379D kit, demonstrating the applicability of the ANN-based control strategy to be implemented on a DSP controller.

4.1ROOct 13, 2020

Swarming of Aerial Robots with Markov Random Field Optimization

Malintha Fernando, Lantao Liu

Swarms are highly robust systems that offer unique benefits compared to their alternatives. In this work, we propose a bio-inspired and artificial potential field-driven robot swarm control method, where the swarm formation dynamics are modeled on the basis of Markov Random Field (MRF) optimization. We integrate the internal agent-wise local interactions and external environmental influences into the MRF. The optimized formation configurations at different stages of the trajectory can be viewed as formation "shapes" which further allows us to integrate dynamics-constrained motion control of the robots. We show that this approach can be used to generate dynamically feasible trajectories to navigate teams of aerial robots in complex environments.

2.2ROSep 8, 2020

Online Planning in Uncertain and Dynamic Environment in the Presence of Multiple Mobile Vehicles

Junhong Xu, Kai Yin, Lantao Liu

We investigate the autonomous navigation of a mobile robot in the presence of other moving vehicles under time-varying uncertain environmental disturbances. We first predict the future state distributions of other vehicles to account for their uncertain behaviors affected by the time-varying disturbances. We then construct a dynamic-obstacle-aware reachable space that contains states with high probabilities to be reached by the robot, within which the optimal policy is searched. Since, in general, the dynamics of both the vehicle and the environmental disturbances are nonlinear, we utilize a nonlinear Gaussian filter -- the unscented transform -- to approximate the future state distributions. Finally, the forward reachable space computation and backward policy search are iterated until convergence. Extensive simulation evaluations have revealed significant advantages of this proposed method in terms of computation time, decision accuracy, and planning reliability.
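The unscented transform mentioned above can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: the function name `unscented_transform`, the scaling parameter `kappa`, and the unicycle-style motion model `step` are all assumptions made for the example.

```python
import numpy as np

def unscented_transform(mean, cov, f, kappa=1.0):
    """Propagate a Gaussian (mean, cov) through a nonlinear function f.

    Uses the basic 2n+1 sigma-point scheme with a single scaling
    parameter kappa (an assumption; other weightings exist).
    """
    mean = np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)
    n = mean.size
    # Cholesky factor L with L @ L.T == (n + kappa) * cov
    L = np.linalg.cholesky((n + kappa) * cov)
    # Sigma points: the mean, plus/minus each column of L
    sigma = np.vstack([mean, mean + L.T, mean - L.T])
    weights = np.full(2 * n + 1, 1.0 / (2.0 * (n + kappa)))
    weights[0] = kappa / (n + kappa)
    # Push every sigma point through f, then re-estimate the moments
    ys = np.array([f(x) for x in sigma])
    y_mean = weights @ ys
    diff = ys - y_mean
    y_cov = (weights[:, None] * diff).T @ diff
    return y_mean, y_cov

def step(x, v=1.0, dt=0.5):
    """Hypothetical unicycle-style motion model: state is (px, py, heading)."""
    px, py, theta = x
    return np.array([px + v * np.cos(theta) * dt,
                     py + v * np.sin(theta) * dt,
                     theta])

mean0 = np.array([0.0, 0.0, 0.0])
cov0 = np.diag([0.01, 0.01, 0.04])
mean1, cov1 = unscented_transform(mean0, cov0, step)
```

For a linear `f` this scheme reproduces the exact mean and covariance; for nonlinear dynamics it gives an approximation whose quality depends on how strongly `f` curves across the spread of the sigma points.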

5.7ROJun 3, 2020

Kernel Taylor-Based Value Function Approximation for Continuous-State Markov Decision Processes

Junhong Xu, Kai Yin, Lantao Liu

We propose a principled kernel-based policy iteration algorithm to solve the continuous-state Markov Decision Processes (MDPs). In contrast to most decision-theoretic planning frameworks, which assume fully known state transition models, we design a method that eliminates such a strong assumption, which is oftentimes extremely difficult to engineer in reality. To achieve this, we first apply the second-order Taylor expansion of the value function. The Bellman optimality equation is then approximated by a partial differential equation, which only relies on the first and second moments of the transition model. By combining the kernel representation of value function, we then design an efficient policy iteration algorithm whose policy evaluation step can be represented as a linear system of equations characterized by a finite set of supporting states. We have validated the proposed method through extensive simulations in both simplified and realistic planning scenarios, and the experiments show that our proposed approach leads to a much superior performance over several baseline methods.
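The moment-based approximation described above can be written out as a sketch. It assumes the value function $V$ is twice differentiable; the symbols $\mu_a(s)$, $\Sigma_a(s)$, $R$, and $\gamma$ are notation introduced here for illustration, and the exact form used in the paper may differ.

```latex
% Assumed notation: \mu_a(s) = E[s' - s | s, a],  \Sigma_a(s) = Cov[s' | s, a]
\mathbb{E}\big[V(s') \mid s, a\big]
  \approx V(s)
  + \mu_a(s)^{\top} \nabla V(s)
  + \tfrac{1}{2} \operatorname{tr}\!\Big[\big(\Sigma_a(s) + \mu_a(s)\mu_a(s)^{\top}\big)\, \nabla^2 V(s)\Big]

% Substituting into the Bellman optimality equation
% V(s) = \max_a \{ R(s,a) + \gamma\, E[V(s') | s, a] \} gives a PDE in V:
V(s) \approx \max_a \Big\{ R(s,a) + \gamma \Big( V(s)
  + \mu_a(s)^{\top} \nabla V(s)
  + \tfrac{1}{2} \operatorname{tr}\!\Big[\big(\Sigma_a(s) + \mu_a(s)\mu_a(s)^{\top}\big)\, \nabla^2 V(s)\Big] \Big) \Big\}
```

Only the first two moments of the transition distribution appear, which is why a full transition model is not needed under this approximation.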

7.3ROMay 22, 2019

Reachable Space Characterization of Markov Decision Processes with Time Variability

Junhong Xu, Kai Yin, Lantao Liu

We propose a solution to a time-varying variant of Markov Decision Processes which can be used to address decision-theoretic planning problems for autonomous systems operating in unstructured outdoor environments. We explore the time variability property of the planning stochasticity and investigate the state reachability, based on which we then develop an efficient iterative method that offers a good trade-off between solution optimality and time complexity. The reachability space is constructed by analyzing the means and variances of states' reaching time in the future. We validate our algorithm through extensive simulations using ocean data, and the results show that our method achieves a great performance in terms of both solution quality and computing time.

4.9ROMar 4, 2019

Creating Navigable Space from Sparse Noisy Map Points

Zheng Chen, Lantao Liu

We present a framework for creating navigable space from sparse and noisy map points generated by sparse visual SLAM methods. Our method incrementally seeds and creates local convex regions free of obstacle points along a robot's trajectory. Then a dense version of the point cloud is reconstructed through a map point regulation process where the original noisy map points are first projected onto a series of local convex hull surfaces, after which those points falling inside the convex hulls are culled. The regulated and refined map points allow human users to quickly recognize and abstract the environmental information. We have validated our proposed framework using both a public dataset and a real environmental structure, and our results reveal that the reconstructed navigable free space has small volume loss (error) compared with the ground truth, and the method is highly efficient, allowing real-time computation and online planning.

4.9ROMar 3, 2019

State-Continuity Approximation of Markov Decision Processes via Finite Element Methods for Autonomous System Planning

Junhong Xu, Kai Yin, Lantao Liu

Motion planning under uncertainty for an autonomous system can be formulated as a Markov Decision Process with a continuous state space. In this paper, we propose a novel solution to this decision-theoretic planning problem that directly obtains the continuous value function with only the first and second moments of the transition probabilities, alleviating the requirement for an explicit transition model in the literature. We achieve this by expressing the value function as a linear combination of basis functions and approximating the Bellman equation by a partial differential equation, where the value function can be naturally constructed using a finite element method. We have validated our approach via extensive simulations, and the evaluations reveal that, compared to baseline methods, our solution leads to better results in terms of path smoothness, travel distance, and time costs.

22.9LGFeb 15, 2019

AutoQ: Automated Kernel-Wise Neural Network Quantization

Qian Lou, Feng Guo, Lantao Liu et al.

Network quantization is one of the most hardware-friendly techniques to enable the deployment of convolutional neural networks (CNNs) on low-power mobile devices. Recent network quantization techniques quantize each weight kernel in a convolutional layer independently for higher inference accuracy, since the weight kernels in a layer exhibit different variances and hence have different amounts of redundancy. The quantization bitwidth or bit number (QBN) directly decides the inference accuracy, latency, energy and hardware overhead. To effectively reduce the redundancy and accelerate CNN inferences, various weight kernels should be quantized with different QBNs. However, prior works use only one QBN to quantize each convolutional layer or the entire CNN, because the design space of searching a QBN for each weight kernel is too large. The hand-crafted heuristic of the kernel-wise QBN search is so sophisticated that domain experts can obtain only sub-optimal results. It is difficult even for deep reinforcement learning (DRL) agents based on Deep Deterministic Policy Gradient (DDPG) to find a kernel-wise QBN configuration that can achieve reasonable inference accuracy. In this paper, we propose a hierarchical-DRL-based kernel-wise network quantization technique, AutoQ, to automatically search a QBN for each weight kernel, and choose another QBN for each activation layer. Compared to the models quantized by the state-of-the-art DRL-based schemes, on average, the same models quantized by AutoQ reduce the inference latency by 54.06%, and decrease the inference energy consumption by 50.69%, while achieving the same inference accuracy.
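To make the idea of a per-kernel QBN concrete, the following sketch quantizes each kernel of a layer with its own bitwidth. It is not AutoQ: the bitwidths are hand-picked rather than searched by a DRL agent, symmetric uniform quantization is an assumed scheme, treating each output-channel filter as one "kernel" is a simplification, and `quantize_kernel` and `quantize_layer` are hypothetical names.

```python
import numpy as np

def quantize_kernel(w, bits):
    """Symmetric uniform quantization of one weight kernel (bits >= 2)."""
    levels = 2 ** (bits - 1) - 1          # largest integer code magnitude
    scale = np.max(np.abs(w)) / levels    # step size between quantized values
    if scale == 0.0:
        return w.copy()                   # all-zero kernel: nothing to do
    return np.round(w / scale) * scale    # snap to the nearest grid point

def quantize_layer(weights, bitwidths):
    """Quantize each kernel along axis 0 with its own bitwidth."""
    return np.stack([quantize_kernel(k, b) for k, b in zip(weights, bitwidths)])

rng = np.random.default_rng(0)
layer = rng.normal(size=(2, 3, 3, 3))     # hypothetical conv layer with 2 kernels
quantized = quantize_layer(layer, bitwidths=[2, 8])
# Worst-case absolute error per kernel; the 8-bit kernel is far more accurate
errors = np.abs(quantized - layer).reshape(2, -1).max(axis=1)
```

The search problem the abstract describes is choosing the `bitwidths` list: one entry per kernel makes the design space grow exponentially with the number of kernels.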

4.1LGJan 4, 2019

Accelerating Goal-Directed Reinforcement Learning by Model Characterization

Shoubhik Debnath, Gaurav Sukhatme, Lantao Liu

We propose a hybrid approach aimed at improving the sample efficiency in goal-directed reinforcement learning. We do this via a two-step mechanism where, first, we approximate a model from Model-Free reinforcement learning. Then, we leverage this approximate model along with a notion of reachability using Mean First Passage Times to perform Model-Based reinforcement learning. Built on such a novel observation, we design two new algorithms, Mean First Passage Time based Q-Learning (MFPT-Q) and Mean First Passage Time based DYNA (MFPT-DYNA), that have been fundamentally modified from the state-of-the-art reinforcement learning techniques. Preliminary results have shown that our hybrid approaches converge in far fewer iterations than their corresponding state-of-the-art counterparts and therefore require far fewer samples and training trials.

3.6AIJan 4, 2019

Solving Markov Decision Processes with Reachability Characterization from Mean First Passage Times

Shoubhik Debnath, Lantao Liu, Gaurav Sukhatme

A new mechanism for efficiently solving the Markov decision processes (MDPs) is proposed in this paper. We introduce the notion of reachability landscape where we use the Mean First Passage Time (MFPT) as a means to characterize the reachability of every state in the state space. We show that such reachability characterization very well assesses the importance of states and thus provides a natural basis for effectively prioritizing states and approximating policies. Built on such a novel observation, we design two new algorithms -- Mean First Passage Time based Value Iteration (MFPT-VI) and Mean First Passage Time based Policy Iteration (MFPT-PI) -- that have been modified from the state-of-the-art solution methods. To validate our design, we have performed numerical evaluations in robotic decision-making scenarios, by comparing the proposed new methods with corresponding classic baseline mechanisms. The evaluation results showed that MFPT-VI and MFPT-PI have outperformed the state-of-the-art solutions in terms of both practical runtime and number of iterations. Aside from the advantage of fast convergence, this new solution method is intuitively easy to understand and practically simple to implement.
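As a rough illustration of the reachability measure named above, the Mean First Passage Time to a goal state can be computed for a Markov chain by solving one linear system. This is a sketch under assumptions: the chain is taken to be the one induced by some fixed policy, the goal is reachable from every state, and the function name and the 4-state example are hypothetical rather than taken from the paper.

```python
import numpy as np

def mean_first_passage_times(P, goal):
    """Expected number of steps to first reach `goal` from every state.

    P is a row-stochastic transition matrix. Solves (I - Q) m = 1, where Q
    is P with the goal row and column removed.
    """
    P = np.asarray(P, dtype=float)
    n = P.shape[0]
    others = [s for s in range(n) if s != goal]
    Q = P[np.ix_(others, others)]
    m = np.linalg.solve(np.eye(n - 1) - Q, np.ones(n - 1))
    mfpt = np.zeros(n)            # the goal itself has passage time 0
    mfpt[others] = m
    return mfpt

# Hypothetical 4-state chain on a line; state 3 is the absorbing goal.
P = np.array([
    [0.5, 0.5, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 0.0, 1.0],
])
mfpt = mean_first_passage_times(P, goal=3)   # [12., 10., 6., 0.]
order = np.argsort(mfpt)                     # states nearest the goal first
```

An ordering such as `order` is one plausible way to prioritize state updates: states with small passage times are backed up before states that are far from the goal.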

2.0AIJan 3, 2019

Reachability and Differential based Heuristics for Solving Markov Decision Processes

Shoubhik Debnath, Lantao Liu, Gaurav Sukhatme

The solution convergence of Markov Decision Processes (MDPs) can be accelerated by prioritized sweeping of states ranked by their potential impacts to other states. In this paper, we present new heuristics to speed up the solution convergence of MDPs. First, we quantify the level of reachability of every state using the Mean First Passage Time (MFPT) and show that such reachability characterization very well assesses the importance of states which is used for effective state prioritization. Then, we introduce the notion of backup differentials as an extension to the prioritized sweeping mechanism, in order to evaluate the impacts of states at an even finer scale. Finally, we extend the state prioritization to the temporal process, where only partial sweeping can be performed during certain intermediate value iteration stages. To validate our design, we have performed numerical evaluations by comparing the proposed new heuristics with corresponding classic baseline mechanisms. The evaluation results showed that our reachability based framework and its differential variants have outperformed the state-of-the-art solutions in terms of both practical runtime and number of iterations.

1.6ROMar 11, 2018

Learning Partially Structured Environmental Dynamics for Marine Robotic Navigation

Chen Huang, Kai Yin, Lantao Liu

We investigate the scenario in which a robot needs to reach a designated goal after taking a sequence of appropriate actions in a non-static environment that is partially structured. One application example is to control a marine vehicle to move in the ocean. The ocean environment is dynamic, and oftentimes the ocean waves result in strong disturbances that can disturb the vehicle's motion. Modeling such a dynamic environment is non-trivial, and integrating such a model into the robotic motion control is particularly difficult. Fortunately, the ocean currents usually form some local patterns (e.g., vortices) and thus the environment is partially structured. The historically observed data can be used to train the robot to learn to interact with the ocean tidal disturbances. In this paper we propose a method that applies the deep reinforcement learning framework to learn such partially structured complex disturbances. Our results show that, by training the robot under artificial and real ocean disturbances, the robot is able to successfully act in complex and spatiotemporal environments.

10.8ROFeb 7, 2017

Data-Driven Learning and Planning for Environmental Sampling

Kai-Chieh Ma, Lantao Liu, Hordur K. Heidarsson et al.

Robots such as autonomous underwater vehicles (AUVs) and autonomous surface vehicles (ASVs) have been used for sensing and monitoring aquatic environments such as oceans and lakes. Environmental sampling is a challenging task because the environmental attributes to be observed can vary both spatially and temporally, and the target environment is usually a large and continuous domain whereas the sampling data is typically sparse and limited. These challenges require that the sampling method be informative and efficient enough to keep up with the environmental dynamics. In this paper we present a planning and learning method that enables a sampling robot to perform persistent monitoring tasks by learning and refining a dynamic "data map" that models a spatiotemporal environment attribute such as ocean salinity content. Our environmental sampling framework consists of two components: to maximize the information collected, we propose an informative planning component that efficiently generates sampling waypoints that contain the maximal information; to alleviate the computational bottleneck caused by large-scale data accumulated, we develop a component based on a sparse Gaussian Process whose hyperparameters are learned online by taking advantage of only a subset of data that provides the greatest contribution. We validate our method with both simulations running on real ocean data and field trials with an ASV in a lake environment. Our experiments show that the proposed framework is both accurate in learning the environmental data map and efficient in keeping up with the dynamic environmental changes.

7.8AINov 24, 2016

A Spatio-Temporal Representation for the Orienteering Problem with Time-Varying Profits

Zhibei Ma, Kai Yin, Lantao Liu et al.

We consider an orienteering problem (OP) where an agent needs to visit a series (possibly a subset) of depots, from which the maximal accumulated profits are desired within a given limited time budget. Different from most existing works where the profits are assumed to be static, in this work we investigate a variant that has arbitrary time-dependent profits. Specifically, the profits to be collected change over time and they follow different (e.g., independent) time-varying functions. The problem is inherently nonlinear and difficult to solve by existing methods. To tackle the challenge, we present a simple and effective framework that incorporates time variations into the fundamental planning process. Specifically, we propose a deterministic spatio-temporal representation where both spatial description and temporal logic are unified into one routing topology. By employing existing basic sorting and searching algorithms, the routing solutions can be computed in an extremely efficient way. The proposed method is easy to implement, and extensive numerical results show that our approach is time efficient and generates near-optimal solutions.

13.3ROSep 24, 2016

Informative Planning and Online Learning with Sparse Gaussian Processes

Kai-Chieh Ma, Lantao Liu, Gaurav S. Sukhatme

A big challenge in environmental monitoring is the spatiotemporal variation of the phenomena to be observed. To enable persistent sensing and estimation in such a setting, it is beneficial to have a time-varying underlying environmental model. Here we present a planning and learning method that enables an autonomous marine vehicle to perform persistent ocean monitoring tasks by learning and refining an environmental model. To alleviate the computational bottleneck caused by large-scale data accumulated, we propose a framework that iterates between a planning component aimed at collecting the most information-rich data, and a sparse Gaussian Process learning component where the environmental model and hyperparameters are learned online by taking advantage of only a subset of data that provides the greatest contribution. Our simulations with ground-truth ocean data show that the proposed method is both accurate and efficient.

11.7ROMay 3, 2016

A Solution to Time-Varying Markov Decision Processes

Lantao Liu, Gaurav S. Sukhatme

We consider a decision-making problem where the environment varies both in space and time. Such problems arise naturally when considering e.g., the navigation of an underwater robot amidst ocean currents or the navigation of an aerial vehicle in wind. To model such spatiotemporal variation, we extend the standard Markov Decision Process (MDP) to a new framework called the Time-Varying Markov Decision Process (TVMDP). The TVMDP has a time-varying state transition model and transforms the standard MDP that considers only immediate and static uncertainty descriptions of state transitions, to a framework that is able to adapt to future time-varying transition dynamics over some horizon. We show how to solve a TVMDP via a redesign of the MDP value propagation mechanisms by incorporating the introduced dynamics along the temporal dimension. We validate our framework in a marine robotics navigation setting using spatiotemporal ocean data and show that it outperforms prior efforts.
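To show what a time-varying state transition model means in practice, the sketch below runs standard finite-horizon backward induction with a separate transition matrix for every time step. It is not the TVMDP algorithm from the paper, only the textbook dynamic program on a time-indexed model; the function name, array layout, and the two-state "current" example are assumptions.

```python
import numpy as np

def backward_induction(P, R, gamma=1.0):
    """Finite-horizon dynamic programming with time-indexed transitions.

    P has shape (T, A, S, S): P[t, a, s, s2] is the probability of moving
    from s to s2 under action a at time step t. R has shape (S,) and is the
    reward collected on arriving in a state.
    """
    T, A, S, _ = P.shape
    V = np.zeros((T + 1, S))                  # V[T] = 0 is the terminal value
    policy = np.zeros((T, S), dtype=int)
    for t in range(T - 1, -1, -1):
        Q = P[t] @ (R + gamma * V[t + 1])     # shape (A, S)
        V[t] = Q.max(axis=0)
        policy[t] = Q.argmax(axis=0)
    return V, policy

def transitions(p_move):
    """Hypothetical 2-state model: action 0 stays, action 1 switches state
    with probability p_move (e.g. a favorable or adverse current)."""
    stay = np.eye(2)
    move = np.array([[1 - p_move, p_move], [p_move, 1 - p_move]])
    return np.stack([stay, move])

# The "current" helps at t = 0 (p = 0.9) and hinders at t = 1 (p = 0.1).
P = np.stack([transitions(0.9), transitions(0.1)])
R = np.array([0.0, 1.0])                      # state 1 is the rewarding state
V, policy = backward_induction(P, R)
```

Because the transition probabilities differ between time steps, the value of acting early while the disturbance is favorable shows up directly in `V[0]`, which a single static transition matrix could not capture.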