LGMay 28Code
DisasterLex: An Expert Concept-to-Schema Knowledge Graph for Geospatial Reasoning in Disaster AnalyticsYiming Xiao, Ankit Basu, Kai Yin et al.
Disasters are inevitable and increasingly costly, and effective response depends on querying structured tabular data: precise, information-dense records of hazard, exposure, vulnerability, and lifeline infrastructure that underpin disaster management. Current text-to-SQL methods enable natural-language access to such tables but transfer poorly to the disaster domain, where queries span heterogeneous geospatial schemas and require reasoning over causal relations. We introduce DisasterLex, a knowledge-graph-mediated framework that inserts an Expert Knowledge Graph (EKG) of curated concepts and typed causal edges between the user query and the database, bridged to schema by concept-to-table links. The orchestration runs four stages (identifying query entities, routing to the operational domain, planning over causal edges, and grounding the SQL), restricting the schema passed to the model at each step. We instantiate it on a disaster-analytics database (36 geospatial tables, 150 columns) with an EKG of 107 concepts, 117 causal edges, and 52 concept-to-schema links, evaluated on a 75-query test set. On all seven base models spanning proprietary and open-weight families, DisasterLex beats four state-of-the-art baselines (LightRAG, HippoRAG 2, ReFoRCE, CHESS) by 1.4x to 2.75x, with absolute scores of 1.65 to 3.56 (of 5.0). Error analysis shows baseline failures cluster in routing and multi-table SQL composition, the operations our orchestration explicitly addresses. Code, data, and the EKG artifact are available at https://github.com/YimingXiao98/DisasterLex and on Zenodo at https://doi.org/10.5281/zenodo.20388029.
CLMay 27Code
DisasterBench: Benchmarking LLM Planning under Typed Tool Interface ConstraintsZhitong Chen, Kai Yin, Weifeng Zhang et al.
Disasters cause severe societal impacts, demanding rapid coordination of heterogeneous AI tools, from satellite analysis to flood prediction and damage assessment, into coherent multi-step workflows. As LLMs increasingly serve as orchestrators of such pipelines, effective coordination requires more than selecting semantically plausible tools: LLMs must generate executable workflows with correct parameter binding and dependency propagation. We introduce DisasterBench, a benchmark for evaluating structured multi-agent planning over semantically similar but operationally distinct disaster-response tools. To enable step-level failure attribution, we further propose First-Point-of-Failure (FPoF), which localizes the earliest root cause in a predicted workflow, separating primary errors from downstream cascading effects. Our evaluation reveals three findings: planning method effectiveness depends strongly on model capacity; tool mismatch and parameter-binding errors dominate first failures, revealing semantic grounding and execution consistency as distinct bottlenecks; and verbose intermediate reasoning can create instruction clash with structured output requirements, disrupting plan generation. Together, these findings highlight a fundamental gap between semantic reasoning and execution-grounded coordination, underscoring the need for planning frameworks that jointly model semantic intent, execution constraints, and workflow consistency. Code, data, and evaluation resources are available at: https://github.com/TamuChen18/DisasterBench_Open
IRApr 6Code
DisastRAG: A Multi-Source Disaster Information Integration and Access System Based on Retrieval-Augmented Large Language ModelsBo Li, Zhitong Chen, Kai Yin et al.
Effective disaster management requires rapid access to information distributed across structured operational records, unstructured institutional documents, and dynamic external sources. However, most existing disaster information systems and retrieval-augmented generation frameworks remain organized around a single access pathway, limiting their ability to support heterogeneous, time-sensitive, and context-dependent information needs. This study presents DisastRAG, a disaster-aware information integration and access system that combines large language models with retrieval-augmented access to structured, unstructured, and contextual disaster information. The framework is built around a multi-path architecture that supports document retrieval over a curated hazard corpus, structured access over relational disaster records, and external web fallback for out-of-corpus requests, while also incorporating query understanding, strategy routing, response generation, and contextual memory within a unified system. We evaluated the document retrieval performance using four open-source large language models across multiple retrieval configurations on multiple-choice and open-ended disaster information tasks. Retrieval augmentation consistently improves performance over no-retrieval baselines, yielding multiple-choice gains of 12-23 percentage points and open-ended keypoint coverage gains of up to 10.5 percentage points. Results show that larger candidate pools are most helpful for weaker models, while stronger models are more sensitive to retrieval noise. Hybrid retrieval performs best for open-ended coverage, whereas vector retrieval and shallower reranking more often favor closed-form factual selection. Case studies further show that structured access and web fallback extend the framework beyond document-only RAG.
CLJan 7Code
DisastQA: A Comprehensive Benchmark for Evaluating Question Answering in Disaster ManagementZhitong Chen, Kai Yin, Xiangjue Dong et al.
Accurate question answering (QA) in disaster management requires reasoning over uncertain and conflicting information, a setting poorly captured by existing benchmarks built on clean evidence. We introduce DisastQA, a large-scale benchmark of 3,000 rigorously verified questions (2,000 multiple-choice and 1,000 open-ended) spanning eight disaster types. The benchmark is constructed via a human-LLM collaboration pipeline with stratified sampling to ensure balanced coverage. Models are evaluated under varying evidence conditions, from closed-book to noisy evidence integration, enabling separation of internal knowledge from reasoning under imperfect information. For open-ended QA, we propose a human-verified keypoint-based evaluation protocol emphasizing factual completeness over verbosity. Experiments with 20 models reveal substantial divergences from general-purpose leaderboards such as MMLU-Pro. While recent open-weight models approach proprietary systems in clean settings, performance degrades sharply under realistic noise, exposing critical reliability gaps for disaster response. All code, data, and evaluation resources are available at https://github.com/TamuChen18/DisastQA_open.
ROOct 17, 2022
Decision-Making Among Bounded Rational AgentsJunhong Xu, Durgakant Pushp, Kai Yin et al.
When robots share the same workspace with other intelligent agents (e.g., other robots or humans), they must be able to reason about the behaviors of their neighboring agents while accomplishing the designated tasks. In practice, frequently, agents do not exhibit absolutely rational behavior due to their limited computational resources. Thus, predicting the optimal agent behaviors is undesirable (because it demands prohibitive computational resources) and undesirable (because the prediction may be wrong). Motivated by this observation, we remove the assumption of perfectly rational agents and propose incorporating the concept of bounded rationality from an information-theoretic view into the game-theoretic framework. This allows the robots to reason other agents' sub-optimal behaviors and act accordingly under their computational constraints. Specifically, bounded rationality directly models the agent's information processing ability, which is represented as the KL-divergence between nominal and optimized stochastic policies, and the solution to the bounded-optimal policy can be obtained by an efficient importance sampling approach. Using both simulated and real-world experiments in multi-robot navigation tasks, we demonstrate that the resulting framework allows the robots to reason about different levels of rational behaviors of other agents and compute a reasonable strategy under its computational constraint.
LGSep 26, 2023
Unsupervised Graph Deep Learning Reveals Emergent Flood Risk Profile of Urban AreasKai Yin, Junwei Ma, Ali Mostafavi
Urban flood risk emerges from complex and nonlinear interactions among multiple features related to flood hazard, flood exposure, and social and physical vulnerabilities, along with the complex spatial flood dependence relationships. Existing approaches for characterizing urban flood risk, however, are primarily based on flood plain maps, focusing on a limited number of features, primarily hazard and exposure features, without consideration of feature interactions or the dependence relationships among spatial areas. To address this gap, this study presents an integrated urban flood-risk rating model based on a novel unsupervised graph deep learning model (called FloodRisk-Net). FloodRisk-Net is capable of capturing spatial dependence among areas and complex and nonlinear interactions among flood hazards and urban features for specifying emergent flood risk. Using data from multiple metropolitan statistical areas (MSAs) in the United States, the model characterizes their flood risk into six distinct city-specific levels. The model is interpretable and enables feature analysis of areas within each flood-risk level, allowing for the identification of the three archetypes shaping the highest flood risk within each MSA. Flood risk is found to be spatially distributed in a hierarchical structure within each MSA, where the core city disproportionately bears the highest flood risk. Multiple cities are found to have high overall flood-risk levels and low spatial inequality, indicating limited options for balancing urban development and flood-risk reduction. Relevant flood-risk reduction strategies are discussed considering ways that the highest flood risk and uneven spatial distribution of flood risk are formed.
SINov 3, 2023
Deep Learning-driven Community Resilience Rating based on Intertwined Socio-Technical Systems FeaturesKai Yin, Bo Li, Ali Mostafavi
Community resilience is a complex and muti-faceted phenomenon that emerges from complex and nonlinear interactions among different socio-technical systems and their resilience properties. However, present studies on community resilience focus primarily on vulnerability assessment and utilize index-based approaches, with limited ability to capture heterogeneous features within community socio-technical systems and their nonlinear interactions in shaping robustness, redundancy, and resourcefulness components of resilience. To address this gap, this paper presents an integrated three-layer deep learning model for community resilience rating (called Resili-Net). Twelve measurable resilience features are specified and computed within community socio-technical systems (i.e., facilities, infrastructures, and society) related to three resilience components of robustness, redundancy, and resourcefulness. Using publicly accessible data from multiple metropolitan statistical areas in the United States, Resili-Net characterizes the resilience levels of spatial areas into five distinct levels. The interpretability of the model outcomes enables feature analysis for specifying the determinants of resilience in areas within each resilience level, allowing for the identification of specific resilience enhancement strategies. Changes in community resilience profiles under urban development patterns are further examined by changing the value of related socio-technical systems features. Accordingly, the outcomes provide novel perspectives for community resilience assessment by harnessing machine intelligence and heterogeneous urban big data.
IRMar 15
FloodSQL-Bench: A Retrieval-Augmented Benchmark for Geospatially-Grounded Text-to-SQLHanzhou Liu, Kai Yin, Zhitong Chen et al.
Existing Text-to-SQL benchmarks primarily focus on single-table queries or limited joins in general-purpose domains, and thus fail to reflect the complexity of domain-specific, multi-table and geospatial reasoning, To address this limitation, we introduce FLOODSQL-BENCH, a geospatially grounded benchmark for the flood management domain that integrates heterogeneous datasets through key-based, spatial, and hybrid joins. The benchmark captures realistic flood-related information needs by combining social, infrastructural, and hazard data layers. We systematically evaluate recent large language models with the same retrieval-augmented generation settings and measure their performance across difficulty tiers. By providing a unified, open benchmark grounded in real-world disaster management data, FLOODSQL-BENCH establishes a practical testbed for advancing Text-to-SQL research in high-stakes application domains.
IRMay 20, 2025Code
DisastIR: A Comprehensive Information Retrieval Benchmark for Disaster ManagementKai Yin, Xiangjue Dong, Chengkai Liu et al.
Effective disaster management requires timely access to accurate and contextually relevant information. Existing Information Retrieval (IR) benchmarks, however, focus primarily on general or specialized domains, such as medicine or finance, neglecting the unique linguistic complexity and diverse information needs encountered in disaster management scenarios. To bridge this gap, we introduce DisastIR, the first comprehensive IR evaluation benchmark specifically tailored for disaster management. DisastIR comprises 9,600 diverse user queries and more than 1.3 million labeled query-passage pairs, covering 48 distinct retrieval tasks derived from six search intents and eight general disaster categories that include 301 specific event types. Our evaluations of 30 state-of-the-art retrieval models demonstrate significant performance variances across tasks, with no single model excelling universally. Furthermore, comparative analyses reveal significant performance gaps between general-domain and disaster management-specific tasks, highlighting the necessity of disaster management-specific benchmarks for guiding IR model selection to support effective decision-making in disaster management scenarios. All source codes and DisastIR are available at https://github.com/KaiYin97/Disaster_IR.
CYJan 30
CrisiSense-RAG: Crisis Sensing Multimodal Retrieval-Augmented Generation for Rapid Disaster Impact AssessmentYiming Xiao, Kai Yin, Ali Mostafavi
Timely and spatially resolved disaster impact assessment is essential for effective emergency response. However, automated methods typically struggle with temporal asynchrony. Real-time human reports capture peak hazard conditions while high-resolution satellite imagery is frequently acquired after peak conditions. This often reflects flood recession rather than maximum extent. Naive fusion of these misaligned streams can yield dangerous underestimates when post-event imagery overrides documented peak flooding. We present CrisiSense-RAG, which is a multimodal retrieval-augmented generation framework that reframes impact assessment as evidence synthesis over heterogeneous data sources without disaster-specific fine-tuning. The system employs hybrid dense-sparse retrieval for text sources and CLIP-based retrieval for aerial imagery. A split-pipeline architecture feeds into asynchronous fusion logic that prioritizes real-time social evidence for peak flood extent while treating imagery as persistent evidence of structural damage. Evaluated on Hurricane Harvey across 207 ZIP-code queries, the framework achieves a flood extent MAE of 10.94% to 28.40% and damage severity MAE of 16.47% to 21.65% in zero-shot settings. Prompt-level alignment proves critical for quantitative validity because metric grounding improves damage estimates by up to 4.75 percentage points. These results demonstrate a practical and deployable approach to rapid resilience intelligence under real-world data constraints.
IROct 16, 2025Code
DMRetriever: A Family of Models for Improved Text Retrieval in Disaster ManagementKai Yin, Xiangjue Dong, Chengkai Liu et al.
Effective and efficient access to relevant information is essential for disaster management. However, no retrieval model is specialized for disaster management, and existing general-domain models fail to handle the varied search intents inherent to disaster management scenarios, resulting in inconsistent and unreliable performance. To this end, we introduce DMRetriever, the first series of dense retrieval models (33M to 7.6B) tailored for this domain. It is trained through a novel three-stage framework of bidirectional attention adaptation, unsupervised contrastive pre-training, and difficulty-aware progressive instruction fine-tuning, using high-quality data generated through an advanced data refinement pipeline. Comprehensive experiments demonstrate that DMRetriever achieves state-of-the-art (SOTA) performance across all six search intents at every model scale. Moreover, DMRetriever is highly parameter-efficient, with 596M model outperforming baselines over 13.3 X larger and 33M model exceeding baselines with only 7.6% of their parameters. All codes, data, and checkpoints are available at https://github.com/KaiYin97/DMRETRIEVER
CLJun 16, 2024Code
CrisisSense-LLM: Instruction Fine-Tuned Large Language Model for Multi-label Social Media Text Classification in Disaster InformaticsKai Yin, Bo Li, Chengkai Liu et al.
In the field of crisis/disaster informatics, social media is increasingly being used for improving situational awareness to inform response and relief efforts. Efficient and accurate text classification tools have been a focal area of investigation in crisis informatics. However, current methods mostly rely on single-label text classification models, which fails to capture different insights embedded in dynamic and multifaceted disaster-related social media data. This study introduces a novel approach to disaster text classification by enhancing a pre-trained Large Language Model (LLM) through instruction fine-tuning targeted for multi-label classification of disaster-related tweets. Our methodology involves creating a comprehensive instruction dataset from disaster-related tweets, which is then used to fine-tune an open-source LLM, thereby embedding it with disaster-specific knowledge. This fine-tuned model can classify multiple aspects of disaster-related information simultaneously, such as the type of event, informativeness, and involvement of human aid, significantly improving the utility of social media data for situational awareness in disasters. The results demonstrate that this approach enhances the categorization of critical information from social media posts, thereby facilitating a more effective deployment for situational awareness during emergencies. This research paves the way for more advanced, adaptable, and robust disaster management tools, leveraging the capabilities of LLMs to improve real-time situational awareness and response strategies in disaster scenarios.
LGMay 23, 2024
FloodDamageCast: Building Flood Damage Nowcasting with Machine Learning and Data AugmentationChia-Fu Liu, Lipai Huang, Kai Yin et al.
Near-real time estimation of damage to buildings and infrastructure, referred to as damage nowcasting in this study, is crucial for empowering emergency responders to make informed decisions regarding evacuation orders and infrastructure repair priorities during disaster response and recovery. Here, we introduce FloodDamageCast, a machine learning framework tailored for property flood damage nowcasting. The framework leverages heterogeneous data to predict residential flood damage at a resolution of 500 meters by 500 meters within Harris County, Texas, during the 2017 Hurricane Harvey. To deal with data imbalance, FloodDamageCast incorporates a generative adversarial networks-based data augmentation coupled with an efficient machine learning model. The results demonstrate the model's ability to identify high-damage spatial areas that would be overlooked by baseline models. Insights gleaned from flood damage nowcasting can assist emergency responders to more efficiently identify repair needs, allocate resources, and streamline on-the-ground inspections, thereby saving both time and effort.
AIMar 6, 2024
Rethinking Urban Flood Risk Assessment By Adapting Health Domain PerspectiveZhewei Liu, Kai Yin, Ali Mostafavi
Inspired by ideas from health risk assessment, this paper presents a new perspective for flood risk assessment. The proposed perspective focuses on three pillars for examining flood risk: (1) inherent susceptibility, (2) mitigation strategies, and (3) external stressors. These pillars collectively encompass the physical and environmental characteristics of urban areas, the effectiveness of human-intervention measures, and the influence of uncontrollable external factors, offering a fresh point of view for decoding flood risks. For each pillar, we delineate its individual contributions to flood risk and illustrate their interactive and overall impact. The three-pillars model embodies a shift in focus from the quest to precisely model and quantify flood risk to evaluating pathways to high flood risk. The shift in perspective is intended to alleviate the quest for quantifying and predicting flood risk at fine resolutions as a panacea for enhanced flood risk management. The decomposition of flood risk pathways into the three intertwined pillars (i.e., inherent factors, mitigation factors, and external factors) enables evaluation of changes in factors within each pillar enhance and exacerbate flood risk, creating a platform from which to inform plans, decisions, and actions. Building on this foundation, we argue that a flood risk pathway analysis approach, which examines the individual and collective impacts of inherent factors, mitigation strategies, and external stressors, is essential for a nuanced evaluation of flood risk. Accordingly, the proposed perspective could complement the existing frameworks and approaches for flood risk assessment.
LGFeb 10
R2RAG-Flood: A reasoning-reinforced training-free retrieval augmentation generation framework for flood damage nowcastingLipai Huang, Kai Yin, Chia-Fu Liu et al.
R2RAG-Flood is a reasoning-reinforced, training-free retrieval-augmented generation framework for post-storm property damage nowcasting. Building on an existing supervised tabular predictor, the framework constructs a reasoning-centric knowledge base composed of labeled tabular records, where each sample includes structured predictors, a compact natural language text-mode summary, and a model-generated reasoning trajectory. During inference, R2RAG-Flood issues context-augmented prompts that retrieve and condition on relevant reasoning trajectories from nearby geospatial neighbors and canonical class prototypes, enabling the large language model backbone to emulate and adapt prior reasoning rather than learn new task-specific parameters. Predictions follow a two-stage procedure that first determines property damage occurrence and then refines severity within a three-level Property Damage Extent categorization, with a conditional downgrade step to correct over-predicted severity. In a case study of Harris County, Texas at the 12-digit Hydrologic Unit Code scale, the supervised tabular baseline trained directly on structured predictors achieves 0.714 overall accuracy and 0.859 damage class accuracy for medium and high damage classes. Across seven large language model backbones, R2RAG-Flood attains 0.613 to 0.668 overall accuracy and 0.757 to 0.896 damage class accuracy, approaching the supervised baseline while additionally producing a structured rationale for each prediction. Using a severity-per-cost efficiency metric derived from API pricing and GPU instance costs, lightweight R2RAG-Flood variants demonstrate substantially higher efficiency than both the supervised tabular baseline and larger language models, while requiring no task-specific training or fine-tuning.
MAOct 16, 2025
Disaster Management in the Era of Agentic AI Systems: A Vision for Collective Human-Machine Intelligence for Augmented ResilienceBo Li, Junwei Ma, Kai Yin et al.
The escalating frequency and severity of disasters routinely overwhelm traditional response capabilities, exposing critical vulnerability in disaster management. Current practices are hindered by fragmented data streams, siloed technologies, resource constraints, and the erosion of institutional memory, which collectively impede timely and effective decision making. This study introduces Disaster Copilot, a vision for a multi-agent artificial intelligence system designed to overcome these systemic challenges by unifying specialized AI tools within a collaborative framework. The proposed architecture utilizes a central orchestrator to coordinate diverse sub-agents, each specializing in critical domains such as predictive risk analytics, situational awareness, and impact assessment. By integrating multi-modal data, the system delivers a holistic, real-time operational picture and serve as the essential AI backbone required to advance Disaster Digital Twins from passive models to active, intelligent environments. Furthermore, it ensures functionality in resource-limited environments through on-device orchestration and incorporates mechanisms to capture institutional knowledge, mitigating the impact of staff turnover. We detail the system architecture and propose a three-phased roadmap emphasizing the parallel growth of technology, organizational capacity, and human-AI teaming. Disaster Copilot offers a transformative vision, fostering collective human-machine intelligence to build more adaptive, data-driven and resilient communities.
CVSep 2, 2025
Automated Wildfire Damage Assessment from Multi view Ground level Imagery Via Vision Language ModelsMiguel Esparza, Archit Gupta, Ali Mostafavi et al.
The escalating intensity and frequency of wildfires demand innovative computational methods for rapid and accurate property damage assessment. Traditional methods are often time consuming, while modern computer vision approaches typically require extensive labeled datasets, hindering immediate post-disaster deployment. This research introduces a novel, zero-shot framework leveraging pre-trained vision language models (VLMs) to classify damage from ground-level imagery. We propose and evaluate two pipelines applied to the 2025 Eaton and Palisades fires in California, a VLM (Pipeline A) and a VLM + large language model (LLM) approach (Pipeline B), that integrate structured prompts based on specific wildfire damage indicators. A primary scientific contribution of this study is demonstrating the VLMs efficacy in synthesizing information from multiple perspectives to identify nuanced damage, a critical limitation in existing literature. Our findings reveal that while single view assessments struggled to classify affected structures (F1 scores ranging from 0.225 to 0.511), the multi-view analysis yielded dramatic improvements (F1 scores ranging from 0.857 to 0.947). Moreover, the McNemar test confirmed that pipelines with a multi-view image assessment yields statistically significant classification improvements; however, the improvements this research observed between Pipeline A and B were not statistically significant. Thus, future research can explore the potential of LLM prompting in damage assessment. The practical contribution is an immediately deployable, flexible, and interpretable workflow that bypasses the need for supervised training, significantly accelerating triage and prioritization for disaster response practitioners.
RONov 16, 2021
Kernel-based diffusion approximated Markov decision processes for autonomous navigation and control on unstructured terrainsJunhong Xu, Kai Yin, Zheng Chen et al.
We propose a diffusion approximation method to the continuous-state Markov Decision Processes (MDPs) that can be utilized to address autonomous navigation and control in unstructured off-road environments. In contrast to most decision-theoretic planning frameworks that assume fully known state transition models, we design a method that eliminates such a strong assumption that is often extremely difficult to engineer in reality. We first take the second-order Taylor expansion of the value function. The Bellman optimality equation is then approximated by a partial differential equation, which only relies on the first and second moments of the transition model. By combining the kernel representation of the value function, we design an efficient policy iteration algorithm whose policy evaluation step can be represented as a linear system of equations characterized by a finite set of supporting states. We first validate the proposed method through extensive simulations in 2D obstacle avoidance and 2.5D terrain navigation problems. The results show that the proposed approach leads to a much superior performance over several baselines. We then develop a system that integrates our decision-making framework with onboard perception and conduct real-world experiments in both cluttered indoor and unstructured outdoor environments. The results from the physical systems further demonstrate the applicability of our method in challenging real-world environments.
HCNov 19, 2020
Towards Emotion-Aware User Simulator for Task-Oriented DialogueRui Zhang, Kai Yin, Li Li
The performance of a task-completion dialogue agent usually affects the user experience: when the conversation system yields an unreasonable response, users may feel dissatisfied. Besides, early termination often occurs in disappointing conversations. However, existing off-the-shelf user simulators generally assume an ideal and cooperative user, which is somewhat different from a real user, and inevitably lead to a sub-optimal dialogue policy. In this paper, we propose an emotion-aware user simulation framework for task-oriented dialogue, which is based on the OCC emotion model to update user emotions and drive user actions, to generate simulated behaviors that more similar to real users. We present a linear implementation (The source code will be released soon.) that is easy to understand and extend, and evaluate it on two domain-specific datasets. The experimental results show that the emotional simulation results of our proposed framework conform to common sense and have good versatility for different domains. Meanwhile, our framework provides us with another perspective to understand the improvement process of the dialogue policy model based on reinforcement learning.
ROSep 8, 2020
Online Planning in Uncertain and Dynamic Environment in the Presence of Multiple Mobile VehiclesJunhong Xu, Kai Yin, Lantao Liu
We investigate the autonomous navigation of a mobile robot in the presence of other moving vehicles under time-varying uncertain environmental disturbances. We first predict the future state distributions of other vehicles to account for their uncertain behaviors affected by the time-varying disturbances. We then construct a dynamic-obstacle-aware reachable space that contains states with high probabilities to be reached by the robot, within which the optimal policy is searched. Since, in general, the dynamics of both the vehicle and the environmental disturbances are nonlinear, we utilize a nonlinear Gaussian filter -- the unscented transform -- to approximate the future state distributions. Finally, the forward reachable space computation and backward policy search are iterated until convergence. Extensive simulation evaluations have revealed significant advantages of this proposed method in terms of computation time, decision accuracy, and planning reliability.
ROJun 3, 2020
Kernel Taylor-Based Value Function Approximation for Continuous-State Markov Decision ProcessesJunhong Xu, Kai Yin, Lantao Liu
We propose a principled kernel-based policy iteration algorithm to solve the continuous-state Markov Decision Processes (MDPs). In contrast to most decision-theoretic planning frameworks, which assume fully known state transition models, we design a method that eliminates such a strong assumption, which is oftentimes extremely difficult to engineer in reality. To achieve this, we first apply the second-order Taylor expansion of the value function. The Bellman optimality equation is then approximated by a partial differential equation, which only relies on the first and second moments of the transition model. By combining the kernel representation of value function, we then design an efficient policy iteration algorithm whose policy evaluation step can be represented as a linear system of equations characterized by a finite set of supporting states. We have validated the proposed method through extensive simulations in both simplified and realistic planning scenarios, and the experiments show that our proposed approach leads to a much superior performance over several baseline methods.
ROMay 22, 2019
Reachable Space Characterization of Markov Decision Processes with Time VariabilityJunhong Xu, Kai Yin, Lantao Liu
We propose a solution to a time-varying variant of Markov Decision Processes which can be used to address decision-theoretic planning problems for autonomous systems operating in unstructured outdoor environments. We explore the time variability property of the planning stochasticity and investigate the state reachability, based on which we then develop an efficient iterative method that offers a good trade-off between solution optimality and time complexity. The reachability space is constructed by analyzing the means and variances of states' reaching time in the future. We validate our algorithm through extensive simulations using ocean data, and the results show that our method achieves a great performance in terms of both solution quality and computing time.
ROMar 3, 2019
State-Continuity Approximation of Markov Decision Processes via Finite Element Methods for Autonomous System PlanningJunhong Xu, Kai Yin, Lantao Liu
Motion planning under uncertainty for an autonomous system can be formulated as a Markov Decision Process with a continuous state space. In this paper, we propose a novel solution to this decision-theoretic planning problem that directly obtains the continuous value function with only the first and second moments of the transition probabilities, alleviating the requirement for an explicit transition model in the literature. We achieve this by expressing the value function as a linear combination of basis functions and approximating the Bellman equation by a partial differential equation, where the value function can be naturally constructed using a finite element method. We have validated our approach via extensive simulations, and the evaluations reveal that to baseline methods, our solution leads to in terms of path smoothness, travel distance, and time costs.
ROMar 11, 2018
Learning Partially Structured Environmental Dynamics for Marine Robotic NavigationChen Huang, Kai Yin, Lantao Liu
We investigate the scenario that a robot needs to reach a designated goal after taking a sequence of appropriate actions in a non-static environment that is partially structured. One application example is to control a marine vehicle to move in the ocean. The ocean environment is dynamic and oftentimes the ocean waves result in strong disturbances that can disturb the vehicle's motion. Modeling such dynamic environment is non-trivial, and integrating such model in the robotic motion control is particularly difficult. Fortunately, the ocean currents usually form some local patterns (e.g. vortex) and thus the environment is partially structured. The historically observed data can be used to train the robot to learn to interact with the ocean tidal disturbances. In this paper we propose a method that applies the deep reinforcement learning framework to learn such partially structured complex disturbances. Our results show that, by training the robot under artificial and real ocean disturbances, the robot is able to successfully act in complex and spatiotemporal environments.
AINov 24, 2016
A Spatio-Temporal Representation for the Orienteering Problem with Time-Varying ProfitsZhibei Ma, Kai Yin, Lantao Liu et al.
We consider an orienteering problem (OP) where an agent needs to visit a series (possibly a subset) of depots, from which the maximal accumulated profits are desired within given limited time budget. Different from most existing works where the profits are assumed to be static, in this work we investigate a variant that has arbitrary time-dependent profits. Specifically, the profits to be collected change over time and they follow different (e.g., independent) time-varying functions. The problem is of inherent nonlinearity and difficult to solve by existing methods. To tackle the challenge, we present a simple and effective framework that incorporates time-variations into the fundamental planning process. Specifically, we propose a deterministic spatio-temporal representation where both spatial description and temporal logic are unified into one routing topology. By employing existing basic sorting and searching algorithms, the routing solutions can be computed in an extremely efficient way. The proposed method is easy to implement and extensive numerical results show that our approach is time efficient and generates near-optimal solutions.