Wei Xiao

SY
h-index60
62papers
4,264citations
Novelty54%
AI Score59

62 Papers

ROFeb 28, 2023Code
Learned Risk Metric Maps for Kinodynamic Systems

Ross Allen, Wei Xiao, Daniela Rus

We present Learned Risk Metric Maps (LRMM) for real-time estimation of coherent risk metrics of high dimensional dynamical systems operating in unstructured, partially observed environments. LRMM models are simple to design and train -- requiring only procedural generation of obstacle sets, state and control sampling, and supervised training of a function approximator -- which makes them broadly applicable to arbitrary system dynamics and obstacle sets. In a parallel autonomy setting, we demonstrate the model's ability to rapidly infer collision probabilities of a fast-moving car-like robot driving recklessly in an obstructed environment; allowing the LRMM agent to intervene, take control of the vehicle, and avoid collisions. In this time-critical scenario, we show that LRMMs can evaluate risk metrics 20-100x times faster than alternative safety algorithms based on control barrier functions (CBFs) and Hamilton-Jacobi reachability (HJ-reach), leading to 5-15\% fewer obstacle collisions by the LRMM agent than CBFs and HJ-reach. This performance improvement comes in spite of the fact that the LRMM model only has access to local/partial observation of obstacles, whereas the CBF and HJ-reach agents are granted privileged/global information. We also show that our model can be equally well trained on a 12-dimensional quadrotor system operating in an obstructed indoor environment. The LRMM codebase is provided at https://github.com/mit-drl/pyrmm.

CLMar 3, 2022
QaNER: Prompting Question Answering Models for Few-shot Named Entity Recognition

Andy T. Liu, Wei Xiao, Henghui Zhu et al. · amazon-science, meta-ai

Recently, prompt-based learning for pre-trained language models has succeeded in few-shot Named Entity Recognition (NER) by exploiting prompts as task guidance to increase label efficiency. However, previous prompt-based methods for few-shot NER have limitations such as a higher computational complexity, poor zero-shot ability, requiring manual prompt engineering, or lack of prompt robustness. In this work, we address these shortcomings by proposing a new prompt-based learning NER method with Question Answering (QA), called QaNER. Our approach includes 1) a refined strategy for converting NER problems into the QA formulation; 2) NER prompt generation for QA models; 3) prompt-based tuning with QA models on a few annotated NER examples; 4) zero-shot NER by prompting the QA model. Comparing the proposed approach with previous methods, QaNER is faster at inference, insensitive to the prompt quality, and robust to hyper-parameters, as well as demonstrating significantly better low-resource performance and zero-shot capability.

CLJan 25, 2023
SWING: Balancing Coverage and Faithfulness for Dialogue Summarization

Kung-Hsiang Huang, Siffi Singh, Xiaofei Ma et al. · amazon-science

Missing information is a common issue of dialogue summarization where some information in the reference summaries is not covered in the generated summaries. To address this issue, we propose to utilize natural language inference (NLI) models to improve coverage while avoiding introducing factual inconsistencies. Specifically, we use NLI to compute fine-grained training signals to encourage the model to generate content in the reference summaries that have not been covered, as well as to distinguish between factually consistent and inconsistent generated sentences. Experiments on the DialogSum and SAMSum datasets confirm the effectiveness of the proposed approach in balancing coverage and faithfulness, validated with automatic metrics and human evaluations. Additionally, we compute the correlation between commonly used automatic metrics with human judgments in terms of three different dimensions regarding coverage and factual consistency to provide insight into the most suitable metric for evaluating dialogue summaries.

CLMay 26, 2022
Learning Dialogue Representations from Consecutive Utterances

Zhihan Zhou, Dejiao Zhang, Wei Xiao et al. · amazon-science

Learning high-quality dialogue representations is essential for solving a variety of dialogue-oriented tasks, especially considering that dialogue systems often suffer from data scarcity. In this paper, we introduce Dialogue Sentence Embedding (DSE), a self-supervised contrastive learning method that learns effective dialogue representations suitable for a wide range of dialogue tasks. DSE learns from dialogues by taking consecutive utterances of the same dialogue as positive pairs for contrastive learning. Despite its simplicity, DSE achieves significantly better representation capability than other dialogue representation and universal sentence representation models. We evaluate DSE on five downstream dialogue tasks that examine dialogue representation at different semantic granularities. Experiments in few-shot and zero-shot settings show that DSE outperforms baselines by a large margin. For example, it achieves 13% average performance improvement over the strongest unsupervised baseline in 1-shot intent classification on 6 datasets. We also provide analyses on the benefits and limitations of our model.

ROOct 26, 2023
Drive Anywhere: Generalizable End-to-end Autonomous Driving with Multi-modal Foundation Models

Tsun-Hsuan Wang, Alaa Maalouf, Wei Xiao et al.

As autonomous driving technology matures, end-to-end methodologies have emerged as a leading strategy, promising seamless integration from perception to control via deep learning. However, existing systems grapple with challenges such as unexpected open set environments and the complexity of black-box models. At the same time, the evolution of deep learning introduces larger, multimodal foundational models, offering multi-modal visual and textual understanding. In this paper, we harness these multimodal foundation models to enhance the robustness and adaptability of autonomous driving systems, enabling out-of-distribution, end-to-end, multimodal, and more explainable autonomy. Specifically, we present an approach to apply end-to-end open-set (any environment/scene) autonomous driving that is capable of providing driving decisions from representations queryable by image and text. To do so, we introduce a method to extract nuanced spatial (pixel/patch-aligned) features from transformers to enable the encapsulation of both spatial and semantic features. Our approach (i) demonstrates unparalleled results in diverse tests while achieving significantly greater robustness in out-of-distribution situations, and (ii) allows the incorporation of latent space simulation (via text) for improved training (data augmentation via text) and policy debugging. We encourage the reader to check our explainer video at https://www.youtube.com/watch?v=4n-DJf8vXxo&feature=youtu.be and to view the code and demos on our project webpage at https://drive-anywhere.github.io/.

SYApr 12, 2023
Learning Robust and Correct Controllers from Signal Temporal Logic Specifications Using BarrierNet

Wenliang Liu, Wei Xiao, Calin Belta

In this paper, we consider the problem of learning a neural network controller for a system required to satisfy a Signal Temporal Logic (STL) specification. We exploit STL quantitative semantics to define a notion of robust satisfaction. Guaranteeing the correctness of a neural network controller, i.e., ensuring the satisfaction of the specification by the controlled system, is a difficult problem that received a lot of attention recently. We provide a general procedure to construct a set of trainable High Order Control Barrier Functions (HOCBFs) enforcing the satisfaction of formulas in a fragment of STL. We use the BarrierNet, implemented by a differentiable Quadratic Program (dQP) with HOCBF constraints, as the last layer of the neural network controller, to guarantee the satisfaction of the STL formulas. We train the HOCBFs together with other neural network parameters to further improve the robustness of the controller. Simulation results demonstrate that our approach ensures satisfaction and outperforms existing algorithms.

OCMar 10, 2023
Learning Feasibility Constraints for Control Barrier Functions

Wei Xiao, Christos G. Cassandras, Calin A. Belta

It has been shown that optimizing quadratic costs while stabilizing affine control systems to desired (sets of) states subject to state and control constraints can be reduced to a sequence of Quadratic Programs (QPs) by using Control Barrier Functions (CBFs) and Control Lyapunov Functions (CLFs). In this paper, we employ machine learning techniques to ensure the feasibility of these QPs, which is a challenging problem, especially for high relative degree constraints where High Order CBFs (HOCBFs) are required. To this end, we propose a sampling-based learning approach to learn a new feasibility constraint for CBFs; this constraint is then enforced by another HOCBF added to the QPs. The accuracy of the learned feasibility constraint is recursively improved by a recurrent training algorithm. We demonstrate the advantages of the proposed learning approach to constrained optimal control problems with specific focus on a robot control problem and on autonomous driving in an unknown environment.

ROMar 4, 2022
Differentiable Control Barrier Functions for Vision-based End-to-End Autonomous Driving

Wei Xiao, Tsun-Hsuan Wang, Makram Chahine et al.

Guaranteeing safety of perception-based learning systems is challenging due to the absence of ground-truth state information unlike in state-aware control scenarios. In this paper, we introduce a safety guaranteed learning framework for vision-based end-to-end autonomous driving. To this end, we design a learning system equipped with differentiable control barrier functions (dCBFs) that is trained end-to-end by gradient descent. Our models are composed of conventional neural network architectures and dCBFs. They are interpretable at scale, achieve great test performance under limited training data, and are safety guaranteed in a series of autonomous driving scenarios such as lane keeping and obstacle avoidance. We evaluated our framework in a sim-to-real environment, and tested on a real autonomous car, achieving safe lane following and obstacle avoidance via Augmented Reality (AR) and real parked vehicles.

SYApr 8
Sampling-Aware Control Barrier Functions for Safety-Critical and Finite-Time Constrained Control

Shuo Liu, Wei Xiao, Calin A. Belta

In safety-critical control systems, ensuring both safety and feasibility under sampled-data implementations is crucial for practical deployment. Existing Control Barrier Function (CBF) frameworks, such as High-Order CBFs (HOCBFs), effectively guarantee safety in continuous time but may become unsafe when executed under zero-order-hold (ZOH) controllers due to inter-sampling effects. Moreover, they do not explicitly handle finite-time reach-and-remain requirements or multiple simultaneous constraints, which often lead to conflicts between safety and reach-and-remain objectives, resulting in feasibility issues during control synthesis. This paper introduces Sampling-Aware Control Barrier Functions (SACBFs), a unified framework that accounts for sampling effects and high relative-degree constraints by estimating and incorporating Taylor-based upper bounds on barrier evolution between sampling instants. The proposed method guarantees continuous-time forward invariance of safety and finite-time reach-and-remain sets under ZOH control. To further improve feasibility, a relaxed variant (r-SACBF) introduces slack variables for handling multiple constraints realized through time-varying CBFs. Simulation studies on a unicycle robot demonstrate that SACBFs achieve safe and feasible performance in scenarios where traditional HOCBF methods fail.

SYApr 1
Event-Triggered Adaptive Taylor-Lagrange Control for Safety-Critical Systems

Shuo Liu, Wei Xiao, Christos G. Cassandras et al.

This paper studies safety-critical control for nonlinear systems under sampled-data implementations of the controller. The recently proposed Taylor--Lagrange Control (TLC) method provides rigorous safety guarantees but relies on a fixed discretization-related parameter, which can lead to infeasibility or unsafety in the presence of input constraints and inter-sampling effects. To address these limitations, we propose an adaptive Taylor--Lagrange Control (aTLC) framework with an event-triggered implementation, where the discretization-related parameter defines the discretization time scale and is selected online as state-dependent rather than fixed. This enables the controller to dynamically balance feasibility and safety by adjusting the effective time scale of the Taylor expansion. The resulting controller is implemented as a sequence of Quadratic Programs (QPs) with input constraints. We further introduce a selection rule to choose the discretization-related parameter from a finite candidate set, favoring feasible inputs and improved safety. Simulation results on an adaptive cruise control (ACC) problem demonstrate that the proposed approach improves feasibility, guarantees safety, and achieves smoother control actions compared to TLC while requiring a single automatically tuned parameter.

LGOct 10, 2022
On the Forward Invariance of Neural ODEs

Wei Xiao, Tsun-Hsuan Wang, Ramin Hasani et al.

We propose a new method to ensure neural ordinary differential equations (ODEs) satisfy output specifications by using invariance set propagation. Our approach uses a class of control barrier functions to transform output specifications into constraints on the parameters and inputs of the learning system. This setup allows us to achieve output specification guarantees simply by changing the constrained parameters/inputs both during training and inference. Moreover, we demonstrate that our invariance set propagation through data-controlled neural ODEs not only maintains generalization performance but also creates an additional degree of robustness by enabling causal manipulation of the system's parameters/inputs. We test our method on a series of representation learning tasks, including modeling physical dynamics and convexity portraits, as well as safe collision avoidance for autonomous vehicles.

SYDec 7, 2025
Joint Learning of Feasibility-Aware Signal Temporal Logic and BarrierNet for Robust and Correct Control

Shuo Liu, Wenliang Liu, Wei Xiao et al.

Control Barrier Functions (CBFs) have emerged as a powerful tool for enforcing safety in optimization-based controllers, and their integration with Signal Temporal Logic (STL) has enabled the specification-driven synthesis of complex robotic behaviors. However, existing CBF-STL approaches typically rely on fixed hyperparameters and myopic, per-time step optimization, which can lead to overly conservative behavior, infeasibility near tight input limits, and difficulty satisfying long-horizon STL tasks. To address these limitations, we propose a feasibility-aware learning framework that embeds trainable, time-varying High Order Control Barrier Functions (HOCBFs) into a differentiable Quadratic Program (dQP). Our approach provides a systematic procedure for constructing time-varying HOCBF constraints for a broad fragment of STL and introduces a unified robustness measure that jointly captures STL satisfaction, QP feasibility, and control-bound compliance. Three neural networks-InitNet, RefNet, and an extended BarrierNet-collaborate to generate reference inputs and adapt constraint-related hyperparameters automatically over time and across initial conditions, reducing conservativeness while maximizing robustness. The resulting controller achieves STL satisfaction with strictly feasible dQPs and requires no manual tuning. Simulation results demonstrate that the proposed framework maintains high STL robustness under tight input bounds and significantly outperforms fixed-parameter and non-adaptive baselines in complex environments.

ROApr 5, 2023
Learning Stability Attention in Vision-based End-to-end Driving Policies

Tsun-Hsuan Wang, Wei Xiao, Makram Chahine et al.

Modern end-to-end learning systems can learn to explicitly infer control from perception. However, it is difficult to guarantee stability and robustness for these systems since they are often exposed to unstructured, high-dimensional, and complex observation spaces (e.g., autonomous driving from a stream of pixel inputs). We propose to leverage control Lyapunov functions (CLFs) to equip end-to-end vision-based policies with stability properties and introduce stability attention in CLFs (att-CLFs) to tackle environmental changes and improve learning flexibility. We also present an uncertainty propagation technique that is tightly integrated into att-CLFs. We demonstrate the effectiveness of att-CLFs via comparison with classical CLFs, model predictive control, and vanilla end-to-end learning in a photo-realistic simulator and on a real full-scale autonomous vehicle.

CVSep 18, 2024
Optical Flow Matters: an Empirical Comparative Study on Fusing Monocular Extracted Modalities for Better Steering

Fouad Makiyeh, Mark Bastourous, Anass Bairouk et al.

Autonomous vehicle navigation is a key challenge in artificial intelligence, requiring robust and accurate decision-making processes. This research introduces a new end-to-end method that exploits multimodal information from a single monocular camera to improve the steering predictions for self-driving cars. Unlike conventional models that require several sensors which can be costly and complex or rely exclusively on RGB images that may not be robust enough under different conditions, our model significantly improves vehicle steering prediction performance from a single visual sensor. By focusing on the fusion of RGB imagery with depth completion information or optical flow data, we propose a comprehensive framework that integrates these modalities through both early and hybrid fusion techniques. We use three distinct neural network models to implement our approach: Convolution Neural Network - Neutral Circuit Policy (CNN-NCP) , Variational Auto Encoder - Long Short-Term Memory (VAE-LSTM) , and Neural Circuit Policy architecture VAE-NCP. By incorporating optical flow into the decision-making process, our method significantly advances autonomous navigation. Empirical results from our comparative study using Boston driving data show that our model, which integrates image and motion information, is robust and reliable. It outperforms state-of-the-art approaches that do not use optical flow, reducing the steering estimation error by 31%. This demonstrates the potential of optical flow data, combined with advanced neural network architectures (a CNN-based structure for fusing data and a Recurrence-based network for inferring a command from latent space), to enhance the performance of autonomous vehicles steering estimation.

LGOct 13, 2022
Interpreting Neural Policies with Disentangled Tree Representations

Tsun-Hsuan Wang, Wei Xiao, Tim Seyde et al.

The advancement of robots, particularly those functioning in complex human-centric environments, relies on control solutions that are driven by machine learning. Understanding how learning-based controllers make decisions is crucial since robots are often safety-critical systems. This urges a formal and quantitative understanding of the explanatory factors in the interpretability of robot learning. In this paper, we aim to study interpretability of compact neural policies through the lens of disentangled representation. We leverage decision trees to obtain factors of variation [1] for disentanglement in robot learning; these encapsulate skills, behaviors, or strategies toward solving tasks. To assess how well networks uncover the underlying task dynamics, we introduce interpretability metrics that measure disentanglement of learned neural dynamics from a concentration of decisions, mutual information and modularity perspective. We showcase the effectiveness of the connection between interpretability and disentanglement consistently across extensive experimental analysis.

CVSep 16, 2024
Human Insights Driven Latent Space for Different Driving Perspectives: A Unified Encoder for Efficient Multi-Task Inference

Huy-Dung Nguyen, Anass Bairouk, Mirjana Maras et al.

Autonomous driving systems require a comprehensive understanding of the environment, achieved by extracting visual features essential for perception, planning, and control. However, models trained solely on single-task objectives or generic datasets often lack the contextual information needed for robust performance in complex driving scenarios. In this work, we propose a unified encoder trained on multiple computer vision tasks crucial for urban driving, including depth, pose, and 3D scene flow estimation, as well as semantic, instance, panoptic, and motion segmentation. By integrating these diverse visual cues-similar to human perceptual mechanisms-the encoder captures rich features that enhance navigation-related predictions. We evaluate the model on steering estimation as a downstream task, leveraging its dense latent space. To ensure efficient multi-task learning, we introduce a multi-scale feature network for pose estimation and apply knowledge distillation from a multi-backbone teacher model. Our findings highlight two key findings: (1) the unified encoder achieves competitive performance across all visual perception tasks, demonstrating strong generalization capabilities; and (2) for steering estimation, the frozen unified encoder-leveraging dense latent representations-outperforms both its fine-tuned counterpart and the same frozen model pretrained on generic datasets like ImageNet. These results underline the significance of task-specific visual features and demonstrate the promise of multi-task learning in advancing autonomous driving systems. More details and the pretrained model are available at https://hi-computervision.github.io/uni-encoder/.

SYMar 23
Finite-time Convergent Control Barrier Functions with Feasibility Guarantees

Anni Li, Yingqing Chen, Christos G. Cassandras et al.

This paper studies the problem of finite-time convergence to a prescribed safe set for nonlinear systems whose initial states violate the safety constraints. Existing Control Lyapunov-Barrier Functions (CLBFs) can enforce recovery to the safe set but may suffer from the issue of chattering and they do not explicitly consider control bounds. To address these limitations, we propose a new Control Barrier Function (CBF) formulation that guarantees finite-time convergence to the safe set while ensuring feasibility under control constraints. Specifically, we strengthen the initially violated safety constraint by introducing a parameter which enables the exploitation of the asymptotic property of a CBF to converge to the safe set in finite time. Furthermore, the conditions for the existence of such a CBF under control bounds to achieve finite-time convergence are derived via reachability analysis and constraint comparison, providing a systematic approach for parameter design. A case study on 2D obstacle avoidance is presented to demonstrate the effectiveness and advantages of the proposed method.

SYSep 6, 2023
Safe Neural Control for Non-Affine Control Systems with Differentiable Control Barrier Functions

Wei Xiao, Ross Allen, Daniela Rus

This paper addresses the problem of safety-critical control for non-affine control systems. It has been shown that optimizing quadratic costs subject to state and control constraints can be sub-optimally reduced to a sequence of quadratic programs (QPs) by using Control Barrier Functions (CBFs). Our recently proposed High Order CBFs (HOCBFs) can accommodate constraints of arbitrary relative degree. The main challenges in this approach are that it requires affine control dynamics and the solution of the CBF-based QP is sub-optimal since it is solved point-wise. To address these challenges, we incorporate higher-order CBFs into neural ordinary differential equation-based learning models as differentiable CBFs to guarantee safety for non-affine control systems. The differentiable CBFs are trainable in terms of their parameters, and thus, they can address the conservativeness of CBFs such that the system state will not stay unnecessarily far away from safe set boundaries. Moreover, the imitation learning model is capable of learning complex and optimal control policies that are usually intractable online. We illustrate the effectiveness of the proposed framework on LiDAR-based autonomous driving and compare it with existing methods.

SYFeb 23
Robust Taylor-Lagrange Control for Safety-Critical Systems

Wei Xiao, Christos Cassandras, Anni Li

Solving safety-critical control problem has widely adopted the Control Barrier Function (CBF) method. However, the existence of a CBF is only a sufficient condition for system safety. The recently proposed Taylor-Lagrange Control (TLC) method addresses this limitation, but is vulnerable to the feasibility preservation problem (e.g., inter-sampling effect). In this paper, we propose a robust TLC (rTLC) method to address the feasibility preservation problem. Specifically, the rTLC method expands the safety function at an order higher than the relative degree of the function using Taylor's expansion with Lagrange remainder, which allows the control to explicitly show up at the current time instead of the future time in the TLC method. The rTLC method naturally addresses the feasibility preservation problem with only one hyper-parameter (the discretization time interval size during implementation), which is much less than its counterparts. Finally, we illustrate the effectiveness of the proposed rTLC method through an adaptive cruise control problem, and compare it with existing safety-critical control methods.

SYMar 19
Exact-Time Safety Recovery using Time-Varying Control Barrier Functions with Optimal Barrier Tracking

Yingqing Chen, Christos G. Cassandras, Wei Xiao et al.

This paper is motivated by controllers developed for autonomous vehicles which occasionally result into conditions where safety is no longer guaranteed. We develop an exact-time safety recovery framework for any control-affine nonlinear system when its state is outside a safe region using time-varying Control Barrier Functions (CBFs) with optimal barrier tracking. Unlike conventional formulations that provide only conservative upper bounds on recovery time convergence, the proposed approach guarantees recovery to the safe set at a prescribed time. The key mechanism is an active barrier tracking condition that forces the barrier function to follow exactly a designer-specified recovery trajectory. This transforms safety recovery into a trajectory design problem. The recovery trajectory is parameterized and optimized to achieve optimal performance while preserving feasibility under input constraints, avoiding the aggressive corrective actions typically induced by conventional finite-time formulations. The safety recovery framework is applied to the roundabout traffic coordination problem for Connected and Automated Vehicles (CAVs), where any initially violated safe merging constraint is replaced by an exact-time recovery barrier constraint to ensure safety guarantee restoration before CAV conflict points are reached. Simulation results demonstrate improved feasibility and performance.

LGJan 29
PoSafeNet: Safe Learning with Poset-Structured Neural Nets

Kiwan Wong, Wei Xiao, Daniela Rus

Safe learning is essential for deploying learningbased controllers in safety-critical robotic systems, yet existing approaches often enforce multiple safety constraints uniformly or via fixed priority orders, leading to infeasibility and brittle behavior. In practice, safety requirements are heterogeneous and admit only partial priority relations, where some constraints are comparable while others are inherently incomparable. We formalize this setting as poset-structured safety, modeling safety constraints as a partially ordered set and treating safety composition as a structural property of the policy class. Building on this formulation, we propose PoSafeNet, a differentiable neural safety layer that enforces safety via sequential closed-form projection under poset-consistent constraint orderings, enabling adaptive selection or mixing of valid safety executions while preserving priority semantics by construction. Experiments on multi-obstacle navigation, constrained robot manipulation, and vision-based autonomous driving demonstrate improved feasibility, robustness, and scalability over unstructured and differentiable quadratic program-based safety layers.

ROMar 19
A Closed-Form CLF-CBF Controller for Whole-Body Continuum Soft Robot Collision Avoidance

Kiwan Wong, Maximillian Stölzle, Wei Xiao et al.

Safe operation is essential for deploying robots in human-centered 3D environments. Soft continuum manipulators provide passive safety through mechanical compliance, but still require active control to achieve reliable collision avoidance. Existing approaches, such as sampling-based planning, are often computationally expensive and lack formal safety guarantees, which limits their use for real-time whole-body avoidance. This paper presents a closed-form Control Lyapunov Function--Control Barrier Function (CLF--CBF) controller for real-time 3D obstacle avoidance in soft continuum manipulators without online optimization. By analytically embedding safety constraints into the control input, the proposed method ensures stability and safety under the stated modeling assumptions, while avoiding feasibility issues commonly encountered in online optimization-based methods. The resulting controller is up to $10\times$ faster than standard CLF--CBF quadratic-programming approaches and up to $100\times$ faster than traditional sampling-based planners. Simulation and hardware experiments on a tendon-driven soft manipulator demonstrate accurate 3D trajectory tracking and robust obstacle avoidance in cluttered environments. These results show that the proposed framework provides a scalable and provably safe control strategy for soft robots operating in dynamic, safety-critical settings.

LGSep 29, 2025Code
DyMoDreamer: World Modeling with Dynamic Modulation

Boxuan Zhang, Runqing Wang, Wei Xiao et al.

A critical bottleneck in deep reinforcement learning (DRL) is sample inefficiency, as training high-performance agents often demands extensive environmental interactions. Model-based reinforcement learning (MBRL) mitigates this by building world models that simulate environmental dynamics and generate synthetic experience, improving sample efficiency. However, conventional world models process observations holistically, failing to decouple dynamic objects and temporal features from static backgrounds. This approach is computationally inefficient, especially for visual tasks where dynamic objects significantly influence rewards and decision-making performance. To address this, we introduce DyMoDreamer, a novel MBRL algorithm that incorporates a dynamic modulation mechanism to improve the extraction of dynamic features and enrich the temporal information. DyMoDreamer employs differential observations derived from a novel inter-frame differencing mask, explicitly encoding object-level motion cues and temporal dynamics. Dynamic modulation is modeled as stochastic categorical distributions and integrated into a recurrent state-space model (RSSM), enhancing the model's focus on reward-relevant dynamics. Experiments demonstrate that DyMoDreamer sets a new state-of-the-art on the Atari $100$k benchmark with a $156.6$\% mean human-normalized score, establishes a new record of $832$ on the DeepMind Visual Control Suite, and gains a $9.5$\% performance improvement after $1$M steps on the Crafter benchmark. Our code is released at https://github.com/Ultraman-Tiga1/DyMoDreamer.

CVMay 21, 2020Code
When Dictionary Learning Meets Deep Learning: Deep Dictionary Learning and Coding Network for Image Recognition with Limited Data

Hao Tang, Hong Liu, Wei Xiao et al.

We present a new Deep Dictionary Learning and Coding Network (DDLCN) for image recognition tasks with limited data. The proposed DDLCN has most of the standard deep learning layers (e.g., input/output, pooling, fully connected, etc.), but the fundamental convolutional layers are replaced by our proposed compound dictionary learning and coding layers. The dictionary learning learns an over-complete dictionary for input training data. At the deep coding layer, a locality constraint is added to guarantee that the activated dictionary bases are close to each other. Then the activated dictionary atoms are assembled and passed to the compound dictionary learning and coding layers. In this way, the activated atoms in the first layer can be represented by the deeper atoms in the second dictionary. Intuitively, the second dictionary is designed to learn the fine-grained components shared among the input dictionary atoms, thus a more informative and discriminative low-level representation of the dictionary atoms can be obtained. We empirically compare DDLCN with several leading dictionary learning methods and deep learning models. Experimental results on five popular datasets show that DDLCN achieves competitive results compared with state-of-the-art methods when the training data is limited. Code is available at https://github.com/Ha0Tang/DDLCN.

CVJan 15, 2019Code
Fast and Robust Dynamic Hand Gesture Recognition via Key Frames Extraction and Feature Fusion

Hao Tang, Hong Liu, Wei Xiao et al.

Gesture recognition is a hot topic in computer vision and pattern recognition, which plays a vitally important role in natural human-computer interface. Although great progress has been made recently, fast and robust hand gesture recognition remains an open problem, since the existing methods have not well balanced the performance and the efficiency simultaneously. To bridge it, this work combines image entropy and density clustering to exploit the key frames from hand gesture video for further feature extraction, which can improve the efficiency of recognition. Moreover, a feature fusion strategy is also proposed to further improve feature representation, which elevates the performance of recognition. To validate our approach in a "wild" environment, we also introduce two new datasets called HandGesture and Action3D datasets. Experiments consistently demonstrate that our strategy achieves competitive results on Northwestern University, Cambridge, HandGesture and Action3D hand gesture datasets. Our code and datasets will release at https://github.com/Ha0Tang/HandGestureRecognition.

LGApr 1
Reservoir observer enhanced with residual calibration and attention mechanism

Yichen Liu, Wei Xiao, Tianguang Chu

Reservoir observers provide a data-driven approach to the inference of unmeasured variables from observed ones for nonlinear dynamical systems. While previous studies have demonstrated wide applicability, their performance may vary considerably with different input variables, even compromising reliability in the worst cases. To enhance the performance of inference, we integrate residual calibration and attention mechanism into the reservoir observer design. The residual calibration module leverages information from the estimation residuals to refine the observer output, and the attention mechanism exploits the temporal dependencies of the data to enrich the representation of reservoir internal dynamics. Experiments on typical chaotic systems demonstrate that our method substantially improves inference accuracy, especially for the worst cases resulting from the traditional reservoir observers. We also invoke the notion of transfer entropy to explain the reason for the input-dependent observation discrepancy and the effectiveness of the proposed method.

HCMar 14
Designing for Understanding: How Interface-Level Consent Designs Shape Attention and Understanding in Privacy Disclosures

Wei Xiao, Mengke Wu, Yeeun Jo

Privacy policies are intended to support informed consent, yet users rarely read them fully. This study examines how common privacy policy interface structures influence attention allocation, reading behavior, and perceived experience. Using eye-tracking and post-task surveys, we compared three interface designs: continuous scrolling text, collapsible sections, and collapsible sections with brief previews. Results show that interface structure systematically shaped how users allocated attention and navigated policy content, but did not uniformly improve comprehension. Guided layouts supported more efficient and coherent reading patterns, whereas more interactive designs elicited higher perceived engagement and satisfaction. Importantly, comprehension was closely linked to sustained attention rather than interface type alone. These findings highlight the limits of interface-centered consent approaches and suggest that effective consent design must account for attention dynamics and selective engagement, rather than assuming that improved layout alone ensures understanding.

SYApr 10
Synthesizing Safety in Infinite-Horizon Optimal Control for Disturbed High-Relative-Degree Systems via Barrier-Regulating Auxiliary Variables

Zhanglin Shangguan, Wei Xiao, Qi Li et al.

Optimal stabilization of safety-critical nonlinear systems requires balancing long-term performance and strict safety constraints. Existing quadratic-programming-based control barrier function (CBF) safety filters are point-wise and may exhibit myopic behavior and local trapping when the safeguarding action conflicts with the nominal optimal control. This paper develops a safety-aware infinite-horizon optimal control framework by embedding a barrier-Lyapunov function (BLF)-based safeguarding action into the system dynamics and introducing a barrier-regulating auxiliary variable, thereby reformulating the original constrained problem as an unconstrained one on an extended state space. To mitigate local trapping, we introduce an adaptive alignment-conditioned tangential excitation orthogonal to the safety direction, with activation adaptively modulated by the degree of directional alignment between the nominal and safeguarding controllers, and incorporate it as an admissible $\mathcal{L}2$ disturbance in an $H\infty$ formulation. For high-relative-degree systems under disturbances, we further augment the recursive high-order safe-set construction with barrier compensation terms to obtain a high-order BLF and formulate an adversarial disturbance attenuation problem, which is approximately solved via safe-exploration-enhanced online critic learning. Simulations demonstrate reduced local trapping, improved safety--performance trade-offs, and safe operation under disturbances.

CLApr 28, 2025
Towards Long Context Hallucination Detection

Siyi Liu, Kishaloy Halder, Zheng Qi et al.

Large Language Models (LLMs) have demonstrated remarkable performance across various tasks. However, they are prone to contextual hallucination, generating information that is either unsubstantiated or contradictory to the given context. Although many studies have investigated contextual hallucinations in LLMs, addressing them in long-context inputs remains an open problem. In this work, we take an initial step toward solving this problem by constructing a dataset specifically designed for long-context hallucination detection. Furthermore, we propose a novel architecture that enables pre-trained encoder models, such as BERT, to process long contexts and effectively detect contextual hallucinations through a decomposition and aggregation mechanism. Our experimental results show that the proposed architecture significantly outperforms previous models of similar size as well as LLM-based models across various metrics, while providing substantially faster inference.

CLOct 16, 2024
Open Domain Question Answering with Conflicting Contexts

Siyi Liu, Qiang Ning, Kishaloy Halder et al.

Open domain question answering systems frequently rely on information retrieved from large collections of text (such as the Web) to answer questions. However, such collections of text often contain conflicting information, and indiscriminately depending on this information may result in untruthful and inaccurate answers. To understand the gravity of this problem, we collect a human-annotated dataset, Question Answering with Conflicting Contexts (QACC), and find that as much as 25% of unambiguous, open domain questions can lead to conflicting contexts when retrieved using Google Search. We evaluate and benchmark three powerful Large Language Models (LLMs) with our dataset QACC and demonstrate their limitations in effectively addressing questions with conflicting information. To explore how humans reason through conflicting contexts, we request our annotators to provide explanations for their selections of correct answers. We demonstrate that by finetuning LLMs to explain their answers, we can introduce richer information into their training that guide them through the process of reasoning with conflicting contexts.

LGFeb 23
BarrierSteer: LLM Safety via Learning Barrier Steering

Thanh Q. Tran, Arun Verma, Kiwan Wong et al.

Despite the state-of-the-art performance of large language models (LLMs) across diverse tasks, their susceptibility to adversarial attacks and unsafe content generation remains a major obstacle to deployment, particularly in high-stakes settings. Addressing this challenge requires safety mechanisms that are both practically effective and supported by rigorous theory. We introduce BarrierSteer, a novel framework that formalizes response safety by embedding learned non-linear safety constraints directly into the model's latent representation space. BarrierSteer employs a steering mechanism based on Control Barrier Functions (CBFs) to efficiently detect and prevent unsafe response trajectories during inference with high precision. By enforcing multiple safety constraints through efficient constraint merging, without modifying the underlying LLM parameters, BarrierSteer preserves the model's original capabilities and performance. We provide theoretical results establishing that applying CBFs in latent space offers a principled and computationally efficient approach to enforcing safety. Our experiments across multiple models and datasets show that BarrierSteer substantially reduces adversarial success rates, decreases unsafe generations, and outperforms existing methods.

SYJan 26, 2025
Learning-Enhanced Safeguard Control for High-Relative-Degree Systems: Robust Optimization under Disturbances and Faults

Xinyang Wang, Hongwei Zhang, Shimin Wang et al.

Merely pursuing performance may adversely affect the safety, while a conservative policy for safe exploration will degrade the performance. How to balance the safety and performance in learning-based control problems is an interesting yet challenging issue. This paper aims to enhance system performance with safety guarantee in solving the reinforcement learning (RL)-based optimal control problems of nonlinear systems subject to high-relative-degree state constraints and unknown time-varying disturbance/actuator faults. First, to combine control barrier functions (CBFs) with RL, a new type of CBFs, termed high-order reciprocal control barrier function (HO-RCBF) is proposed to deal with high-relative-degree constraints during the learning process. Then, the concept of gradient similarity is proposed to quantify the relationship between the gradient of safety and the gradient of performance. Finally, gradient manipulation and adaptive mechanisms are introduced in the safe RL framework to enhance the performance with a safety guarantee. Two simulation examples illustrate that the proposed safe RL framework can address high-relative-degree constraint, enhance safety robustness and improve system performance.

CLDec 12, 2024
Rethinking LLM Uncertainty: A Multi-Agent Approach to Estimating Black-Box Model Uncertainty

Yu Feng, Phu Mon Htut, Zheng Qi et al.

Quantifying uncertainty in black-box LLMs is vital for reliable responses and scalable oversight. Existing methods, which gauge a model's uncertainty through evaluating self-consistency in responses to the target query, can be misleading: an LLM may confidently provide an incorrect answer to a target query, yet give a confident and accurate answer to that same target query when answering a knowledge-preserving perturbation of the query. We systematically analyze the model behaviors and demonstrate that this discrepancy stems from suboptimal retrieval of parametric knowledge, often due to contextual biases that prevent consistent access to stored knowledge. We then introduce DiverseAgentEntropy, a novel, theoretically-grounded method employing multi-agent interaction across diverse query variations for uncertainty estimation of black-box LLMs. This approach more accurately assesses an LLM's true uncertainty and improves hallucination detection, outperforming existing self-consistency based techniques.

LGMay 22, 2025
Efficient Online RL Fine Tuning with Offline Pre-trained Policy Only

Wei Xiao, Jiacheng Liu, Zifeng Zhuang et al.

Improving the performance of pre-trained policies through online reinforcement learning (RL) is a critical yet challenging topic. Existing online RL fine-tuning methods require continued training with offline pretrained Q-functions for stability and performance. However, these offline pretrained Q-functions commonly underestimate state-action pairs beyond the offline dataset due to the conservatism in most offline RL methods, which hinders further exploration when transitioning from the offline to the online setting. Additionally, this requirement limits their applicability in scenarios where only pre-trained policies are available but pre-trained Q-functions are absent, such as in imitation learning (IL) pre-training. To address these challenges, we propose a method for efficient online RL fine-tuning using solely the offline pre-trained policy, eliminating reliance on pre-trained Q-functions. We introduce PORL (Policy-Only Reinforcement Learning Fine-Tuning), which rapidly initializes the Q-function from scratch during the online phase to avoid detrimental pessimism. Our method not only achieves competitive performance with advanced offline-to-online RL algorithms and online RL approaches that leverage data or policies prior, but also pioneers a new path for directly fine-tuning behavior cloning (BC) policies.

ROMar 7
SSP: Safety-guaranteed Surgical Policy via Joint Optimization of Behavioral and Spatial Constraints

Jianshu Hu, ZhiYuan Guan, Lei Song et al.

The paradigm of robot-assisted surgery is shifting toward data-driven autonomy, where policies learned via Reinforcement Learning (RL) or Imitation Learning (IL) enable the execution of complex tasks. However, these ``black-box" policies often lack formal safety guarantees, a critical requirement for clinical deployment. In this paper, we propose the Safety-guaranteed Surgical Policy (SSP) framework to bridge the gap between data-driven generality and formal safety. We utilize Neural Ordinary Differential Equations (Neural ODEs) to learn an uncertainty-aware dynamics model from demonstration data. This learned model underpins a robust Control Barrier Function (CBF) safety controller, which minimally alters the actions of a surgical policy to ensure strict safety under uncertainty. Our controller enforces two constraint categories: behavioral constraints (restricting the task space of the agent) and spatial constraints (defining surgical no-go zones). We instantiate the SSP framework with surgical policies derived from RL, IL and Control Lyapunov Functions (CLF). Validation on in both the SurRoL simulation and da Vinci Research Kit (dVRK) demonstrates that our method achieves a near-zero constraint violation rate while maintaining high task success rates compared to unconstrained baselines.

LGFeb 1
Mixture-of-World Models: Scaling Multi-Task Reinforcement Learning with Modular Latent Dynamics

Boxuan Zhang, Weipu Zhang, Zhaohan Feng et al.

A fundamental challenge in multi-task reinforcement learning (MTRL) is achieving sample efficiency in visual domains where tasks exhibit substantial heterogeneity in both observations and dynamics. Model-based reinforcement learning offers a promising path to improved sample efficiency through world models, but standard monolithic architectures struggle to capture diverse task dynamics, resulting in poor reconstruction and prediction accuracy. We introduce Mixture-of-World Models (MoW), a scalable architecture that combines modular variational autoencoders for task-adaptive visual compression, a hybrid Transformer-based dynamics model with task-conditioned experts and a shared backbone, and a gradient-based task clustering strategy for efficient parameter allocation. On the Atari 100k benchmark, a single MoW agent trained once on 26 Atari games achieves a mean human-normalized score of 110.4%, competitive with the score of 114.2% achieved by STORM, an ensemble of 26 task-specific models, while using 50% fewer parameters. On Meta-World, MoW achieves a 74.5% average success rate within 300 thousand environment steps, establishing a new state of the art. These results demonstrate that MoW provides a scalable and parameter-efficient foundation for generalist world models.

ROSep 15, 2025
TrajBooster: Boosting Humanoid Whole-Body Manipulation via Trajectory-Centric Learning

Jiacheng Liu, Pengxiang Ding, Qihang Zhou et al.

Recent Vision-Language-Action models show potential to generalize across embodiments but struggle to quickly align with a new robot's action space when high-quality demonstrations are scarce, especially for bipedal humanoids. We present TrajBooster, a cross-embodiment framework that leverages abundant wheeled-humanoid data to boost bipedal VLA. Our key idea is to use end-effector trajectories as a morphology-agnostic interface. TrajBooster (i) extracts 6D dual-arm end-effector trajectories from real-world wheeled humanoids, (ii) retargets them in simulation to Unitree G1 with a whole-body controller trained via a heuristic-enhanced harmonized online DAgger to lift low-dimensional trajectory references into feasible high-dimensional whole-body actions, and (iii) forms heterogeneous triplets that couple source vision/language with target humanoid-compatible actions to post-pre-train a VLA, followed by only 10 minutes of teleoperation data collection on the target humanoid domain. Deployed on Unitree G1, our policy achieves beyond-tabletop household tasks, enabling squatting, cross-height manipulation, and coordinated whole-body motion with markedly improved robustness and generalization. Results show that TrajBooster allows existing wheeled-humanoid data to efficiently strengthen bipedal humanoid VLA performance, reducing reliance on costly same-embodiment data while enhancing action space understanding and zero-shot skill transfer capabilities. For more details, For more details, please refer to our \href{https://jiachengliu3.github.io/TrajBooster/}.

ROAug 28, 2025
SPGrasp: Spatiotemporal Prompt-driven Grasp Synthesis in Dynamic Scenes

Yunpeng Mei, Hongjie Cao, Yinqiu Xia et al.

Real-time interactive grasp synthesis for dynamic objects remains challenging as existing methods fail to achieve low-latency inference while maintaining promptability. To bridge this gap, we propose SPGrasp (spatiotemporal prompt-driven dynamic grasp synthesis), a novel framework extending segment anything model v2 (SAMv2) for video stream grasp estimation. Our core innovation integrates user prompts with spatiotemporal context, enabling real-time interaction with end-to-end latency as low as 59 ms while ensuring temporal consistency for dynamic objects. In benchmark evaluations, SPGrasp achieves instance-level grasp accuracies of 90.6% on OCID and 93.8% on Jacquard. On the challenging GraspNet-1Billion dataset under continuous tracking, SPGrasp achieves 92.0% accuracy with 73.1 ms per-frame latency, representing a 58.5% reduction compared to the prior state-of-the-art promptable method RoG-SAM while maintaining competitive accuracy. Real-world experiments involving 13 moving objects demonstrate a 94.8% success rate in interactive grasping scenarios. These results confirm SPGrasp effectively resolves the latency-interactivity trade-off in dynamic grasp synthesis.

IRFeb 12, 2025
SS4Rec: Continuous-Time Sequential Recommendation with State Space Models

Wei Xiao, Huiying Wang, Qifeng Zhou

Sequential recommendation is a key area in the field of recommendation systems aiming to model user interest based on historical interaction sequences with irregular intervals. While previous recurrent neural network-based and attention-based approaches have achieved significant results, they have limitations in capturing system continuity due to the discrete characteristics. In the context of continuous-time modeling, state space model (SSM) offers a potential solution, as it can effectively capture the dynamic evolution of user interest over time. However, existing SSM-based approaches ignore the impact of irregular time intervals within historical user interactions, making it difficult to model complexed user-item transitions in sequences. To address this issue, we propose a hybrid SSM-based model called SS4Rec for continuous-time sequential recommendation. SS4Rec integrates a time-aware SSM to handle irregular time intervals and a relation-aware SSM to model contextual dependencies, enabling it to infer user interest from both temporal and sequential perspectives. In the training process, the time-aware SSM and the relation-aware SSM are discretized by variable stepsizes according to user interaction time intervals and input data, respectively. This helps capture the continuous dependency from irregular time intervals and provides time-specific personalized recommendations. Experimental studies on five benchmark datasets demonstrate the superiority and effectiveness of SS4Rec.

LGJun 18, 2024
ABNet: Attention BarrierNet for Safe and Scalable Robot Learning

Wei Xiao, Tsun-Hsuan Wang, Daniela Rus

Safe learning is central to AI-enabled robots where a single failure may lead to catastrophic results. Barrier-based method is one of the dominant approaches for safe robot learning. However, this method is not scalable, hard to train, and tends to generate unstable signals under noisy inputs that are challenging to be deployed for robots. To address these challenges, we propose a novel Attention BarrierNet (ABNet) that is scalable to build larger foundational safe models in an incremental manner. Each head of BarrierNet in the ABNet could learn safe robot control policies from different features and focus on specific part of the observation. In this way, we do not need to one-shotly construct a large model for complex tasks, which significantly facilitates the training of the model while ensuring its stable output. Most importantly, we can still formally prove the safety guarantees of the ABNet. We demonstrate the strength of ABNet in 2D robot obstacle avoidance, safe robot manipulation, and vision-based end-to-end autonomous driving, with results showing much better robustness and guarantees over existing models.

LGMay 31, 2023
SafeDiffuser: Safe Planning with Diffusion Probabilistic Models

Wei Xiao, Tsun-Hsuan Wang, Chuang Gan et al.

Diffusion model-based approaches have shown promise in data-driven planning, but there are no safety guarantees, thus making it hard to be applied for safety-critical applications. To address these challenges, we propose a new method, called SafeDiffuser, to ensure diffusion probabilistic models satisfy specifications by using a class of control barrier functions. The key idea of our approach is to embed the proposed finite-time diffusion invariance into the denoising diffusion procedure, which enables trustworthy diffusion data generation. Moreover, we demonstrate that our finite-time diffusion invariance method through generative models not only maintains generalization performance but also creates robustness in safe data generation. We test our method on a series of safe planning tasks, including maze path generation, legged robot locomotion, and 3D space manipulation, with results showing the advantages of robustness and guarantees over vanilla diffusion models.

MAMay 26, 2023
Trust-Aware Resilient Control and Coordination of Connected and Automated Vehicles

H M Sabbir Ahmad, Ehsan Sabouni, Wei Xiao et al.

We address the security of a network of Connected and Automated Vehicles (CAVs) cooperating to navigate through a conflict area. Adversarial attacks such as Sybil attacks can cause safety violations resulting in collisions and traffic jams. In addition, uncooperative (but not necessarily adversarial) CAVs can also induce similar adversarial effects on the traffic network. We propose a decentralized resilient control and coordination scheme that mitigates the effects of adversarial attacks and uncooperative CAVs by utilizing a trust framework. Our trust-aware scheme can guarantee safe collision free coordination and mitigate traffic jams. Simulation results validate the theoretical guarantee of our proposed scheme, and demonstrate that it can effectively mitigate adversarial effects across different traffic scenarios.

LGNov 22, 2021
BarrierNet: A Safety-Guaranteed Layer for Neural Networks

Wei Xiao, Ramin Hasani, Xiao Li et al.

This paper introduces differentiable higher-order control barrier functions (CBF) that are end-to-end trainable together with learning systems. CBFs are usually overly conservative, while guaranteeing safety. Here, we address their conservativeness by softening their definitions using environmental dependencies without loosing safety guarantees, and embed them into differentiable quadratic programs. These novel safety layers, termed a BarrierNet, can be used in conjunction with any neural network-based controller, and can be trained by gradient descent. BarrierNet allows the safety constraints of a neural controller be adaptable to changing environments. We evaluate them on a series of control problems such as traffic merging and robot navigations in 2D and 3D space, and demonstrate their effectiveness compared to state-of-the-art approaches.

LGOct 19, 2021
Two-stage Voice Application Recommender System for Unhandled Utterances in Intelligent Personal Assistant

Wei Xiao, Qian Hu, Thahir Mohamed et al.

Intelligent personal assistants (IPA) enable voice applications that facilitate people's daily tasks. However, due to the complexity and ambiguity of voice requests, some requests may not be handled properly by the standard natural language understanding (NLU) component. In such cases, a simple reply like "Sorry, I don't know" hurts the user's experience and limits the functionality of IPA. In this paper, we propose a two-stage shortlister-reranker recommender system to match third-party voice applications (skills) to unhandled utterances. In this approach, a skill shortlister is proposed to retrieve candidate skills from the skill catalog by calculating both lexical and semantic similarity between skills and user requests. We also illustrate how to build a new system by using observed data collected from a baseline rule-based system, and how the exposure biases can generate discrepancy between offline and human metrics. Lastly, we present two relabeling methods that can handle the incomplete ground truth, and mitigate exposure bias. We demonstrate the effectiveness of our proposed system through extensive offline experiments. Furthermore, we present online A/B testing results that show a significant boost on user experience satisfaction.

CLOct 16, 2021
Virtual Augmentation Supported Contrastive Learning of Sentence Representations

Dejiao Zhang, Wei Xiao, Henghui Zhu et al.

Despite profound successes, contrastive representation learning relies on carefully designed data augmentations using domain specific knowledge. This challenge is magnified in natural language processing where no general rules exist for data augmentation due to the discrete nature of natural language. We tackle this challenge by presenting a Virtual augmentation Supported Contrastive Learning of sentence representations (VaSCL). Originating from the interpretation that data augmentation essentially constructs the neighborhoods of each training instance, we in turn utilize the neighborhood to generate effective data augmentations. Leveraging the large training batch size of contrastive learning, we approximate the neighborhood of an instance via its K-nearest in-batch neighbors in the representation space. We then define an instance discrimination task regarding this neighborhood and generate the virtual augmentation in an adversarial training manner. We access the performance of VaSCL on a wide range of downstream tasks, and set a new state-of-the-art for unsupervised sentence representation learning.

CLOct 16, 2021
Lifelong Pretraining: Continually Adapting Language Models to Emerging Corpora

Xisen Jin, Dejiao Zhang, Henghui Zhu et al.

Pretrained language models (PTLMs) are typically learned over a large, static corpus and further fine-tuned for various downstream tasks. However, when deployed in the real world, a PTLM-based model must deal with data distributions that deviate from what the PTLM was initially trained on. In this paper, we study a lifelong language model pretraining challenge where a PTLM is continually updated so as to adapt to emerging data. Over a domain-incremental research paper stream and a chronologically-ordered tweet stream, we incrementally pretrain a PTLM with different continual learning algorithms, and keep track of the downstream task performance (after fine-tuning). We evaluate PTLM's ability to adapt to new corpora while retaining learned knowledge in earlier corpora. Our experiments show distillation-based approaches to be most effective in retaining downstream performance in earlier domains. The algorithms also improve knowledge transfer, allowing models to achieve better downstream performance over the latest data, and improve temporal generalization when distribution gaps exist between training and evaluation because of time. We believe our problem formulation, methods, and analysis will inspire future studies towards continual pretraining of language models.

CLSep 12, 2021
Pairwise Supervised Contrastive Learning of Sentence Representations

Dejiao Zhang, Shang-Wen Li, Wei Xiao et al.

Many recent successes in sentence representation learning have been achieved by simply fine-tuning on the Natural Language Inference (NLI) datasets with triplet loss or siamese loss. Nevertheless, they share a common weakness: sentences in a contradiction pair are not necessarily from different semantic categories. Therefore, optimizing the semantic entailment and contradiction reasoning objective alone is inadequate to capture the high-level semantic structure. The drawback is compounded by the fact that the vanilla siamese or triplet losses only learn from individual sentence pairs or triplets, which often suffer from bad local optima. In this paper, we propose PairSupCon, an instance discrimination based approach aiming to bridge semantic entailment and contradiction understanding with high-level categorical concept encoding. We evaluate PairSupCon on various downstream tasks that involve understanding sentence semantics at different granularities. We outperform the previous state-of-the-art method with $10\%$--$13\%$ averaged improvement on eight clustering tasks, and $5\%$--$6\%$ averaged improvement on seven semantic textual similarity (STS) tasks.

ROJul 15, 2021
Rule-based Evaluation and Optimal Control for Autonomous Driving

Wei Xiao, Noushin Mehdipour, Anne Collin et al.

We develop optimal control strategies for autonomous vehicles (AVs) that are required to meet complex specifications imposed as rules of the road (ROTR) and locally specific cultural expectations of reasonable driving behavior. We formulate these specifications as rules, and specify their priorities by constructing a priority structure, called \underline{T}otal \underline{OR}der over e\underline{Q}uivalence classes (TORQ). We propose a recursive framework, in which the satisfaction of the rules in the priority structure are iteratively relaxed in reverse order of priority. Central to this framework is an optimal control problem, where convergence to desired states is achieved using Control Lyapunov Functions (CLFs) and clearance with other road users is enforced through Control Barrier Functions (CBFs). We present offline and online approaches to this problem. In the latter, the AV has limited sensing range that affects the activation of the rules, and the control is generated using a receding horizon (Model Predictive Control, MPC) approach. We also show how the offline method can be used for after-the-fact (offline) pass/fail evaluation of trajectories - a given trajectory is rejected if we can find a controller producing a trajectory that leads to less violation of the rule priority structure. We present case studies with multiple driving scenarios to demonstrate the effectiveness of the algorithms, and to compare the offline and online versions of our proposed framework.

SYMar 29, 2021
Event-Triggered Safety-Critical Control for Systems with Unknown Dynamics

Wei Xiao, Calin Belta, Christos G. Cassandras

This paper addresses the problem of safety-critical control for systems with unknown dynamics. It has been shown that stabilizing affine control systems to desired (sets of) states while optimizing quadratic costs subject to state and control constraints can be reduced to a sequence of quadratic programs (QPs) by using Control Barrier Functions (CBFs) and Control Lyapunov Functions (CLFs). Our recently proposed High Order CBFs (HOCBFs) can accommodate constraints of arbitrary relative degree. One of the main challenges in this approach is obtaining accurate system dynamics, which is especially difficult for systems that require online model identification given limited computational resources and system data. In order to approximate the real unmodelled system dynamics, we define adaptive affine control dynamics which are updated based on the error states obtained by real-time sensor measurements. We define a HOCBF for a safety requirement on the unmodelled system based on the adaptive dynamics and error states, and reformulate the safety-critical control problem as the above mentioned QP. Then, we determine the events required to solve the QP in order to guarantee safety. We also derive a condition that guarantees the satisfaction of the HOCBF constraint between events. We illustrate the effectiveness of the proposed framework on an adaptive cruise control problem and compare it with the classical time-driven approach.

SYFeb 12, 2021
High Order Control Lyapunov-Barrier Functions for Temporal Logic Specifications

Wei Xiao, Calin A. Belta, Christos G. Cassandras

Recent work has shown that stabilizing an affine control system to a desired state while optimizing a quadratic cost subject to state and control constraints can be reduced to a sequence of Quadratic Programs (QPs) by using Control Barrier Functions (CBFs) and Control Lyapunov Functions (CLFs). In our own recent work, we defined High Order CBFs (HOCBFs) for systems and constraints with arbitrary relative degrees. In this paper, in order to accommodate initial states that do not satisfy the state constraints and constraints with arbitrary relative degree, we generalize HOCBFs to High Order Control Lyapunov-Barrier Functions (HOCLBFs). We also show that the proposed HOCLBFs can be used to guarantee the Boolean satisfaction of Signal Temporal Logic (STL) formulae over the state of the system. We illustrate our approach on a safety-critical optimal control problem (OCP) for a unicycle.