Yongqiang Li

CL
h-index 16
4 papers
16 citations
Novelty 56%
AI Score 37

4 Papers

CL · Jan 20, 2024 · Code
Orion-14B: Open-source Multilingual Large Language Models

Du Chen, Yi Huang, Xiaopu Li et al.

In this study, we introduce Orion-14B, a collection of multilingual large language models with 14 billion parameters. We utilize a data scheduling approach to train a foundational model on a diverse corpus of 2.5 trillion tokens, sourced from texts in English, Chinese, Japanese, Korean, and other languages. Additionally, we fine-tuned a series of models tailored for conversational applications and other specific use cases. Our evaluation results demonstrate that Orion-14B achieves state-of-the-art performance across a broad spectrum of tasks. We make the Orion-14B model family and its associated code publicly accessible at https://github.com/OrionStarAI/Orion, aiming to inspire future research and practical applications in the field.

RO · Jan 22, 2024
Efficient and Generalized end-to-end Autonomous Driving System with Latent Deep Reinforcement Learning and Demonstrations

Zuojin Tang, Xiaoyu Chen, Yongqiang Li et al.

An intelligent driving system should dynamically formulate appropriate driving strategies based on the current environment and vehicle status while ensuring system security and reliability. However, methods based on reinforcement learning and imitation learning often suffer from high sample complexity, poor generalization, and low safety. To address these challenges, this paper introduces an efficient and generalized end-to-end autonomous driving system (EGADS) for complex and varied scenarios. The RL agent in our EGADS combines variational inference with normalizing flows, which are independent of distribution assumptions. This combination allows the agent to capture historical information relevant to driving in latent space effectively, thereby significantly reducing sample complexity. Additionally, we enhance safety by formulating robust safety constraints and improve generalization and performance by integrating RL with expert demonstrations. Experimental results demonstrate that, compared to existing methods, EGADS significantly reduces sample complexity, greatly improves safety performance, and exhibits strong generalization capabilities in complex urban scenarios. In addition, we contribute an expert dataset collected from human experts driving with a G29 steering wheel.

LG · Oct 11, 2025
Homomorphic Mappings for Value-Preserving State Aggregation in Markov Decision Processes

Shuo Zhao, Yongqiang Li, Yu Feng et al.

State aggregation aims to reduce the computational complexity of solving Markov Decision Processes (MDPs) while preserving the performance of the original system. A fundamental challenge lies in optimizing policies within the aggregated, or abstract, space such that the performance remains optimal in the ground MDP, a property referred to as "optimal policy equivalence". This paper presents an abstraction framework based on the notion of homomorphism, in which two Markov chains are deemed homomorphic if their value functions exhibit a linear relationship. Within this theoretical framework, we establish a sufficient condition for optimal policy equivalence. We further examine scenarios where the sufficient condition is not met and derive an upper bound on the approximation error and a performance lower bound for the objective function under the ground MDP. We propose Homomorphic Policy Gradient (HPG), which guarantees optimal policy equivalence when the sufficient condition holds, and its extension, Error-Bounded HPG (EBHPG), which balances computational efficiency against the performance loss induced by aggregation. In the experiments, we validate the theoretical results and conduct comparative evaluations against seven algorithms.
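
To make the idea of value-preserving state aggregation concrete, the following is a minimal sketch of classical exact aggregation on a toy MDP. It is not the paper's HPG or EBHPG algorithm. The transition and reward numbers, the mapping `phi`, and the helper names `value_iteration` and `aggregate` are all assumptions chosen for illustration. The toy MDP is built so that states in the same block share rewards and block-level transition probabilities, in which case the optimal values of the ground MDP can be read off the smaller abstract MDP.

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, iters=500):
    """P: (A, S, S) transition tensor, R: (S, A) rewards. Returns state values."""
    V = np.zeros(R.shape[0])
    for _ in range(iters):
        # Q[s, a] = R[s, a] + gamma * sum_t P[a, s, t] * V[t]
        Q = R + gamma * np.einsum("ast,t->sa", P, V)
        V = Q.max(axis=1)
    return V

def aggregate(P, R, phi, n_abstract):
    """Build an abstract MDP by uniformly averaging over the states in each block."""
    S = R.shape[0]
    M = np.zeros((S, n_abstract))
    M[np.arange(S), phi] = 1.0           # membership matrix: state -> block
    W = (M / M.sum(axis=0)).T            # (K, S) uniform weights inside each block
    P_abs = np.stack([W @ P_a @ M for P_a in P])
    R_abs = W @ R
    return P_abs, R_abs

# Hypothetical ground MDP: 4 states, 2 actions; states {0, 1} and {2, 3} form blocks.
P = np.array([
    [[0.5, 0.5, 0.0, 0.0],
     [0.2, 0.8, 0.0, 0.0],
     [0.3, 0.0, 0.7, 0.0],
     [0.0, 0.3, 0.2, 0.5]],
    [[0.1, 0.0, 0.9, 0.0],
     [0.0, 0.1, 0.4, 0.5],
     [0.0, 0.0, 1.0, 0.0],
     [0.0, 0.0, 0.5, 0.5]],
])
R = np.array([[0.0, 1.0], [0.0, 1.0], [2.0, 0.5], [2.0, 0.5]])
phi = np.array([0, 0, 1, 1])

P_abs, R_abs = aggregate(P, R, phi, n_abstract=2)
V_ground = value_iteration(P, R)
V_abs = value_iteration(P_abs, R_abs)
V_lifted = V_abs[phi]                    # abstract values mapped back to ground states
```

In this constructed case `V_lifted` matches `V_ground`. When the blocks are not exactly equivalent, the two generally differ, which is the regime the abstract's error bounds address.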

CV · Dec 28, 2024
Multi-Modality Driven LoRA for Adverse Condition Depth Estimation

Guanglei Yang, Rui Tian, Yongqiang Zhang et al.

The autonomous driving community is increasingly focused on addressing corner case problems, particularly those related to ensuring driving safety under adverse conditions (e.g., nighttime, fog, rain). To this end, the task of Adverse Condition Depth Estimation (ACDE) has gained significant attention. Previous approaches in ACDE have primarily relied on generative models, which necessitate additional target images to convert the sunny condition into adverse weather, or learnable parameters for feature augmentation to bridge domain gaps, resulting in increased model complexity and tuning efforts. Furthermore, unlike CLIP-based methods where textual and visual features have been pre-aligned, depth estimation models lack sufficient alignment between multimodal features, hindering coherent understanding under adverse conditions. To address these limitations, we propose Multi-Modality Driven LoRA (MMD-LoRA), which leverages low-rank adaptation matrices for efficient fine-tuning from the source domain to the target domain. It consists of two core components: Prompt Driven Domain Alignment (PDDA) and Visual-Text Consistent Contrastive Learning (VTCCL). During PDDA, the image encoder with MMD-LoRA generates target-domain visual representations, supervised by an alignment loss that requires the source-target difference to be equal in the language and image spaces. Meanwhile, VTCCL bridges the gap between textual features from CLIP and visual features from the diffusion model, pushing apart representations (visual and textual) of different weather conditions and bringing together similar ones. Through extensive experiments, the proposed method achieves state-of-the-art performance on the nuScenes and Oxford RobotCar datasets, underscoring its robustness and efficiency in adapting to varied adverse environments.
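
As a rough illustration of the low-rank adaptation matrices the abstract refers to, the sketch below shows a generic LoRA-style linear layer in NumPy. It is not the MMD-LoRA implementation: the class name `LoRALinear`, the layer sizes, the rank, and the scaling factor `alpha` are assumptions for illustration, and the PDDA and VTCCL losses are not modeled. The pretrained weight stays frozen and only the two small matrices `A` and `B` would be trained.

```python
import numpy as np

class LoRALinear:
    """Frozen weight W plus a low-rank update (alpha / rank) * B @ A."""

    def __init__(self, weight, rank=4, alpha=8.0, seed=0):
        rng = np.random.default_rng(seed)
        out_dim, in_dim = weight.shape
        self.weight = weight                                  # frozen pretrained weight
        self.A = rng.normal(0.0, 0.01, size=(rank, in_dim))   # trainable down-projection
        self.B = np.zeros((out_dim, rank))                    # trainable, zero init: update starts at 0
        self.scale = alpha / rank

    def forward(self, x):
        # Base path plus low-rank path; x has shape (batch, in_dim).
        return x @ self.weight.T + self.scale * (x @ self.A.T) @ self.B.T

    def trainable_params(self):
        return self.A.size + self.B.size

# Hypothetical sizes: a 128 -> 64 layer adapted with rank 4.
rng = np.random.default_rng(1)
W = rng.normal(size=(64, 128))
x = rng.normal(size=(2, 128))
layer = LoRALinear(W, rank=4)
y = layer.forward(x)
```

Because `B` starts at zero, the adapted layer initially reproduces the frozen layer exactly, and the number of trainable values (`layer.trainable_params()`) is much smaller than `W.size`.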