Long Chen

h-index: 54

14 papers

11,525 citations

Novelty: 43%

AI Score: 50

Ranked #20,740 of 194,257 authors (top 11%); #7,488 in CV (top 13%)

14 Papers

11.5 HC Jul 8

Clinical Translation of Brain-Computer Interface in China: A Landscape Analysis of Investigator-Initiated Trials, Registered Clinical Trials, and Regulatory Approval

Long Chen, Wanyi Qing, Lifen Mo et al.

Neurological injury affects hundreds of millions of people worldwide, yet the loss of motor or communication functions resulting from stroke, spinal cord injury, and neurodegenerative disease remains largely irreversible with existing therapies. Brain-computer interfaces (BCIs) offer a promising pathway for restoring these functions by decoding neural activity into commands that control an external device. Here, we present the first quantitative analysis of China's BCI translational ecosystem, integrating evidence from three pillars: investigator-initiated trials (IITs), registered clinical trials, and regulatory-approved products. We analyzed 134 clinical trials from the Chinese Clinical Trial Registry (ChiCTR), 26 IITs, and five BCI-related products approved by the National Medical Products Administration as of June 2026. Results demonstrate that clinical trial registration has increased rapidly since 2020, with research centers concentrated primarily in Guangdong, Shanghai, and Jiangsu. Non-invasive systems predominated, accounting for 79.1% of registered studies, with stroke rehabilitation as the leading indication (65.0%). As of June 2026, five BCI-related products received regulatory approvals, including the world's first approved semi-invasive implantable BCI, an invasive closed-loop deep brain stimulation system with real-time local field potential recording, and three non-invasive EEG-based rehabilitation systems. Collectively, these findings characterize a rapidly expanding BCI translational pipeline in China, spanning from early clinical research to regulatory approval. However, long-term implant stability, standardization of clinical infrastructure and workflows, and generalizability of decoding algorithms remain critical barriers to widespread clinical adoption. Addressing these challenges will be essential for integrating BCI technologies into routine clinical practice.

1.5 CV Jan 23 Code

Expert Knowledge-Guided Decision Calibration for Accurate Fine-Grained Tree Species Classification

Chen Long, Dian Chen, Ruifei Ding et al.

Accurate fine-grained tree species classification is critical for forest inventory and biodiversity monitoring. Existing methods predominantly focus on designing complex architectures to fit local data distributions. However, they often overlook the long-tailed distributions and high inter-class similarity inherent in limited data, and therefore struggle to distinguish few-shot or easily confused categories. When people share knowledge, they actively seek expert assistance to overcome the limits of their own local perspective. Inspired by this, we introduce an external "Domain Expert" and propose an Expert Knowledge-Guided Classification Decision Calibration Network (EKDC-Net) to overcome these challenges. Our framework addresses two core issues: expert knowledge extraction and utilization. Specifically, we first develop a Local Prior Guided Knowledge Extraction Module (LPKEM). By leveraging Class Activation Map (CAM) analysis, LPKEM guides the domain expert to focus exclusively on discriminative features essential for classification. Subsequently, to effectively integrate this knowledge, we design an Uncertainty-Guided Decision Calibration Module (UDCM). This module dynamically corrects the local model's decisions by considering both overall category uncertainty and instance-level prediction uncertainty. Furthermore, to address the scarce diversity of current benchmarks, we present CU-Tree102, a large-scale classification dataset covering 102 tree species. Experiments on three benchmark datasets demonstrate that our approach achieves state-of-the-art performance. Crucially, as a lightweight plug-and-play module, EKDC-Net improves backbone accuracy by 6.42% and precision by 11.46% using only 0.08M additional learnable parameters. The dataset, code, and pre-trained models are available at https://github.com/WHU-USI3DV/TreeCLS.

1.2 NA Apr 14, 2018

Programming of Finite Element Methods in MATLAB

Long Chen

We discuss how to implement the linear finite element method for solving the Poisson equation. We begin with the data structure to represent the triangulation and boundary conditions, introduce the sparse matrix, and then discuss the assembling process. We pay special attention to an efficient programming style using sparse matrices in MATLAB.
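
The assembling process described above can be sketched in a few lines. The example below is written in Python rather than MATLAB and is only an illustration, not the paper's code: the names `local_stiffness` and `assemble` and the two-triangle mesh are assumptions made for this sketch. It collects (row, column, value) triplets element by element and then sums duplicate entries, which is the behavior MATLAB's `sparse(i, j, s)` provides.

```python
from collections import defaultdict


def local_stiffness(p0, p1, p2):
    """3x3 stiffness matrix of the linear (P1) element on one triangle."""
    # det is twice the signed area of the triangle.
    det = (p1[0] - p0[0]) * (p2[1] - p0[1]) - (p2[0] - p0[0]) * (p1[1] - p0[1])
    area = abs(det) / 2.0
    pts = (p0, p1, p2)
    grads = []
    for i in range(3):
        # The gradient of hat function i is constant on the triangle and is
        # obtained by rotating the edge opposite to vertex i by 90 degrees.
        a, b = pts[(i + 1) % 3], pts[(i + 2) % 3]
        grads.append(((a[1] - b[1]) / det, (b[0] - a[0]) / det))
    return [[area * (gi[0] * gj[0] + gi[1] * gj[1]) for gj in grads]
            for gi in grads]


def assemble(nodes, elems):
    """Assemble the global stiffness matrix as a {(i, j): value} dictionary."""
    rows, cols, vals = [], [], []
    for tri in elems:
        k_loc = local_stiffness(*(nodes[v] for v in tri))
        for a in range(3):
            for b in range(3):
                rows.append(tri[a])
                cols.append(tri[b])
                vals.append(k_loc[a][b])
    # Summing duplicate (i, j) pairs mimics what MATLAB's sparse(i, j, s) does.
    matrix = defaultdict(float)
    for i, j, v in zip(rows, cols, vals):
        matrix[(i, j)] += v
    return dict(matrix)


# Hypothetical mesh: the unit square split into two triangles.
nodes = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
elems = [(0, 1, 2), (0, 2, 3)]
K = assemble(nodes, elems)
```

Each row of `K` sums to zero because constant functions lie in the kernel of the stiffness matrix before boundary conditions are imposed, which is a convenient sanity check on any assembly routine.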

1.2 NA Apr 12, 2018

Convergence Analysis for A Class of Iterative Methods for Solving Saddle Point Systems

Long Chen, Yongke Wu

A convergence analysis of a nested iterative scheme proposed by Bank, Welfert and Yserentant (BWY) ([Numer. Math., 56: 645-666, 1990]) for solving saddle point systems is presented. It is shown that this scheme converges under weaker conditions: the contraction rate for solving the $(1,1)$ block matrix is bounded by $(\sqrt{5}-1)/2$. A similar convergence result is also obtained for a class of inexact Uzawa methods with an even weaker contraction bound of $\sqrt{2}/2$. The preconditioned generalized minimal residual method using the BWY method as a preconditioner is shown to converge under realistic assumptions.
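
To make the setting concrete, here is a minimal sketch of an inexact Uzawa iteration on a toy saddle point system. It is not the scheme analyzed in the paper: the diagonal matrix, the scalar multiplier, the step size `omega`, and the artificial inexact solver (a scaled exact inverse with contraction rate 0.2, well below the bounds quoted above) are all assumptions chosen so the example stays self-contained.

```python
def inexact_uzawa(a_diag, b_row, f, g, rate=0.2, omega=1.0, steps=100):
    """Inexact Uzawa iteration for [[A, B^T], [B, 0]] [u; p] = [f; g].

    A is diagonal (a_diag) and B is a single row (b_row), so p is a scalar.
    The approximate solver is (1 - rate) * A^{-1}; its error operator
    I - (1 - rate) * A^{-1} A equals rate * I, so `rate` is the contraction
    rate of the inner solve for the (1,1) block.
    """
    n = len(a_diag)
    u = [0.0] * n
    p = 0.0
    for _ in range(steps):
        # Residual of the first block row: f - A u - B^T p.
        r = [f[i] - a_diag[i] * u[i] - b_row[i] * p for i in range(n)]
        # Inexact solve for the (1,1) block.
        u = [u[i] + (1.0 - rate) * r[i] / a_diag[i] for i in range(n)]
        # Multiplier update driven by the constraint residual B u - g.
        p += omega * (sum(b_row[i] * u[i] for i in range(n)) - g)
    return u, p


# Toy problem (assumed for illustration): A = diag(2, 4), B = [1, 1],
# f = (2, 4), g = 1. Solving by hand gives u = (1/3, 2/3) and p = 4/3.
u, p = inexact_uzawa([2.0, 4.0], [1.0, 1.0], [2.0, 4.0], 1.0)
```

Raising `rate` toward 1 makes the inner solve cruder and slows the outer iteration, which is the trade-off that contraction bounds of this kind quantify.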

8.5 AI Dec 24, 2024 Code

Property Enhanced Instruction Tuning for Multi-task Molecule Generation with Large Language Models

Xuan Lin, Long Chen, Yile Wang et al.

Large language models (LLMs) are widely applied in various natural language processing tasks such as question answering and machine translation. However, due to the lack of labeled data and the difficulty of manual annotation for biochemical properties, their performance on molecule generation tasks is still limited, especially for tasks involving multi-property constraints. In this work, we present a two-step framework, PEIT (Property Enhanced Instruction Tuning), to improve LLMs for molecule-related tasks. In the first step, we use textual descriptions, SMILES, and biochemical properties as multimodal inputs to pre-train a model called PEIT-GEN, which aligns multimodal representations to synthesize instruction data. In the second step, we fine-tune existing open-source LLMs with the synthesized data. The resulting PEIT-LLM can handle molecule captioning, text-based molecule generation, molecular property prediction, and our newly proposed multi-constraint molecule generation tasks. Experimental results show that our pre-trained PEIT-GEN outperforms MolT5 and BioT5 in molecule captioning, demonstrating that the modalities align well between textual descriptions, structures, and biochemical properties. Furthermore, PEIT-LLM shows promising improvements in multi-task molecule generation, demonstrating the scalability of the PEIT framework for various molecular tasks. We release the code, constructed instruction data, and model checkpoints at https://github.com/chenlong164/PEIT.

42.2 CV Jun 9, 2025

ReCogDrive: A Reinforced Cognitive Framework for End-to-End Autonomous Driving

Yongkang Li, Kaixin Xiong, Xiangyu Guo et al.

Recent studies have explored leveraging the world knowledge and cognitive capabilities of Vision-Language Models (VLMs) to address the long-tail problem in end-to-end autonomous driving. However, existing methods typically formulate trajectory planning as a language modeling task, where physical actions are output in the language space, potentially leading to issues such as format-violating outputs, infeasible actions, and slow inference speeds. In this paper, we propose ReCogDrive, a novel Reinforced Cognitive framework for end-to-end autonomous Driving, unifying driving understanding and planning by integrating an autoregressive model with a diffusion planner. First, to instill human driving cognition into the VLM, we introduce a hierarchical data pipeline that mimics the sequential cognitive process of human drivers through three stages: generation, refinement, and quality control. Building on this cognitive foundation, we then address the language-action mismatch by injecting the VLM's learned driving priors into a diffusion planner to efficiently generate continuous and stable trajectories. Furthermore, to enhance driving safety and reduce collisions, we introduce a Diffusion Group Relative Policy Optimization (DiffGRPO) stage, reinforcing the planner for enhanced safety and comfort. Extensive experiments on the NAVSIM and Bench2Drive benchmarks demonstrate that ReCogDrive achieves state-of-the-art performance. Additionally, qualitative results across diverse driving scenarios and DriveBench highlight the model's scene comprehension. All code, model weights, and datasets will be made publicly available to facilitate subsequent research.
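
The abstract does not spell out how DiffGRPO is computed. As a rough illustration of the "group relative" idea in Group Relative Policy Optimization generally, the sketch below normalizes the rewards of a group of sampled trajectories by the group mean and standard deviation. The function name, the example rewards, and the use of the population standard deviation are assumptions; the paper's diffusion-specific details are not reproduced here.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the other samples in the same group."""
    mu = mean(rewards)
    # Population standard deviation is an assumption; implementations differ.
    sigma = pstdev(rewards)
    # eps guards against division by zero when all rewards are equal.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four trajectories sampled for the same scene,
# for example combined safety and comfort scores from a simulator.
rewards = [0.9, 0.4, 0.7, 0.2]
advantages = group_relative_advantages(rewards)
```

Trajectories scoring above the group average receive positive advantages and are reinforced, while those below it are discouraged, so no separate value network is needed to provide a baseline.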

15.5 CV Sep 3, 2025

VQualA 2025 Challenge on Engagement Prediction for Short Videos: Methods and Results

Dasong Li, Sizhuo Ma, Hang Hua et al.

This paper presents an overview of the VQualA 2025 Challenge on Engagement Prediction for Short Videos, held in conjunction with ICCV 2025. The challenge focuses on understanding and modeling the popularity of user-generated content (UGC) short videos on social media platforms. To support this goal, the challenge uses a new short-form UGC dataset featuring engagement metrics derived from real-world user interactions. The objective of the challenge is to promote robust modeling strategies that capture the complex factors influencing user engagement. Participants explored a variety of multi-modal features, including visual content, audio, and metadata provided by creators. The challenge attracted 97 participants and received 15 valid test submissions, contributing significantly to progress in short-form UGC video engagement prediction.

6.1 CL Nov 13, 2024

RED: Unleashing Token-Level Rewards from Holistic Feedback via Reward Redistribution

Jiahui Li, Lin Li, Tai-wei Chang et al.

Reinforcement learning from human feedback (RLHF) offers a promising approach to aligning large language models (LLMs) with human preferences. Typically, a reward model is trained or supplied to act as a proxy for humans in evaluating generated responses during the reinforcement training phase. However, current reward models operate as sequence-to-one models, allocating a single, sparse, and delayed reward to an entire output sequence. This approach may overlook the significant contributions of individual tokens toward the desired outcome. To this end, we propose a more fine-grained, token-level guidance approach for RL training. Specifically, we introduce RED, a novel reward redistribution method that evaluates and assigns specific credit to each token using an off-the-shelf reward model. Utilizing these fine-grained rewards enhances the model's understanding of language nuances, leading to more precise performance improvements. Notably, our method does not require modifying the reward model or introducing additional training steps, thereby incurring minimal computational costs. Experimental results across diverse datasets and tasks demonstrate the superiority of our approach.
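
One plausible way to turn a sequence-level score into token-level credit, shown here only as a sketch, is to score every prefix of the response with the same reward model and give each token the change in score it causes. Whether this matches RED's exact formulation is an assumption; `redistribute` and `toy_score` are hypothetical names, and `toy_score` merely stands in for an off-the-shelf reward model.

```python
def redistribute(score_fn, prompt, tokens):
    """Assign each token the increase in score caused by appending it."""
    rewards = []
    prev = score_fn(prompt, [])
    for t in range(1, len(tokens) + 1):
        cur = score_fn(prompt, tokens[:t])
        rewards.append(cur - prev)
        prev = cur
    return rewards


def toy_score(prompt, partial):
    """Hypothetical stand-in for a reward model scoring a partial response."""
    polite = {"thanks", "please", "sorry"}
    # Polite words earn credit; every token pays a small length penalty.
    return sum(1.0 for tok in partial if tok in polite) - 0.1 * len(partial)


prompt = "Reply to the customer."
tokens = ["sorry", "for", "the", "delay", "thanks"]
token_rewards = redistribute(toy_score, prompt, tokens)
```

The per-token rewards telescope: their sum equals the score of the full response minus the score of the empty response, so the total signal is preserved while its timing becomes dense.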

7.9 LG Dec 18, 2024

Learning Causal Transition Matrix for Instance-dependent Label Noise

Jiahui Li, Tai-Wei Chang, Kun Kuang et al.

Noisy labels are both inevitable and problematic in machine learning methods, as they negatively impact models' generalization ability by causing overfitting. In the context of learning with noise, the transition matrix plays a crucial role in the design of statistically consistent algorithms. However, the transition matrix is often considered unidentifiable. One strand of methods typically addresses this problem by assuming that the transition matrix is instance-independent; that is, the probability of mislabeling a particular instance is not influenced by its characteristics or attributes. This assumption is clearly invalid in complex real-world scenarios. To better understand the transition relationship and relax this assumption, we propose to study the data generation process of noisy labels from a causal perspective. We discover that an unobservable latent variable can affect either the instance itself, the label annotation procedure, or both, which complicates the identification of the transition matrix. To address various scenarios, we have unified these observations within a new causal graph. In this graph, the input instance is divided into a noise-resistant component and a noise-sensitive component based on whether they are affected by the latent variable. These two components contribute to identifying the "causal transition matrix", which approximates the true transition matrix with a theoretical guarantee. In line with this, we have designed a novel training framework that explicitly models this causal relationship and, as a result, achieves a more accurate model for inferring the clean label.
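
For readers unfamiliar with the transition matrix, the sketch below shows the simpler instance-independent case that the paper sets out to relax, not the causal model it proposes. A row-stochastic matrix maps clean class probabilities to noisy label probabilities. The matrix entries and the class probabilities are made-up values for illustration.

```python
def noisy_posterior(clean_probs, transition):
    """Map P(clean = i | x) to P(noisy = j | x) using T[i][j] = P(noisy = j | clean = i)."""
    n = len(clean_probs)
    return [sum(clean_probs[i] * transition[i][j] for i in range(n))
            for j in range(n)]


# Hypothetical 3-class transition matrix; each row sums to 1.
T = [
    [0.8, 0.2, 0.0],
    [0.1, 0.9, 0.0],
    [0.0, 0.3, 0.7],
]
clean = [0.6, 0.3, 0.1]
noisy = noisy_posterior(clean, T)
```

In the instance-independent setting the same `T` is applied to every input. The instance-dependent setting studied in the paper corresponds to letting the matrix vary with the input, which is what makes it hard to identify.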

10.2 CV Oct 19, 2025

Segmentation as A Plug-and-Play Capability for Frozen Multimodal LLMs

Jiazhen Liu, Long Chen

Integrating diverse visual capabilities into a unified model is a significant trend in Multimodal Large Language Models (MLLMs). Among these, the inclusion of segmentation poses a distinct set of challenges. To equip MLLMs with pixel-level segmentation abilities, prevailing methods require finetuning the model to produce specific outputs compatible with a mask decoder. This process typically alters the model's output space and compromises its intrinsic generalization, which undermines the goal of building a unified model. We introduce LENS (Leveraging kEypoiNts for MLLMs' Segmentation), a novel plug-and-play solution. LENS attaches a lightweight, trainable head to a completely frozen MLLM. By refining the spatial cues embedded in attention maps, LENS extracts keypoints and describes them into point-wise features directly compatible with the mask decoder. Extensive experiments validate our approach: LENS achieves segmentation performance competitive with or superior to that of retraining-based methods. Crucially, it does so while fully preserving the MLLM's generalization capabilities, which are significantly degraded by finetuning approaches. As such, the attachable design of LENS establishes an efficient and powerful paradigm for extending MLLMs, paving the way for truly multi-talented, unified models.

13.0 LG Jul 1, 2025

Foundation Models for Clinical Records at Health System Scale

Haresh Rengaraj Rajamohan, Xiang Gao, Weicheng Zhu et al.

Large-scale pretraining has transformed modeling of language and other data types, but its potential remains underexplored in healthcare with structured electronic health records (EHRs). We present a novel generative pretraining strategy for sequential EHR data using next-visit event prediction. Our model learns to autoregressively generate various tokenized clinical events for the next visit based on patient history and inherently handles the joint prediction of heterogeneous data types. Additionally, we introduce regularization on predicting repeated events and highlight a key pitfall in EHR-based foundation model evaluations: repeated event tokens can inflate performance metrics when new onsets are not distinguished from subsequent occurrences. Our model is evaluated via zero-shot prediction for forecasting dementia and knee osteoarthritis incidence within 2 and 5 years, and the model performance rivals a fully fine-tuned masked pretrained Transformer baseline, demonstrating that our approach captures complex clinical dependencies without requiring costly task-specific fine-tuning.
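
The evaluation pitfall described above can be illustrated with a small sketch. It separates next-visit events into new onsets and repeats of earlier events, then compares recall on each subset. The function names and event labels are hypothetical, and the "model" is a trivial copy-forward baseline rather than anything from the paper.

```python
def split_onsets(history, next_visit):
    """Split next-visit events into new onsets and repeats of earlier events."""
    seen = set().union(*history)
    new = sorted(e for e in next_visit if e not in seen)
    repeated = sorted(e for e in next_visit if e in seen)
    return new, repeated


def recall(predicted, targets):
    """Fraction of target events that were predicted; None if there are no targets."""
    if not targets:
        return None
    return sum(1 for t in targets if t in predicted) / len(targets)


# Hypothetical patient: two past visits and one visit to be predicted.
history = [{"hypertension"}, {"hypertension", "diabetes"}]
next_visit = {"hypertension", "diabetes", "dementia"}

# A copy-forward baseline that simply repeats everything seen so far.
predicted = set().union(*history)

new, repeated = split_onsets(history, next_visit)
overall_recall = recall(predicted, next_visit)
new_onset_recall = recall(predicted, new)
```

Here the copy-forward baseline reaches an overall recall of two thirds while detecting none of the new onsets, which is why reporting the two subsets separately gives a more honest picture.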

11.1 HC Jul 27, 2020

The Adaptability and Challenges of Autonomous Vehicles to Pedestrians in Urban China

Ke Wang, Gang Li, Junlan Chen et al.

China is the world's largest automotive market and has ambitious plans for autonomous vehicle (AV) development. Pedestrian safety, one of the key goals of AVs, is an important issue in China. Despite the rapid development of driverless technologies in recent years, there is a lack of research on the adaptability of AVs to pedestrians. To fill this gap, this study discusses the adaptability of current driverless technologies to urban pedestrians in China by reviewing the latest research. The paper first analyzes typical Chinese pedestrian behaviors and summarizes pedestrians' safety demands for AVs from articles and open databases; these serve as the evaluation criteria. Then, the corresponding driverless technologies are carefully reviewed. Finally, the adaptability is assessed by combining the above analyses. Our review finds that autonomous vehicles have trouble in environments with occluded pedestrians and that Chinese pedestrians do not accept AVs well. Further exploration is needed on standard human-machine interaction, avoidance of interaction information overload, occluded pedestrian detection, and nation-based receptivity research. The conclusions should help motor corporations and driverless car researchers pay more attention to the complexity of the Chinese pedestrian environment, transportation experts protect pedestrian safety in the context of AVs, and policymakers consider new pedestrian policies for the arrival of driverless cars.

8.1 CV Sep 20, 2019

Learning Lightweight Pedestrian Detector with Hierarchical Knowledge Distillation

Rui Chen, Haizhou Ai, Chong Shang et al.

It remains very challenging to build a pedestrian detection system for real-world applications, which demand both accuracy and speed. This work presents a novel hierarchical knowledge distillation framework to learn a lightweight pedestrian detector, which significantly reduces the computational cost while maintaining high accuracy. Following the 'teacher-student' paradigm, in which a stronger, deeper neural network teaches a lightweight network to learn better representations, we explore multiple knowledge distillation architectures and reframe this approach as a unified, hierarchical distillation framework. In particular, the proposed distillation is performed at multiple hierarchies and multiple stages of a modern detector, which enables the student detector to learn both low-level details and high-level abstractions simultaneously. Experimental results show that a student model trained with our framework, with a 6-fold reduction in the number of parameters, still achieves performance competitive with the teacher model on the widely used pedestrian detection benchmark.
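
As a rough illustration of distilling at several levels at once, the sketch below combines a feature-matching term at each stage with a temperature-softened classification term. It is a generic construction, not the paper's loss: the dictionary layout, the mean-squared-error feature term, the weights, and the assumption that student and teacher features already have matching shapes are all choices made for this example.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with a temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions with q > 0 wherever p > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def mse(a, b):
    """Mean squared error between two equally sized feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def hierarchical_distillation_loss(student, teacher, temperature=2.0,
                                   feature_weight=1.0):
    """Sum of per-stage feature mimicking and a soft-label term on the logits."""
    # Low-level and mid-level guidance: match features stage by stage.
    feat = sum(mse(s, t)
               for s, t in zip(student["features"], teacher["features"]))
    # High-level guidance: match softened class distributions.
    soft = kl_divergence(softmax(teacher["logits"], temperature),
                         softmax(student["logits"], temperature))
    # The T^2 factor keeps the soft-label gradient scale comparable across T.
    return feature_weight * feat + temperature ** 2 * soft


# Hypothetical two-stage features and pedestrian/background logits.
teacher = {"features": [[1.0, 2.0], [0.5, -0.5, 1.5]], "logits": [2.0, -1.0]}
student = {"features": [[0.8, 2.1], [0.4, -0.2, 1.0]], "logits": [1.0, 0.0]}
loss = hierarchical_distillation_loss(student, teacher)
```

In a real detector the student's feature maps are usually narrower than the teacher's, so an adapter layer would be needed before the feature term can be computed; that step is omitted here.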

5.2 CV Mar 14, 2018

Self-Supervised Monocular Image Depth Learning and Confidence Estimation

Long Chen, Wen Tang, Nigel John

Convolutional Neural Networks (CNNs) need large amounts of data with ground truth annotation, which is a challenging problem that has limited the development and fast deployment of CNNs for many computer vision tasks. We propose a novel framework for depth estimation from monocular images with corresponding confidence in a self-supervised manner. A fully differentiable patch-based cost function is proposed by using the Zero-Mean Normalized Cross Correlation (ZNCC) that takes multi-scale patches as a matching strategy. This approach greatly increases the accuracy and robustness of the depth learning. In addition, the proposed patch-based cost function can provide a 0 to 1 confidence, which is then used to supervise the training of a parallel network for confidence map learning and estimation. Evaluation on the KITTI dataset shows that our method outperforms the state-of-the-art results.
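
The ZNCC score itself is simple to compute. The sketch below evaluates it for two flattened patches and maps the result from [-1, 1] to [0, 1]. The linear mapping in `confidence` is an assumption for illustration, since the abstract does not say how the 0 to 1 confidence is derived, and the patch values are made up.

```python
import math


def zncc(patch_a, patch_b, eps=1e-12):
    """Zero-Mean Normalized Cross Correlation of two equally sized patches."""
    n = len(patch_a)
    mean_a = sum(patch_a) / n
    mean_b = sum(patch_b) / n
    # Subtracting the mean removes brightness offsets.
    da = [a - mean_a for a in patch_a]
    db = [b - mean_b for b in patch_b]
    num = sum(x * y for x, y in zip(da, db))
    # Dividing by the norms removes contrast scaling.
    den = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return num / (den + eps)


def confidence(score):
    """Assumed linear map from a ZNCC score in [-1, 1] to [0, 1]."""
    return (score + 1.0) / 2.0


# Hypothetical flattened 2x2 patches.
patch = [1.0, 2.0, 3.0, 4.0]
brighter = [2.0 * v + 5.0 for v in patch]   # same structure, new gain and offset
inverted = [-v for v in patch]              # opposite structure

same_score = zncc(patch, brighter)
opposite_score = zncc(patch, inverted)
```

Because the score ignores gain and offset, `patch` and `brighter` match almost perfectly, which is the property that makes ZNCC robust to illumination changes between views.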