Jianhua Tang

LG
h-index8
9papers
33citations
Novelty46%
AI Score53

9 Papers

ROAug 19, 2024
Edge-Cloud Collaborative Motion Planning for Autonomous Driving with Large Language Models

Jiao Chen, Suyan Dai, Fangfang Chen et al.

Integrating large language models (LLMs) into autonomous driving enhances personalization and adaptability in open-world scenarios. However, traditional edge computing models still face significant challenges in processing complex driving data, particularly regarding real-time performance and system efficiency. To address these challenges, this study introduces EC-Drive, a novel edge-cloud collaborative autonomous driving system with data drift detection capabilities. EC-Drive utilizes drift detection algorithms to selectively upload critical data, including new obstacles and traffic pattern changes, to the cloud for processing by GPT-4, while routine data is efficiently managed by smaller LLMs on edge devices. This approach not only reduces inference latency but also improves system efficiency by optimizing communication resource use. Experimental validation confirms the system's robust processing capabilities and practical applicability in real-world driving conditions, demonstrating the effectiveness of this edge-cloud collaboration framework. Our data and system demonstration will be released at https://sites.google.com/view/ec-drive.

LGSep 2, 2024
Towards General Industrial Intelligence: A Survey of Continual Large Models in Industrial IoT

Jiao Chen, Jiayi He, Fangfang Chen et al.

Industrial AI is transitioning from traditional deep learning models to large-scale transformer-based architectures, with the Industrial Internet of Things (IIoT) playing a pivotal role. IIoT evolves from a simple data pipeline to an intelligent infrastructure, enabling and enhancing these advanced AI systems. This survey explores the integration of IIoT with large models (LMs) and their potential applications in industrial environments. We focus on four primary types of industrial LMs: language-based, vision-based, time-series, and multimodal models. The lifecycle of LMs is segmented into four critical phases: data foundation, model training, model connectivity, and continuous evolution. First, we analyze how IIoT provides abundant and diverse data resources, supporting the training and fine-tuning of LMs. Second, we discuss how IIoT offers an efficient training infrastructure in low-latency and bandwidth-optimized environments. Third, we highlight the deployment advantages of LMs within IIoT, emphasizing IIoT's role as a connectivity nexus fostering emergent intelligence through modular design, dynamic routing, and model merging to enhance system scalability and adaptability. Finally, we demonstrate how IIoT supports continual learning mechanisms, enabling LMs to adapt to dynamic industrial conditions and ensure long-term effectiveness. This paper underscores IIoT's critical role in the evolution of industrial intelligence with large models, offering a theoretical framework and actionable insights for future research.

NIApr 29Code
SWE-Bench 5G: Benchmarking AI Coding Agents on Telecom Network Engineering Tasks

Jiao Chen, Jianhua Tang, Xiaotong Yang et al.

AI coding agents demonstrate strong performance on general-purpose software benchmarks. However, their ability to handle 5G network engineering tasks remains unexplored. We propose SWE-Bench~5G, the first benchmark designed to investigate whether AI coding agents can resolve real-world bugs in 5G core network software. The benchmark collects task instances from three open-source 5G projects, packages each as a self-contained Docker environment with automated fail-to-pass tests, and provides a dual test strategy tailored to the complex runtime dependencies of telecom code. In addition, for instances whose original issues reference 3GPP specification clauses, we construct concise specification context documents, enabling controlled evaluation of whether domain knowledge improves agent performance. Experiments on four LLMs reveal that all models diagnose bugs at rates exceeding 91\%, yet resolve rates remain between 10\% and 30\%, suggesting that both iterative code editing capability and domain knowledge play important roles. The specification injection experiment further confirms that 3GPP excerpts improve resolve rates on specification-dependent bugs, while the gains on generic defensive checks remain limited, indicating that the effect of domain knowledge is conditional on bug type.

NIMar 31Code
6GAgentGym: Tool Use, Data Synthesis, and Agentic Learning for Network Management

Jiao Chen, Jianhua Tang, Xiaotong Yang et al.

Autonomous 6G network management requires agents that can execute tools, observe the resulting state changes, and adapt their decisions accordingly. Existing benchmarks based on static questions or scripted episode replay, however, do not support such closed-loop interaction, limiting agents to passive evaluation without the ability to learn from environmental feedback. This paper presents 6GAgentGym to provide closed-loop capability. The framework provides an interactive environment with 42 typed tools whose effect classification distinguishes read-only observation from state-mutating configuration, backed by a learned Experiment Model calibrated on NS-3 simulation data. 6G-Forge bootstraps closed-loop training trajectories from NS-3 seeds via iterative Self-Instruct generation with execution verification against the Experiment Model. Supervised fine-tuning on the resulting corpus followed by reinforcement learning with online closed-loop interaction enables an 8B open-source model to achieve comparable overall success rate to GPT-5 on the accompanying 6GAgentBench, with stronger performance on long-horizon tasks. Together, these components provide a viable path toward autonomous, closed-loop network management.

CVNov 13, 2025
AdaptFly: Prompt-Guided Adaptation of Foundation Models for Low-Altitude UAV Networks

Jiao Chen, Haoyi Wang, Jianhua Tang et al.

Low-altitude Unmanned Aerial Vehicle (UAV) networks rely on robust semantic segmentation as a foundational enabler for distributed sensing-communication-control co-design across heterogeneous agents within the network. However, segmentation foundation models deteriorate quickly under weather, lighting, and viewpoint drift. Resource-limited UAVs cannot run gradient-based test-time adaptation, while resource-massive UAVs adapt independently, wasting shared experience. To address these challenges, we propose AdaptFly, a prompt-guided test-time adaptation framework that adjusts segmentation models without weight updates. AdaptFly features two complementary adaptation modes. For resource-limited UAVs, it employs lightweight token-prompt retrieval from a shared global memory. For resource-massive UAVs, it uses gradient-free sparse visual prompt optimization via Covariance Matrix Adaptation Evolution Strategy. An activation-statistic detector triggers adaptation, while cross-UAV knowledge pool consolidates prompt knowledge and enables fleet-wide collaboration with negligible bandwidth overhead. Extensive experiments on UAVid and VDD benchmarks, along with real-world UAV deployments under diverse weather conditions, demonstrate that AdaptFly significantly improves segmentation accuracy and robustness over static models and state-of-the-art TTA baselines. The results highlight a practical path to resilient, communication-efficient perception in the emerging low-altitude economy.

LGNov 12, 2025
Probing then Editing: A Push-Pull Framework for Retain-Free Machine Unlearning in Industrial IoT

Jiao Chen, Weihua Li, Jianhua Tang

In dynamic Industrial Internet of Things (IIoT) environments, models need the ability to selectively forget outdated or erroneous knowledge. However, existing methods typically rely on retain data to constrain model behavior, which increases computational and energy burdens and conflicts with industrial data silos and privacy compliance requirements. To address this, we propose a novel retain-free unlearning framework, referred to as Probing then Editing (PTE). PTE frames unlearning as a probe-edit process: first, it probes the decision boundary neighborhood of the model on the to-be-forgotten class via gradient ascent and generates corresponding editing instructions using the model's own predictions. Subsequently, a push-pull collaborative optimization is performed: the push branch actively dismantles the decision region of the target class using the editing instructions, while the pull branch applies masked knowledge distillation to anchor the model's knowledge on retained classes to their original states. Benefiting from this mechanism, PTE achieves efficient and balanced knowledge editing using only the to-be-forgotten data and the original model. Experimental results demonstrate that PTE achieves an excellent balance between unlearning effectiveness and model utility across multiple general and industrial benchmarks such as CWRU and SCUT-FD.

LGSep 1, 2025
Forward-Only Continual Learning

Jiao Chen, Jiayi He, Fangfang Chen et al.

Catastrophic forgetting remains a central challenge in continual learning (CL) with pre-trained models. While existing approaches typically freeze the backbone and fine-tune a small number of parameters to mitigate forgetting, they still rely on iterative error backpropagation and gradient-based optimization, which can be computationally intensive and less suitable for resource-constrained environments. To address this, we propose FoRo, a forward-only, gradient-free continual learning method. FoRo consists of a lightweight prompt tuning strategy and a novel knowledge encoding mechanism, both designed without modifying the pre-trained model. Specifically, prompt embeddings are inserted at the input layer and optimized using the Covariance Matrix Adaptation Evolution Strategy (CMA-ES), which mitigates distribution shifts and extracts high-quality task representations. Subsequently, task-specific knowledge is encoded into a knowledge encoding matrix via nonlinear random projection and recursive least squares, enabling incremental updates to the classifier without revisiting prior data. Experiments show that FoRo significantly reduces average forgetting and improves accuracy. Thanks to forward-only learning, FoRo reduces memory usage and run time while maintaining high knowledge retention across long task sequences. These results suggest that FoRo could serve as a promising direction for exploring continual learning with pre-trained models, especially in real-world multimedia applications where both efficiency and effectiveness are critical.

CVApr 15, 2025
DMPT: Decoupled Modality-aware Prompt Tuning for Multi-modal Object Re-identification

Minghui Lin, Shu Wang, Xiang Wang et al.

Current multi-modal object re-identification approaches based on large-scale pre-trained backbones (i.e., ViT) have displayed remarkable progress and achieved excellent performance. However, these methods usually adopt the standard full fine-tuning paradigm, which requires the optimization of considerable backbone parameters, causing extensive computational and storage requirements. In this work, we propose an efficient prompt-tuning framework tailored for multi-modal object re-identification, dubbed DMPT, which freezes the main backbone and only optimizes several newly added decoupled modality-aware parameters. Specifically, we explicitly decouple the visual prompts into modality-specific prompts which leverage prior modality knowledge from a powerful text encoder and modality-independent semantic prompts which extract semantic information from multi-modal inputs, such as visible, near-infrared, and thermal-infrared. Built upon the extracted features, we further design a Prompt Inverse Bind (PromptIBind) strategy that employs bind prompts as a medium to connect the semantic prompt tokens of different modalities and facilitates the exchange of complementary multi-modal information, boosting final re-identification results. Experimental results on multiple common benchmarks demonstrate that our DMPT can achieve competitive results to existing state-of-the-art methods while requiring only 6.5% fine-tuning of the backbone parameters.

LGJun 22, 2024
Continual Learning with Diffusion-based Generative Replay for Industrial Streaming Data

Jiayi He, Jiao Chen, Qianmiao Liu et al.

The Industrial Internet of Things (IIoT) integrates interconnected sensors and devices to support industrial applications, but its dynamic environments pose challenges related to data drift. Considering the limited resources and the need to effectively adapt models to new data distributions, this paper introduces a Continual Learning (CL) approach, i.e., Distillation-based Self-Guidance (DSG), to address challenges presented by industrial streaming data via a novel generative replay mechanism. DSG utilizes knowledge distillation to transfer knowledge from the previous diffusion-based generator to the updated one, improving both the stability of the generator and the quality of reproduced data, thereby enhancing the mitigation of catastrophic forgetting. Experimental results on CWRU, DSA, and WISDM datasets demonstrate the effectiveness of DSG. DSG outperforms the state-of-the-art baseline in accuracy, demonstrating improvements ranging from 2.9% to 5.0% on key datasets, showcasing its potential for practical industrial applications.