Junhan Chen

CV
h-index: 7
Papers: 3
Citations: 1
Novelty: 55%
AI Score: 41

3 Papers

CV · Mar 4
Seeing as Experts Do: A Knowledge-Augmented Agent for Open-Set Fine-Grained Visual Understanding

Junhan Chen, Zilu Zhou, Yujun Tong et al.

Fine-grained visual understanding is shifting from static classification to knowledge-augmented reasoning, where models must justify as well as recognise. Existing approaches remain limited by closed-set taxonomies and single-label prediction, leading to significant degradation under open-set or context-dependent conditions. We present the Knowledge-Augmented Fine-Grained Reasoning Agent (KFRA), a unified framework that transforms fine-grained perception into evidence-driven reasoning. KFRA operates through a three-stage closed reasoning loop that emulates expert analysis. It first performs open-vocabulary detection and web-scale retrieval to generate category hypotheses. It then localises discriminative regions by aligning textual knowledge with visual evidence through a global-to-local focusing mechanism. Finally, it integrates all multimodal evidence within a large multimodal model to perform interpretable reasoning. Unlike existing agents that treat retrieval and reasoning as independent processes, KFRA establishes a retrieval-grounding coupling that converts retrieved knowledge into spatially grounded evidence for verification. This design enables factual, interpretable, and task-agnostic reasoning across diverse fine-grained scenarios. To evaluate this capability, we construct FGExpertBench, a benchmark designed to assess reasoning depth and cross-task generalisation across six knowledge dimensions. Extensive experiments demonstrate that KFRA consistently surpasses both standalone large multimodal models and current agent frameworks, achieving up to 19 percent improvement in reasoning accuracy and delivering evidence-grounded interpretability in open-set fine-grained visual understanding.
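The three stages described in this abstract (hypothesis generation, region grounding, evidence-based reasoning) can be pictured as a small pipeline. The sketch below is only an illustration of that control flow, not the authors' implementation: the `Evidence` record, the `run_pipeline` function, and the four stage callables are all hypothetical names, and the real detector, retriever, grounding module, and multimodal model are replaced by plain functions passed in by the caller.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple

Box = Tuple[int, int, int, int]  # hypothetical (x, y, width, height) region


@dataclass
class Evidence:
    hypothesis: str  # candidate category name
    knowledge: str   # retrieved text describing the candidate
    region: Box      # image region that the text was aligned with


def run_pipeline(
    image: Any,
    question: str,
    propose: Callable[[Any], List[str]],           # stage 1a: detection -> hypotheses
    retrieve: Callable[[str], str],                # stage 1b: knowledge retrieval
    ground: Callable[[Any, str], Box],             # stage 2: text-to-region grounding
    reason: Callable[[Any, str, List[Evidence]], str],  # stage 3: final reasoning
) -> str:
    """Chain the three stages; every stage is an injected, hypothetical callable."""
    evidence: List[Evidence] = []
    for hypothesis in propose(image):
        knowledge = retrieve(hypothesis)
        # Coupling step: retrieved text is turned into a spatial region,
        # so the reasoner receives grounded evidence rather than raw text.
        region = ground(image, knowledge)
        evidence.append(Evidence(hypothesis, knowledge, region))
    return reason(image, question, evidence)
```

With stub stages, for example `propose=lambda img: ["sparrow", "finch"]`, the function simply collects one `Evidence` record per hypothesis and hands the list to `reason`. Passing the stages in as callables keeps the sketch independent of any particular detector or model.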

LG · Nov 10, 2025
Implicit Federated In-context Learning For Task-Specific LLM Fine-Tuning

Dongcheng Li, Junhan Chen, Aoxiang Zhou et al.

As large language models continue to develop and expand, the extensive public data they rely on faces the risk of depletion. Consequently, leveraging private data within organizations to enhance the performance of large models has emerged as a key challenge. The federated learning paradigm, combined with model fine-tuning techniques, effectively reduces the number of trainable parameters. However, the need to process high-dimensional feature spaces results in substantial overall computational overhead. To address this issue, we propose the Implicit Federated In-Context Learning (IFed-ICL) framework. IFed-ICL draws inspiration from federated learning to establish a novel distributed collaborative paradigm: by converting client-local context examples into implicit vector representations, it enables distributed collaborative computation during the inference phase and injects these representations into the model's residual stream to enhance performance. Experiments demonstrate that our proposed method achieves outstanding performance across multiple text classification tasks. IFed-ICL avoids the extensive parameter updates required by conventional fine-tuning methods while reducing data transmission and local computation at the client level in federated learning. This enables efficient distributed in-context learning using local private-domain data, significantly improving model performance on specific tasks.
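The abstract does not say how the implicit vectors are computed, combined, or injected, so the following is only a toy sketch of the general idea under stated assumptions: each client averages the hidden states of its local examples into one vector, the server combines client vectors with a FedAvg-style weighting by example count, and the result is added to a residual-stream activation at inference time. All function names and the `scale` parameter are hypothetical, and plain Python lists stand in for model tensors.

```python
from typing import List, Sequence

Vector = List[float]


def client_context_vector(example_states: Sequence[Sequence[float]]) -> Vector:
    """Assumption: a client summarises its examples as the mean hidden state."""
    n = len(example_states)
    dim = len(example_states[0])
    return [sum(state[i] for state in example_states) / n for i in range(dim)]


def aggregate_context_vectors(
    vectors: Sequence[Vector], example_counts: Sequence[int]
) -> Vector:
    """Assumption: the server averages client vectors, weighted by example count."""
    total = sum(example_counts)
    dim = len(vectors[0])
    return [
        sum(count * vec[i] for vec, count in zip(vectors, example_counts)) / total
        for i in range(dim)
    ]


def inject_into_residual(
    residual: Vector, context_vector: Vector, scale: float = 1.0
) -> Vector:
    """Add the aggregated vector to a residual-stream activation (no weight update)."""
    return [r + scale * c for r, c in zip(residual, context_vector)]
```

In this sketch only one vector per client leaves the device, not the raw examples, and no model parameters change, which mirrors the reduced transmission and absence of parameter updates that the abstract claims.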

NI · Sep 27, 2025
Impact of Environmental Factors on LoRa 2.4 GHz Time of Flight Ranging Outdoors

Yiqing Zhou, Xule Zhou, Zecan Cheng et al.

In WSN/IoT, node localization is essential to long-running applications for accurate environment monitoring and event detection, often covering a large area in the field. Due to the low time resolution of typical WSN/IoT platforms (e.g., 1 microsecond on ESP32 platforms) and the jitter in timestamping, packet-level localization techniques cannot provide meter-level resolution. For high-precision localization as well as worldwide interoperability via the 2.4-GHz ISM band, a new variant of LoRa, called LoRa 2.4 GHz, was proposed by Semtech, which provides a radio frequency (RF) time of flight (ToF) ranging method for meter-level localization. However, the existing datasets reported in the literature are limited in their coverage and do not take into account varying environmental factors such as temperature and humidity. To address these issues, LoRa 2.4 GHz RF ToF ranging data was collected over three weeks on a sports field at the XJTLU south campus. Three LoRa nodes logged ranging samples with a LoRa base station, together with temperature and humidity, at reference points arranged as a 3x3 grid covering 400 square meters, and uploaded all measurement records to the base station, which was equipped with an ESP32-based transceiver for machine and user communications. The results of a preliminary investigation based on a simple deep neural network (DNN) model demonstrate that the environmental factors, including the temperature and humidity, significantly affect the accuracy of ranging, which calls for advanced methods of compensating for the effects of environmental factors on LoRa RF ToF ranging outdoors.
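The claim that a 1 microsecond timestamp resolution rules out meter-level ranging follows from simple arithmetic: a radio signal covers roughly 300 m in that time. The sketch below works this out, assuming propagation at the vacuum speed of light and ignoring processing delays and jitter; the function names are illustrative and not taken from the paper.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # vacuum value; air is marginally slower


def one_way_range_resolution(time_resolution_s: float) -> float:
    """Distance the signal travels during one timestamp tick."""
    return SPEED_OF_LIGHT_M_PER_S * time_resolution_s


def round_trip_range_resolution(time_resolution_s: float) -> float:
    """Range resolution when the tick applies to a two-way (there and back) time."""
    return SPEED_OF_LIGHT_M_PER_S * time_resolution_s / 2.0


def required_time_resolution(range_resolution_m: float) -> float:
    """One-way timing resolution needed to resolve a given distance."""
    return range_resolution_m / SPEED_OF_LIGHT_M_PER_S


if __name__ == "__main__":
    tick = 1e-6  # 1 microsecond, the ESP32 figure quoted in the abstract
    print(f"one-way:    {one_way_range_resolution(tick):.1f} m")     # 299.8 m
    print(f"round-trip: {round_trip_range_resolution(tick):.1f} m")  # 149.9 m
    print(f"needed for 1 m: {required_time_resolution(1.0) * 1e9:.2f} ns")  # 3.34 ns
```

Resolving one meter therefore needs timing on the order of a few nanoseconds, about 300 times finer than a 1 microsecond clock, which is why a dedicated RF ToF ranging method is needed instead of packet timestamps.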