Jinsong Yang

CV
h-index2
5papers
75citations
Novelty50%
AI Score44

5 Papers

CLSep 1, 2024
Self-evolving Agents with reflective and memory-augmented abilities

Xuechen Liang, Yangfan He, Yinghui Xia et al.

Large language models (LLMs) have made significant advances in the field of natural language processing, but they still face challenges such as continuous decision-making. In this research, we propose a novel framework by integrating iterative feedback, reflective mechanisms, and a memory optimization mechanism based on the Ebbinghaus forgetting curve, it significantly enhances the agents' capabilities in handling multi-tasking and long-span information.

CVAug 27, 2024
Reflective Human-Machine Co-adaptation for Enhanced Text-to-Image Generation Dialogue System

Yuheng Feng, Yangfan He, Yinghui Xia et al.

Today's image generation systems are capable of producing realistic and high-quality images. However, user prompts often contain ambiguities, making it difficult for these systems to interpret users' potential intentions. Consequently, machines need to interact with users multiple rounds to better understand users' intents. The unpredictable costs of using or learning image generation models through multiple feedback interactions hinder their widespread adoption and full performance potential, especially for non-expert users. In this research, we aim to enhance the user-friendliness of our image generation system. To achieve this, we propose a reflective human-machine co-adaptation strategy, named RHM-CAS. Externally, the Agent engages in meaningful language interactions with users to reflect on and refine the generated images. Internally, the Agent tries to optimize the policy based on user preferences, ensuring that the final outcomes closely align with user preferences. Various experiments on different tasks demonstrate the effectiveness of the proposed method.

CLJun 7, 2024Code
AICoderEval: Improving AI Domain Code Generation of Large Language Models

Yinghui Xia, Yuyan Chen, Tianyu Shi et al.

Automated code generation is a pivotal capability of large language models (LLMs). However, assessing this capability in real-world scenarios remains challenging. Previous methods focus more on low-level code generation, such as model loading, instead of generating high-level codes catering for real-world tasks, such as image-to-text, text classification, in various domains. Therefore, we construct AICoderEval, a dataset focused on real-world tasks in various domains based on HuggingFace, PyTorch, and TensorFlow, along with comprehensive metrics for evaluation and enhancing LLMs' task-specific code generation capability. AICoderEval contains test cases and complete programs for automated evaluation of these tasks, covering domains such as natural language processing, computer vision, and multimodal learning. To facilitate research in this area, we open-source the AICoderEval dataset at \url{https://huggingface.co/datasets/vixuowis/AICoderEval}. After that, we propose CoderGen, an agent-based framework, to help LLMs generate codes related to real-world tasks on the constructed AICoderEval. Moreover, we train a more powerful task-specific code generation model, named AICoder, which is refined on llama-3 based on AICoderEval. Our experiments demonstrate the effectiveness of CoderGen in improving LLMs' task-specific code generation capability (by 12.00\% on pass@1 for original model and 9.50\% on pass@1 for ReAct Agent). AICoder also outperforms current code generation LLMs, indicating the great quality of the AICoderEval benchmark.

CVFeb 23
FinSight-Net:A Physics-Aware Decoupled Network with Frequency-Domain Compensation for Underwater Fish Detection in Smart Aquaculture

Jinsong Yang, Zeyuan Hu, Yichen Li et al.

Underwater fish detection (UFD) is a core capability for smart aquaculture and marine ecological monitoring. While recent detectors improve accuracy by stacking feature extractors or introducing heavy attention modules, they often incur substantial computational overhead and, more importantly, neglect the physics that fundamentally limits UFD: wavelength-dependent absorption and turbidity-induced scattering significantly degrade contrast, blur fine structures, and introduce backscattering noise, leading to unreliable localization and recognition. To address these challenges, we propose FinSight-Net, an efficient and physics-aware detection framework tailored for complex aquaculture environments. FinSight-Net introduces a Multi-Scale Decoupled Dual-Stream Processing (MS-DDSP) bottleneck that explicitly targets frequency-specific information loss via heterogeneous convolutional branches, suppressing backscattering artifacts while compensating distorted biological cues through scale-aware and channel-weighted pathways. We further design an Efficient Path Aggregation FPN (EPA-FPN) as a detail-filling mechanism: it restores high-frequency spatial information typically attenuated in deep layers by establishing long-range skip connections and pruning redundant fusion routes, enabling robust detection of non-rigid fish targets under severe blur and turbidity. Extensive experiments on DeepFish, AquaFishSet, and our challenging UW-BlurredFish benchmark demonstrate that FinSight-Net achieves state-of-the-art performance. In particular, on UW-BlurredFish, FinSight-Net reaches 92.8% mAP, outperforming YOLOv11s by 4.8% while reducing parameters by 29.0%, providing a strong and lightweight solution for real-time automated monitoring in smart aquaculture.

CVAug 1, 2025
EPANet: Efficient Path Aggregation Network for Underwater Fish Detection

Jinsong Yang, Zeyuan Hu, Yichen Li

Underwater fish detection (UFD) remains a challenging task in computer vision due to low object resolution, significant background interference, and high visual similarity between targets and surroundings. Existing approaches primarily focus on local feature enhancement or incorporate complex attention mechanisms to highlight small objects, often at the cost of increased model complexity and reduced efficiency. To address these limitations, we propose an efficient path aggregation network (EPANet), which leverages complementary feature integration to achieve accurate and lightweight UFD. EPANet consists of two key components: an efficient path aggregation feature pyramid network (EPA-FPN) and a multi-scale diverse-division short path bottleneck (MS-DDSP bottleneck). The EPA-FPN introduces long-range skip connections across disparate scales to improve semantic-spatial complementarity, while cross-layer fusion paths are adopted to enhance feature integration efficiency. The MS-DDSP bottleneck extends the conventional bottleneck structure by introducing finer-grained feature division and diverse convolutional operations, thereby increasing local feature diversity and representation capacity. Extensive experiments on benchmark UFD datasets demonstrate that EPANet outperforms state-of-the-art methods in terms of detection accuracy and inference speed, while maintaining comparable or even lower parameter complexity.