Jiapeng Yu

CV
h-index10
4papers
14citations
Novelty45%
AI Score47

4 Papers

SESep 2, 2024Code
Co-Learning: Code Learning for Multi-Agent Reinforcement Collaborative Framework with Conversational Natural Language Interfaces

Jiapeng Yu, Yuqian Wu, Yajing Zhan et al.

Online question-and-answer (Q\&A) systems based on the Large Language Model (LLM) have progressively diverged from recreational to professional use. This paper proposed a Multi-Agent framework with environmentally reinforcement learning (E-RL) for code correction called Code Learning (Co-Learning) community, assisting beginners to correct code errors independently. It evaluates the performance of multiple LLMs from an original dataset with 702 error codes, uses it as a reward or punishment criterion for E-RL; Analyzes input error codes by the current agent; selects the appropriate LLM-based agent to achieve optimal error correction accuracy and reduce correction time. Experiment results showed that 3\% improvement in Precision score and 15\% improvement in time cost as compared with no E-RL method respectively. Our source code is available at: https://github.com/yuqian2003/Co_Learning

IRSep 5, 2024Code
MAS4POI: a Multi-Agents Collaboration System for Next POI Recommendation

Yuqian Wu, Yuhong Peng, Jiapeng Yu et al.

LLM-based Multi-Agent Systems have potential benefits of complex decision-making tasks management across various domains but their applications in the next Point-of-Interest (POI) recommendation remain underexplored. This paper proposes a novel MAS4POI system designed to enhance next POI recommendations through multi-agent interactions. MAS4POI supports Large Language Models (LLMs) specializing in distinct agents such as DataAgent, Manager, Analyst, and Navigator with each contributes to a collaborative process of generating the next POI recommendations.The system is examined by integrating six distinct LLMs and evaluated by two real-world datasets for recommendation accuracy improvement in real-world scenarios. Our code is available at https://github.com/yuqian2003/MAS4POI.

62.7CVApr 9
InstrAct: Towards Action-Centric Understanding in Instructional Videos

Zhuoyi Yang, Jiapeng Yu, Reuben Tan et al.

Understanding instructional videos requires recognizing fine-grained actions and modeling their temporal relations, which remains challenging for current Video Foundation Models (VFMs). This difficulty stems from noisy web supervision and a pervasive "static bias", where models rely on objects rather than motion cues. To address this, we propose InstrAction, a pretraining framework for instructional videos' action-centric representations. We first introduce a data-driven strategy, which filters noisy captions and generates action-centric hard negatives to disentangle actions from objects during contrastive learning. At the visual feature level, an Action Perceiver extracts motion-relevant tokens from redundant video encodings. Beyond contrastive learning, we introduce two auxiliary objectives: Dynamic Time Warping alignment (DTW-Align) for modeling sequential temporal structure, and Masked Action Modeling (MAM) for strengthening cross-modal grounding. Finally, we introduce the InstrAct Bench to evaluate action-centric understanding, where our method consistently outperforms state-of-the-art VFMs on semantic reasoning, procedural logic, and fine-grained retrieval tasks.

LGSep 15, 2025Code
Beyond Regularity: Modeling Chaotic Mobility Patterns for Next Location Prediction

Yuqian Wu, Yuhong Peng, Jiapeng Yu et al.

Next location prediction is a key task in human mobility analysis, crucial for applications like smart city resource allocation and personalized navigation services. However, existing methods face two significant challenges: first, they fail to address the dynamic imbalance between periodic and chaotic mobile patterns, leading to inadequate adaptation over sparse trajectories; second, they underutilize contextual cues, such as temporal regularities in arrival times, which persist even in chaotic patterns and offer stronger predictability than spatial forecasts due to reduced search spaces. To tackle these challenges, we propose \textbf{\method}, a \underline{\textbf{C}}h\underline{\textbf{A}}otic \underline{\textbf{N}}eural \underline{\textbf{O}}scillator n\underline{\textbf{E}}twork for next location prediction, which introduces a biologically inspired Chaotic Neural Oscillatory Attention mechanism to inject adaptive variability into traditional attention, enabling balanced representation of evolving mobility behaviors, and employs a Tri-Pair Interaction Encoder along with a Cross Context Attentive Decoder to fuse multimodal ``who-when-where'' contexts in a joint framework for enhanced prediction performance. Extensive experiments on two real-world datasets demonstrate that CANOE consistently and significantly outperforms a sizeable collection of state-of-the-art baselines, yielding 3.17\%-13.11\% improvement over the best-performing baselines across different cases. In particular, CANOE can make robust predictions over mobility trajectories of different mobility chaotic levels. A series of ablation studies also supports our key design choices. Our code is available at: https://github.com/yuqian2003/CANOE.