AIJun 1Code
SIRI: Self-Internalizing Reinforcement Learning with Intrinsic Skills for LLM Agent TrainingZhongyu He, Yuanfan Li, Fei Huang et al.
Long-horizon LLM agents can benefit from reusable skills, yet existing skill-based methods often rely on external skill generators during training or persistent skill retrieval at inference, increasing engineering complexity, context length, and deployment latency. We propose Self-Internalizing Reinforcement learning with Intrinsic skills (SIRI), a three-phase framework that enables agents to discover, validate, and internalize skills without external skill generators or inference-time skill banks. SIRI first warms up the policy with GiGPO to acquire basic interaction ability and collect successful skill-free trajectories. It then performs self-skill mining, where the current policy summarizes compact skills from its own successful plain rollouts and validates them through paired skill-augmented and skill-free rollouts. Finally, SIRI distills only beneficial skill-guided action tokens into the plain policy using trajectory-level utility and action-level advantage. At inference, the agent runs with the original prompt only. On ALFWorld and WebShop with Qwen2.5-7B-Instruct, SIRI improves GiGPO from 0.908 to 0.930 on ALFWorld and from 0.728 to 0.813 on WebShop, outperforming prompt-based, RL-based, and memory-augmented baselines. Further analysis shows that our self-mining strategy can achieve performance comparable to distillation with closed-source large model. Our code is available at https://github.com/kirito618/SIRI.
LGNov 12, 2025
DeepDR: an integrated deep-learning model web server for drug repositioningShuting Jin, Yi Jiang, Yimin Liu et al.
Background: Identifying new indications for approved drugs is a complex and time-consuming process that requires extensive knowledge of pharmacology, clinical data, and advanced computational methods. Recently, deep learning (DL) methods have shown their capability for the accurate prediction of drug repositioning. However, implementing DL-based modeling requires in-depth domain knowledge and proficient programming skills. Results: In this application, we introduce DeepDR, the first integrated platform that combines a variety of established DL-based models for disease- and target-specific drug repositioning tasks. DeepDR leverages invaluable experience to recommend candidate drugs, which covers more than 15 networks and a comprehensive knowledge graph that includes 5.9 million edges across 107 types of relationships connecting drugs, diseases, proteins/genes, pathways, and expression from six existing databases and a large scientific corpus of 24 million PubMed publications. Additionally, the recommended results include detailed descriptions of the recommended drugs and visualize key patterns with interpretability through a knowledge graph. Conclusion: DeepDR is free and open to all users without the requirement of registration. We believe it can provide an easy-to-use, systematic, highly accurate, and computationally automated platform for both experimental and computational scientists.
LGJul 20, 2021Code
Heterogeneous network-based drug repurposing for COVID-19Shuting Jin, Xiangxiang Zeng, Wei Huang et al.
The Corona Virus Disease 2019 (COVID-19) belongs to human coronaviruses (HCoVs), which spreads rapidly around the world. Compared with new drug development, drug repurposing may be the best shortcut for treating COVID-19. Therefore, we constructed a comprehensive heterogeneous network based on the HCoVs-related target proteins and use the previously proposed deepDTnet, to discover potential drug candidates for COVID-19. We obtain high performance in predicting the possible drugs effective for COVID-19 related proteins. In summary, this work utilizes a powerful heterogeneous network-based deep learning method, which may be beneficial to quickly identify candidate repurposable drugs toward future clinical trials for COVID-19. The code and data are available at https://github.com/stjin-XMU/HnDR-COVID.
LGJan 6, 2025
LOHA: Direct Graph Spectral Contrastive Learning Between Low-pass and High-pass ViewsZiyun Zou, Yinghui Jiang, Lian Shen et al.
Spectral Graph Neural Networks effectively handle graphs with different homophily levels, with low-pass filter mining feature smoothness and high-pass filter capturing differences. When these distinct filters could naturally form two opposite views for self-supervised learning, the commonalities between the counterparts for the same node remain unexplored, leading to suboptimal performance. In this paper, a simple yet effective self-supervised contrastive framework, LOHA, is proposed to address this gap. LOHA optimally leverages low-pass and high-pass views by embracing "harmony in diversity". Rather than solely maximizing the difference between these distinct views, which may lead to feature separation, LOHA harmonizes the diversity by treating the propagation of graph signals from both views as a composite feature. Specifically, a novel high-dimensional feature named spectral signal trend is proposed to serve as the basis for the composite feature, which remains relatively unaffected by changing filters and focuses solely on original feature differences. LOHA achieves an average performance improvement of 2.8% over runner-up models on 9 real-world datasets with varying homophily levels. Notably, LOHA even surpasses fully-supervised models on several datasets, which underscores the potential of LOHA in advancing the efficacy of spectral GNNs for diverse graph structures.
LGNov 25, 2025
Decoupling and Damping: Structurally-Regularized Gradient Matching for Multimodal Graph CondensationLian Shen, Zhendan Chen, Meijia Song et al.
In multimodal graph learning, graph structures that integrate information from multiple sources, such as vision and text, can more comprehensively model complex entity relationships. However, the continuous growth of their data scale poses a significant computational bottleneck for training. Graph condensation methods provide a feasible path forward by synthesizing compact and representative datasets. Nevertheless, existing condensation approaches generally suffer from performance limitations in multimodal scenarios, mainly due to two reasons: (1) semantic misalignment between different modalities leads to gradient conflicts; (2) the message-passing mechanism of graph neural networks further structurally amplifies such gradient noise. Based on this, we propose Structural Regularized Gradient Matching (SR-GM), a condensation framework for multimodal graphs. This method alleviates gradient conflicts between modalities through a gradient decoupling mechanism and introduces a structural damping regularizer to suppress the propagation of gradient noise in the topology, thereby transforming the graph structure from a noise amplifier into a training stabilizer. Extensive experiments on four multimodal graph datasets demonstrate the effectiveness of SR-GM, highlighting its state-of-the-art performance and cross-architecture generalization capabilities in multimodal graph dataset condensation.