CLAug 20, 2025Code
MedResearcher-R1: Expert-Level Medical Deep Researcher via A Knowledge-Informed Trajectory Synthesis FrameworkAiling Yu, Lan Yao, Jingnan Liu et al.
Recent developments in Large Language Model (LLM)-based agents have shown impressive capabilities spanning multiple domains, exemplified by deep research systems that demonstrate superior performance on complex information-seeking and synthesis tasks. While general-purpose deep research agents have shown impressive capabilities, they struggle significantly with medical domain challenges, as evidenced by leading proprietary systems achieving limited accuracy on complex medical benchmarks. The key limitations are: (1) the model lacks sufficient dense medical knowledge for clinical reasoning, and (2) the framework is constrained by the absence of specialized retrieval tools tailored for medical contexts. We present a medical deep research agent that addresses these challenges through two core innovations. First, we develop a novel data synthesis framework using medical knowledge graphs, extracting the longest chains from subgraphs around rare medical entities to generate complex multi-hop question-answer pairs. Second, we integrate a custom-built private medical retrieval engine alongside general-purpose tools, enabling accurate medical information synthesis. Our approach generates 2100+ diverse trajectories across 12 medical specialties, each averaging 4.2 tool interactions. Through a two-stage training paradigm combining supervised fine-tuning and online reinforcement learning with composite rewards, our MedResearcher-R1-32B model demonstrates exceptional performance, establishing new state-of-the-art results on medical benchmarks while maintaining competitive performance on general deep research tasks. Our work demonstrates that strategic domain-specific innovations in architecture, tool design, and training data construction can enable smaller open-source models to outperform much larger proprietary systems in specialized domains.
CVOct 13, 2024Code
Combining Generative and Geometry Priors for Wide-Angle Portrait CorrectionLan Yao, Chaofeng Chen, Xiaoming Li et al.
Wide-angle lens distortion in portrait photography presents a significant challenge for capturing photo-realistic and aesthetically pleasing images. Such distortions are especially noticeable in facial regions. In this work, we propose encapsulating the generative face prior as a guided natural manifold to facilitate the correction of facial regions. Moreover, a notable central symmetry relationship exists in the non-face background, yet it has not been explored in the correction process. This geometry prior motivates us to introduce a novel constraint to explicitly enforce symmetry throughout the correction process, thereby contributing to a more visually appealing and natural correction in the non-face region. Experiments demonstrate that our approach outperforms previous methods by a large margin, excelling not only in quantitative measures such as line straightness and shape consistency metrics but also in terms of perceptual visual quality. All the code and models are available at https://github.com/Dev-Mrha/DualPriorsCorrection.
BMApr 2, 2025
HCAF-DTA: drug-target binding affinity prediction with cross-attention fused hypergraph neural networksJiannuo Li, Lan Yao
Accurate prediction of the binding affinity between drugs and target proteins is a core task in computer-aided drug design. Existing deep learning methods tend to ignore the information of internal sub-structural features of drug molecules and drug-target interactions, resulting in limited prediction performance. In this paper, we propose a drug-target association prediction model HCAF-DTA based on cross-attention fusion hypergraph neural network. The model innovatively introduces hypergraph representation in the feature extraction stage: drug molecule hypergraphs are constructed based on the tree decomposition algorithm, and the sub-structural and global features extracted by fusing the hypergraph neural network with the graphical neural network through hopping connections, in which the hyper edges can efficiently characterise the functional functional groups and other key chemical features; for the protein feature extraction, a weighted graph is constructed based on the residues predicted by the ESM model contact maps to construct weighted graphs, and multilayer graph neural networks were used to capture spatial dependencies. In the prediction stage, a bidirectional multi-head cross-attention mechanism is designed to model intermolecular interactions from the dual viewpoints of atoms and amino acids, and cross-modal features with correlated information are fused by attention. Experiments on benchmark datasets such as Davis and KIBA show that HCAF-DTA outperforms state of the arts in all three performance evaluation metrics, with the MSE metrics reaching 0.198 and 0.122, respectively, with an improvement of up to 4% from the optimal baseline.
LGApr 1, 2025
Brain Network Classification Based on Graph Contrastive Learning and Graph TransformerZhiTeng Zhu, Lan Yao
The dynamic characterization of functional brain networks is of great significance for elucidating the mechanisms of human brain function. Although graph neural networks have achieved remarkable progress in functional network analysis, challenges such as data scarcity and insufficient supervision persist. To address the limitations of limited training data and inadequate supervision, this paper proposes a novel model named PHGCL-DDGformer that integrates graph contrastive learning with graph transformers, effectively enhancing the representation learning capability for brain network classification tasks. To overcome the constraints of existing graph contrastive learning methods in brain network feature extraction, an adaptive graph augmentation strategy combining attribute masking and edge perturbation is implemented for data enhancement. Subsequently, a dual-domain graph transformer (DDGformer) module is constructed to integrate local and global information, where graph convolutional networks aggregate neighborhood features to capture local patterns while attention mechanisms extract global dependencies. Finally, a graph contrastive learning framework is established to maximize the consistency between positive and negative pairs, thereby obtaining high-quality graph representations. Experimental results on real-world datasets demonstrate that the PHGCL-DDGformer model outperforms existing state-of-the-art approaches in brain network classification tasks.
IRJan 25, 2024
Within-basket Recommendation via Neural Pattern AssociatorKai Luo, Tianshu Shen, Lan Yao et al.
Within-basket recommendation (WBR) refers to the task of recommending items to the end of completing a non-empty shopping basket during a shopping session. While the latest innovations in this space demonstrate remarkable performance improvement on benchmark datasets, they often overlook the complexity of user behaviors in practice, such as 1) co-existence of multiple shopping intentions, 2) multi-granularity of such intentions, and 3) interleaving behavior (switching intentions) in a shopping session. This paper presents Neural Pattern Associator (NPA), a deep item-association-mining model that explicitly models the aforementioned factors. Specifically, inspired by vector quantization, the NPA model learns to encode common user intentions (or item-combination patterns) as quantized representations (a.k.a. codebook), which permits identification of users's shopping intentions via attention-driven lookup during the reasoning phase. This yields coherent and self-interpretable recommendations. We evaluated the proposed NPA model across multiple extensive datasets, encompassing the domains of grocery e-commerce (shopping basket completion) and music (playlist extension), where our quantitative evaluations show that the NPA model significantly outperforms a wide range of existing WBR solutions, reflecting the benefit of explicitly modeling complex user intentions.