LG | AI | BM | MN | Jul 7, 2025

PRING: Rethinking Protein-Protein Interaction Prediction from Pairs to Graphs

arXiv:2507.05101v2 | 3 citations | h-index: 8 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses the need for better evaluation of PPI prediction models in support of real-world biological applications such as network construction and function annotation. It is an incremental advance: it builds on existing methods and changes how they are evaluated.

The authors observe that existing protein-protein interaction (PPI) prediction benchmarks evaluate isolated pairs and overlook the network-level capabilities that matter for biology. They introduce PRING, a comprehensive graph-level benchmark built on a dataset of 21,484 proteins and 186,818 interactions. Experiments on the benchmark reveal that current models have limitations in recovering the structural and functional properties of PPI networks.

Deep learning-based computational methods have achieved promising results in predicting protein-protein interactions (PPIs). However, existing benchmarks predominantly focus on isolated pairwise evaluations, overlooking a model's capability to reconstruct biologically meaningful PPI networks, which is crucial for biology research. To address this gap, we introduce PRING, the first comprehensive benchmark that evaluates protein-protein interaction prediction from a graph-level perspective. PRING curates a high-quality, multi-species PPI network dataset comprising 21,484 proteins and 186,818 interactions, with well-designed strategies to address both data redundancy and leakage. Building on this gold-standard dataset, we establish two complementary evaluation paradigms: (1) topology-oriented tasks, which assess intra- and cross-species PPI network construction, and (2) function-oriented tasks, including protein complex pathway prediction, GO module analysis, and essential protein justification. These evaluations not only reflect the model's capability to understand the network topology but also facilitate protein function annotation, biological module detection, and even disease mechanism analysis. Extensive experiments on four representative model categories, consisting of sequence similarity-based, naive sequence-based, protein language model-based, and structure-based approaches, demonstrate that current PPI models have potential limitations in recovering both structural and functional properties of PPI networks, highlighting the gap in supporting real-world biological applications. We believe PRING provides a reliable platform to guide the development of more effective PPI prediction models for the community. The dataset and source code of PRING are available at https://github.com/SophieSarceau/PRING.
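
To illustrate the difference between pairwise and graph-level evaluation described in the abstract, here is a minimal Python sketch. It is not PRING's code and does not use PRING's actual metrics. The function names, the toy network, and the choice of edge Jaccard, density, and degree as graph-level measures are all assumptions made for illustration. The idea is that a pairwise predictor can recover every true interaction and still rebuild a network whose topology differs from the real one.

```python
from itertools import combinations


def reconstruct_network(proteins, predict_pair):
    """Query a pairwise predictor on every unordered pair and keep predicted edges."""
    return {
        frozenset((a, b))
        for a, b in combinations(sorted(proteins), 2)
        if predict_pair(a, b)
    }


def edge_jaccard(true_edges, pred_edges):
    """Overlap between two edge sets (1.0 means identical networks)."""
    union = true_edges | pred_edges
    return len(true_edges & pred_edges) / len(union) if union else 1.0


def density(num_nodes, edges):
    """Fraction of all possible undirected edges that are present."""
    possible = num_nodes * (num_nodes - 1) / 2
    return len(edges) / possible if possible else 0.0


def degrees(proteins, edges):
    """Number of interaction partners per protein."""
    deg = {p: 0 for p in proteins}
    for edge in edges:
        for p in edge:
            deg[p] += 1
    return deg


# Toy ground truth (hypothetical): a simple chain A-B-C-D.
proteins = ["A", "B", "C", "D"]
true_edges = {frozenset(p) for p in [("A", "B"), ("B", "C"), ("C", "D")]}

# Hypothetical predictor: finds every true edge but also adds two false ones.
false_positives = {frozenset(("A", "C")), frozenset(("B", "D"))}


def toy_predictor(a, b):
    pair = frozenset((a, b))
    return pair in true_edges or pair in false_positives


pred_edges = reconstruct_network(proteins, toy_predictor)

# Pairwise view: recall on the known interacting pairs looks perfect.
pairwise_recall = len(true_edges & pred_edges) / len(true_edges)

# Graph-level view: the reconstructed network is much denser than the truth.
print("pairwise recall:", pairwise_recall)
print("edge Jaccard:", edge_jaccard(true_edges, pred_edges))
print("true density:", density(len(proteins), true_edges))
print("pred density:", density(len(proteins), pred_edges))
print("pred degrees:", degrees(proteins, pred_edges))
```

In this toy case, recall on the true pairs is perfect, while the graph-level measures show that the reconstructed network contains extra edges and inflated node degrees. This is the kind of gap a graph-level benchmark is meant to expose.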
