Sima Soltani

20.5 | SI | Jun 2

Evidence-Aware Protein Complex Detection: Methods, Benchmarks, and Reproducibility Challenges

Sima Soltani, Mehrdad Jalali, Yahya Forghani et al.

Protein complexes are central units of cellular organization, yet their identification from protein-protein interaction (PPI) networks remains difficult because interactome maps are noisy, incomplete, context dependent, and unevenly annotated. This focused methodological review examines evidence-aware approaches that combine PPI topology with Gene Ontology (GO) annotations, expression profiles, subcellular localization, sequence or domain evidence, temporal information, and representation learning, with emphasis on post-2018 methods and selected historical baselines. The central synthesis is that transparent evidence-aware graph methods currently offer the strongest tradeoff between biological plausibility and reproducibility, while deep, hypergraph, and dynamic heterogeneous models expand biological realism but require stronger benchmark control. The central bottleneck is no longer only the lack of algorithms, but the lack of harmonized, overlap-aware, and reproducible evaluation protocols. We therefore recommend unified benchmark versions, explicit GO-circularity controls, overlap-aware metrics, uncertainty estimates, and executable software packages over isolated source-specific F-measure gains.
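To make the contrast between overlap-aware matching and a bare F-measure concrete, the following is a minimal sketch of one widely used matching score, the neighborhood affinity |A ∩ B|^2 / (|A| |B|), together with a match-based precision, recall, and F-measure. It is an illustration and not the protocol recommended in the review: the function names, the 0.2 threshold default, and the toy protein identifiers are assumptions, and complexes are assumed to be non-empty sets.

```python
def neighborhood_affinity(predicted, reference):
    """Overlap score |A ∩ B|^2 / (|A| * |B|) between two non-empty protein sets."""
    shared = len(predicted & reference)
    return shared * shared / (len(predicted) * len(reference))


def match_f_measure(predicted_complexes, reference_complexes, threshold=0.2):
    """Match-based precision, recall, and F-measure.

    A predicted complex counts as matched if it reaches the threshold against
    at least one reference complex, and vice versa. The threshold default is
    an assumption of this sketch, not a value taken from the review.
    """
    matched_pred = sum(
        any(neighborhood_affinity(p, r) >= threshold for r in reference_complexes)
        for p in predicted_complexes
    )
    matched_ref = sum(
        any(neighborhood_affinity(p, r) >= threshold for p in predicted_complexes)
        for r in reference_complexes
    )
    precision = matched_pred / len(predicted_complexes)
    recall = matched_ref / len(reference_complexes)
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Toy data with hypothetical protein identifiers.
predicted = [{"A", "B", "C"}, {"X", "Y"}]
reference = [{"A", "B", "C", "D"}, {"M", "N"}]
precision, recall, f_measure = match_f_measure(predicted, reference)
```

Because the score depends on the threshold and on which reference set is used, results computed this way are comparable only when both are held fixed across methods.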

4.7SIMay 20

ECHO-PPI: Trustworthy AI for Evidence-Bundled Detection of Overlapping Protein Modules in Protein-Protein Interaction Networks

Sima Soltani, Mehrdad Jalali, Yahya Forghani

Protein-protein interaction networks provide a graph-level view of cellular organization, yet their functional modules are overlapping, noisy, and difficult to interpret from cluster assignments alone. Existing community-detection methods can recover candidate protein complexes, but they rarely explain why an individual protein is assigned to a specific module or whether that assignment should be treated as core, peripheral, or uncertain. Here we introduce ECHO-PPI, an evidence-bundled framework for interpretable overlapping protein-module detection in protein-protein interaction networks. ECHO-PPI integrates weighted network topology, semantic protein profiles, and Gene Ontology evidence to identify evidence-potential nuclei, construct candidate modules, perform overlap-aware assignment, and export hierarchical confidence labels. The framework supports trustworthy computational decision support through assignment-level interpretability: each protein-module assignment is accompanied by topology, semantic, and Gene Ontology evidence scores and a hierarchical confidence label, enabling curators to inspect, rank, and triage overlapping module predictions. Evaluation on yeast protein-interaction data shows that ECHO-PPI preserves the behaviour of strong overlap-aware baselines while adding evidence-bundled auditability. Rather than claiming universal predictive superiority, ECHO-PPI addresses a complementary need: making overlapping protein-module predictions inspectable, confidence-aware, and reproducible for downstream biological interpretation.
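As a rough illustration of what an assignment-level evidence bundle could look like, the sketch below attaches topology, semantic, and Gene Ontology scores to a single protein-module assignment and derives a core, peripheral, or uncertain label. It is not ECHO-PPI's actual scoring rule: the `EvidenceBundle` class, the unweighted mean, the cutoffs of 0.7 and 0.4, and the identifiers are all hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvidenceBundle:
    """Evidence attached to one protein-module assignment (hypothetical layout)."""
    protein: str
    module: str
    topology: float  # assumed to lie in [0, 1]
    semantic: float  # assumed to lie in [0, 1]
    go: float        # assumed to lie in [0, 1]

    def combined(self) -> float:
        # Unweighted mean: an assumption of this sketch, not the published rule.
        return (self.topology + self.semantic + self.go) / 3


def confidence_label(bundle, core_cutoff=0.7, peripheral_cutoff=0.4):
    """Map a bundle to 'core', 'peripheral', or 'uncertain' using assumed cutoffs."""
    score = bundle.combined()
    if score >= core_cutoff:
        return "core"
    if score >= peripheral_cutoff:
        return "peripheral"
    return "uncertain"


# Hypothetical assignments; a protein may appear in more than one module.
bundles = [
    EvidenceBundle("P1", "M1", topology=0.9, semantic=0.8, go=0.7),
    EvidenceBundle("P1", "M2", topology=0.5, semantic=0.4, go=0.6),
    EvidenceBundle("P2", "M2", topology=0.1, semantic=0.2, go=0.3),
]
labels = [(b.protein, b.module, confidence_label(b)) for b in bundles]
```

Keeping the three scores alongside the label, rather than only the label, is what lets a curator see which kind of evidence drove an assignment.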
