17.7 | LG | Mar 22 | Code
DMMRL: Disentangled Multi-Modal Representation Learning via Variational Autoencoders for Molecular Property Prediction
Long Xu, Junping Guo, Jianbo Zhao et al.
Molecular property prediction is a cornerstone of drug discovery and materials science, and it requires models that can disentangle complex structure-property relationships across diverse molecular modalities. Existing approaches frequently learn entangled representations that conflate structural, chemical, and functional factors, which limits interpretability and transferability. Furthermore, conventional methods inadequately exploit complementary information from graphs, sequences, and geometries, often relying on naive concatenation that neglects inter-modal dependencies. In this work, we propose DMMRL, which employs variational autoencoders to disentangle molecular representations into shared (structure-relevant) and private (modality-specific) latent spaces, enhancing both interpretability and predictive performance. The proposed variational disentanglement mechanism isolates the most informative features for property prediction, while orthogonality and alignment regularizations promote statistical independence and cross-modal consistency. Additionally, a gated attention fusion module adaptively integrates the shared representations, capturing complex inter-modal relationships. Experimental validation across seven benchmark datasets demonstrates DMMRL's superior performance relative to state-of-the-art approaches. The code and data underlying this article are freely available at https://github.com/xulong0826/DMMRL.
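The ingredients named in this abstract (a variational latent space, an orthogonality regularizer between shared and private codes, and a gated fusion of shared codes) can be illustrated with a minimal sketch. This is not the DMMRL implementation: the function names, the use of squared cosine similarity as the orthogonality penalty, and the softmax gate are all assumptions chosen for illustration.

```python
import math

def kl_to_standard_normal(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ) for a single latent vector."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv for m, lv in zip(mu, logvar))

def orthogonality_penalty(shared, private):
    """Squared cosine similarity (assumed form); zero when the vectors are orthogonal."""
    dot = sum(s * p for s, p in zip(shared, private))
    norm_s = math.sqrt(sum(s * s for s in shared))
    norm_p = math.sqrt(sum(p * p for p in private))
    if norm_s == 0.0 or norm_p == 0.0:
        return 0.0
    return (dot / (norm_s * norm_p)) ** 2

def gated_fusion(shared_by_modality, gate_scores):
    """Softmax the per-modality gate scores, then average the shared vectors."""
    top = max(gate_scores)
    exps = [math.exp(g - top) for g in gate_scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(shared_by_modality[0])
    return [
        sum(w * vec[i] for w, vec in zip(weights, shared_by_modality))
        for i in range(dim)
    ]
```

In a full model the gate scores would come from a learned attention layer and the three terms would be weighted and added to the prediction loss; here they are plain inputs so that each piece can be checked in isolation.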
LG | Sep 24, 2025 | Code
MSCoD: An Enhanced Bayesian Updating Framework with Multi-Scale Information Bottleneck and Cooperative Attention for Structure-Based Drug Design
Long Xu, Yongcai Chen, Fengshuo Liu et al.
Structure-Based Drug Design (SBDD) is a powerful strategy in computational drug discovery that uses three-dimensional protein structures to guide the design of molecules with improved binding affinity. However, capturing complex protein-ligand interactions across multiple scales remains challenging, as current methods often overlook the hierarchical organization and intrinsic asymmetry of these interactions. To address these limitations, we propose MSCoD, a novel Bayesian updating-based generative framework for structure-based drug design. MSCoD introduces a Multi-Scale Information Bottleneck (MSIB), which performs semantic compression at multiple abstraction levels for efficient hierarchical feature extraction. It also introduces a multi-head cooperative attention (MHCA) mechanism, which employs asymmetric protein-to-ligand attention to capture diverse interaction types while addressing the dimensionality disparity between proteins and ligands. Empirical studies show that MSCoD outperforms state-of-the-art methods on the benchmark dataset. Its real-world applicability is confirmed by case studies on difficult targets such as KRAS G12D (7XKJ). Additionally, the MSIB and MHCA modules prove transferable, boosting the performance of GraphDTA on standard drug-target affinity prediction benchmarks (Davis and Kiba). The code and data underlying this article are freely available at https://github.com/xulong0826/MSCoD.
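To make the idea of asymmetric attention between two sets of very different sizes concrete, the following is a minimal single-head scaled dot-product cross-attention sketch. It is not the MHCA module from MSCoD: the choice of ligand atoms as queries and protein residues as keys and values is an assumption, and there are no learned projections or multiple heads.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of scores."""
    top = max(row)
    exps = [math.exp(x - top) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys, values):
    """Each query row attends over all key rows and returns a weighted sum of values."""
    scale = math.sqrt(len(keys[0]))
    out_dim = len(values[0])
    output = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        output.append(
            [sum(w * v[i] for w, v in zip(weights, values)) for i in range(out_dim)]
        )
    return output

# Hypothetical toy features: 2 ligand atoms querying 3 protein residues.
ligand = [[1.0, 0.0], [0.0, 1.0]]
protein = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
ligand_context = cross_attention(ligand, protein, protein)
```

The score matrix is rectangular (ligand atoms by protein residues), so the output has one row per ligand atom regardless of how many residues the pocket contains, which is one simple way to handle the size disparity the abstract mentions.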
76.9 | LG | May 8
CellScientist: Dual-Space Hierarchical Orchestration for Closed-Loop Refinement of Virtual Cell Models
Mengran Li, Bo Li, Jiaying Wang et al.
Virtual Cell Modeling (VCM) requires models that not only predict perturbation responses, but also support targeted revision when predictions fail. Current LLM-assisted modeling workflows face a refinement-routing problem: prediction discrepancies are observed through executable implementations, but the relevant revision may involve the modeling assumption, representation design, implementation, or task constraint. Without structured feedback propagation across these levels, iterative refinement may repair code while failing to revise the assumption responsible for the discrepancy. We propose CellScientist, a dual-space hierarchical framework that couples a high-level hypothesis space with a low-level executable implementation space. CellScientist represents modeling decisions as structured states, realizes them as admissible programs under task and interface constraints, and routes execution discrepancies back to targeted hypothesis or implementation updates. This enables a closed Hypothesis -> Implementation -> Hypothesis loop where failures become structured signals for model refinement rather than debugging events. Across morphology and transcriptomic benchmarks, with additional single-cell perturbation evaluations, the final executable models selected by CellScientist improve over reference baselines under fixed split and evaluation protocols, while the workflow produces auditable refinement traces.
NE | Apr 1, 2019
A Hybrid Precipitation Prediction Method based on Multicellular Gene Expression Programming
Hongya Li, Yuzhong Peng, Chuyan Deng et al.
Prompt and accurate precipitation forecasting is important for regional water resource management, flood prevention, and the planning of daily activities and production; however, the non-linear and non-stationary characteristics of precipitation data, together with noise, seriously degrade forecast accuracy. This paper combines multicellular gene expression programming, which offers strong function-mining ability, with wavelet analysis, which offers strong denoising and fine-feature extraction, to build a precipitation forecast model, and proposes the WTGEPRP algorithm for estimating meteorological precipitation. Simulation experiments on actual precipitation data from Zhengzhou, Nanning, and Melbourne, Australia indicate that WTGEPRP fits and forecasts better than the Multicellular Gene Expression Programming-based Hybrid Model for Precipitation Prediction Coupled with EMD, Support Vector Regression, BP Neural Network, Multicellular Gene Expression Programming, and Gene Expression Programming, and that it has good application prospects.
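The wavelet denoising step that precedes model fitting can be sketched as follows. The abstract does not state which wavelet, decomposition depth, or thresholding rule WTGEPRP uses, so this example assumes a one-level Haar transform with soft thresholding of the detail coefficients, purely for illustration.

```python
import math

INV_SQRT2 = 1.0 / math.sqrt(2.0)

def haar_forward(signal):
    """One-level orthonormal Haar transform of an even-length signal."""
    approx = [(signal[i] + signal[i + 1]) * INV_SQRT2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * INV_SQRT2 for i in range(0, len(signal), 2)]
    return approx, detail

def haar_inverse(approx, detail):
    """Invert haar_forward, interleaving the reconstructed sample pairs."""
    signal = []
    for a, d in zip(approx, detail):
        signal.append((a + d) * INV_SQRT2)
        signal.append((a - d) * INV_SQRT2)
    return signal

def soft_threshold(coeffs, threshold):
    """Shrink each coefficient toward zero by `threshold`, clamping at zero."""
    return [math.copysign(max(abs(c) - threshold, 0.0), c) for c in coeffs]

def denoise(signal, threshold):
    """Suppress small detail coefficients, keep the approximation untouched."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even for a one-level Haar transform")
    approx, detail = haar_forward(signal)
    return haar_inverse(approx, soft_threshold(detail, threshold))
```

With a threshold of zero the signal is reconstructed unchanged; with a threshold larger than every detail coefficient, each pair of neighbouring samples collapses to its mean. The denoised series would then be passed to the regression stage.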
NE | Apr 1, 2019
A Seft-adaptive Multicellular GEP Algorithm Based On Fuzzy Control For Function Optimization
Chuyan Deng, Yuzhong Peng, Hongya Li et al.
To improve the global optimization ability of the traditional GEP algorithm, a multicellular gene expression programming algorithm based on fuzzy control (MGEP-FC) is proposed. MGEP-FC describes the magnitudes of the crossover rate, mutation rate, and real-number mutation rate by constructing fuzzy membership functions. According to how concentrated or dispersed the individual fitness values in the population are, the crossover rate, mutation rate, and real-number-set mutation rate of the genetic operations are adjusted dynamically. To maintain population diversity throughout the iterative process, a new genetic operation scheme is designed: new individuals are combined with the parent population to build a temporary population, and the diversity of the temporary population and the subpopulation is optimized. Results on 12 benchmark optimization experiments show that MGEP-FC achieves large improvements in stability, global convergence, and optimization speed.
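The idea of adjusting an operator rate from the spread of fitness values can be sketched with a two-rule fuzzy controller. This is not the MGEP-FC rule base: the dispersion measure, the linear membership functions, the weighted-average defuzzification, and the example rate bounds are all assumptions made for illustration.

```python
import statistics

def dispersion(fitness):
    """Population standard deviation divided by the fitness range (0 = concentrated)."""
    spread = max(fitness) - min(fitness)
    if spread == 0:
        return 0.0
    return statistics.pstdev(fitness) / spread

def adaptive_rate(fitness, low_rate, high_rate):
    """Blend two rates by the fuzzy memberships 'concentrated' and 'dispersed'.

    Rule 1: if fitness is concentrated, use high_rate (encourage exploration).
    Rule 2: if fitness is dispersed, use low_rate (preserve good individuals).
    """
    # The standard deviation never exceeds half the range, so 0.5 marks 'fully dispersed'.
    dispersed = min(1.0, dispersion(fitness) / 0.5)
    concentrated = 1.0 - dispersed
    return concentrated * high_rate + dispersed * low_rate

# Hypothetical bounds for a mutation rate.
converged = adaptive_rate([3.0, 3.0, 3.0, 3.0], low_rate=0.02, high_rate=0.20)
scattered = adaptive_rate([0.0, 0.0, 1.0, 1.0], low_rate=0.02, high_rate=0.20)
```

A converged population yields the high rate and a maximally scattered one yields the low rate, with a smooth blend in between. The same function could be called separately for crossover and real-number mutation with different bounds.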