Lijie Su

LG
h-index7
3papers
Novelty53%
AI Score42

3 Papers

LGMar 11Code
Multi-objective Genetic Programming with Multi-view Multi-level Feature for Enhanced Protein Secondary Structure Prediction

Yining Qian, Lijie Su, Meiling Xu et al.

Predicting protein secondary structure is essential for understanding protein function and advancing drug discovery. However, the intricate sequence-structure relationship poses significant challenges for accurate modeling. To address these, we propose MOGP-MMF, a multi-objective genetic programming framework that reformulates PSSP as an automated optimization task focused on feature selection and fusion. Specifically, MOGP-MMF introduces a multi-view multi-level representation strategy that integrates evolutionary, semantic, and newly introduced structural views to capture the comprehensive protein folding logic. Leveraging an enriched operator set, the framework evolves both linear and nonlinear fusion functions, effectively capturing high-order feature interactions while reducing fusion complexity. To resolve the accuracy-complexity trade-off, an improved multi-objective GP algorithm is developed, incorporating a knowledge transfer mechanism that utilizes prior evolutionary experience to guide the population toward global optima. Extensive experiments across seven benchmark datasets demonstrate that MOGP-MMF surpasses state-of-the-art methods, particularly in Q8 accuracy and structural integrity. Furthermore, MOGP-MMF generates a diverse set of non-dominated solutions, offering flexible model selection schemes for various practical application scenarios. The source code is available on GitHub: https://github.com/qian-ann/MOGP-MMF/tree/main.

IRMar 23, 2025Code
Dynamic Topic Analysis in Academic Journals using Convex Non-negative Matrix Factorization Method

Yang Yang, Tong Zhang, Jian Wu et al.

With the rapid advancement of large language models, academic topic identification and topic evolution analysis are crucial for enhancing AI's understanding capabilities. Dynamic topic analysis provides a powerful approach to capturing and understanding the temporal evolution of topics in large-scale datasets. This paper presents a two-stage dynamic topic analysis framework that incorporates convex optimization to improve topic consistency, sparsity, and interpretability. In Stage 1, a two-layer non-negative matrix factorization (NMF) model is employed to extract annual topics and identify key terms. In Stage 2, a convex optimization algorithm refines the dynamic topic structure using the convex NMF (cNMF) model, further enhancing topic integration and stability. Applying the proposed method to IEEE journal abstracts from 2004 to 2022 effectively identifies and quantifies emerging research topics, such as COVID-19 and digital twins. By optimizing sparsity differences in the clustering feature space between traditional and emerging research topics, the framework provides deeper insights into topic evolution and ranking analysis. Moreover, the NMF-cNMF model demonstrates superior stability in topic consistency. At sparsity levels of 0.4, 0.6, and 0.9, the proposed approach improves topic ranking stability by 24.51%, 56.60%, and 36.93%, respectively. The source code (to be open after publication) is available at https://github.com/meetyangyang/CDNMF.

LGMay 31, 2023
A Novel Black Box Process Quality Optimization Approach based on Hit Rate

Yang Yang, Jian Wu, Xiangman Song et al.

Hit rate is a key performance metric in predicting process product quality in integrated industrial processes. It represents the percentage of products accepted by downstream processes within a controlled range of quality. However, optimizing hit rate is a non-convex and challenging problem. To address this issue, we propose a data-driven quasi-convex approach that combines factorial hidden Markov models, multitask elastic net, and quasi-convex optimization. Our approach converts the original non-convex problem into a set of convex feasible problems, achieving an optimal hit rate. We verify the convex optimization property and quasi-convex frontier through Monte Carlo simulations and real-world experiments in steel production. Results demonstrate that our approach outperforms classical models, improving hit rates by at least 41.11% and 31.01% on two real datasets. Furthermore, the quasi-convex frontier provides a reference explanation and visualization for the deterioration of solutions obtained by conventional models.