Yan-Li Lee

LG
h-index: 3
4 papers
86 citations
Novelty: 53%
AI Score: 39

4 Papers

LG | Jun 6, 2017 | Code
A generalized method toward drug-target interaction prediction via low-rank matrix projection

Ratha Pech, Dong Hao, Yan-Li Lee et al.

Drug-target interaction (DTI) prediction plays a very important role in drug development and drug discovery. Biochemical experiments or in vitro methods are expensive, laborious, and time-consuming. Therefore, in silico approaches, including docking simulation and machine learning, have been proposed to solve this problem. In particular, machine learning approaches have attracted increasing attention recently. However, in addition to the known drug-target interactions, most machine learning methods require extra characteristic information such as chemical structures, genome sequences, and binding types. Whenever such information is not available, they may perform poorly. Very recently, similarity-based link prediction methods were extended to bipartite networks, so they can be applied to the DTI prediction problem using topological information only. In this work, we propose a method based on low-rank matrix projection to solve the DTI prediction problem. On the one hand, when no extra characteristic information about drugs or targets is available, the proposed method utilizes only the known interactions. On the other hand, the proposed method can also utilize extra characteristic information when it is available, and its performance is then remarkably improved. Moreover, the proposed method can predict interactions for new drugs or targets for which no interactions are known, only some characteristic information. We compare the proposed method with ten baseline methods: six similarity-based methods that utilize only the known interactions and four methods that utilize the extra characteristic information. The datasets and code implementing the simulations are available at https://github.com/rathapech/DTI_LMP.

LG | Sep 10, 2025
Data Skeleton Learning: Scalable Active Clustering with Sparse Graph Structures

Wen-Bo Xie, Xun Fu, Bin Chen et al.

In this work, we focus on the efficiency and scalability of pairwise constraint-based active clustering, which is crucial for processing large-scale data in applications such as data mining, knowledge annotation, and AI model pre-training. Our goals are threefold: (1) to reduce computational costs for iterative clustering updates; (2) to enhance the impact of user-provided constraints to minimize annotation requirements for precise clustering; and (3) to cut down memory usage in practical deployments. To achieve these aims, we propose a graph-based active clustering algorithm that utilizes two sparse graphs: one for representing relationships between data (our proposed data skeleton) and another for updating this data skeleton. These two graphs work in concert, enabling the refinement of connected subgraphs within the data skeleton to create nested clusters. Our empirical analysis confirms that the proposed algorithm consistently achieves more accurate clustering with dramatically fewer user-provided constraints, and outperforms its counterparts in terms of computational performance and scalability, while maintaining robustness across various distance metrics.

SI | Mar 14, 2021
Collaborative Filtering Approach to Link Prediction

Yan-Li Lee, Tao Zhou

Link prediction is a fundamental challenge in network science. Among various methods, local similarity indices are widely used for their favorable trade-off between cost and performance. However, their performance is not robust: for some networks, local indices are highly competitive with state-of-the-art algorithms, while for others they perform very poorly. Inspired by techniques developed for recommender systems, we propose an enhancement framework for local indices based on collaborative filtering (CF). Considering the delicate but important difference between personalized recommendation and link prediction, we further propose an improved framework named self-included collaborative filtering (SCF). The SCF framework significantly improves the accuracy and robustness of well-known local indices. Combining the SCF framework with a simple local index can produce an index with competitive performance and much lower complexity compared with elaborately designed state-of-the-art algorithms.
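
The abstract does not spell out the CF or SCF formulas, so the following is only a minimal sketch of the general idea, not the authors' definition. It uses the common-neighbors count as the local index and assumes that the CF score of a candidate link (x, y) sums the local similarity between x and each neighbor of y, and that the "self-included" variant also adds the similarity between x and y itself. The function names and the toy graph are hypothetical.

```python
def common_neighbors(adj, x, y):
    """Local similarity index: number of neighbors shared by x and y."""
    return len(adj[x] & adj[y])


def cf_score(adj, x, y, self_included=False):
    """CF-style score for the candidate link (x, y).

    ASSUMPTION (not taken from the abstract): the score sums the local
    similarity between x and every neighbor z of y. With
    self_included=True, y itself is added to that set, so the plain
    local index between x and y also contributes.
    """
    voters = set(adj[y])
    if self_included:
        voters.add(y)
    voters.discard(x)  # x never votes for its own links
    return sum(common_neighbors(adj, x, z) for z in voters)


# Toy undirected graph as an adjacency dict of sets (hypothetical data).
# Edges: 0-1, 0-2, 1-2, 1-3, 2-3, 3-4
adj = {
    0: {1, 2},
    1: {0, 2, 3},
    2: {0, 1, 3},
    3: {1, 2, 4},
    4: {3},
}

plain = common_neighbors(adj, 0, 3)
cf = cf_score(adj, 0, 3)
scf = cf_score(adj, 0, 3, self_included=True)
print(plain, cf, scf)
```

Every quantity here depends only on a node's immediate neighborhood, which is the low-complexity property the abstract attributes to local indices.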

IR | Jul 9, 2019
Hierarchical Clustering Supported by Reciprocal Nearest Neighbors

Wen-Bo Xie, Yan-Li Lee, Cong Wang et al.

Clustering is a fundamental analysis tool that aims to classify data points into groups based on their similarity or distance. It has found successful applications across the natural and social sciences, including biology, physics, economics, chemistry, astronomy, and psychology. Among the numerous existing algorithms, hierarchical clustering algorithms have a particular advantage: they provide results at different resolutions without any predetermined number of clusters and reveal the organization of the resulting clusters. At the same time, they suffer from a variety of drawbacks and are thus either time-consuming or inaccurate. We propose a novel hierarchical clustering approach based on the simple hypothesis that two reciprocal nearest data points should be grouped in one cluster. Extensive tests on data sets across multiple domains show that our method is much faster and more accurate than the state-of-the-art benchmarks. We further extend our method to the community detection problem in real networks, achieving remarkably better results than the well-known Girvan-Newman algorithm.
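
The core hypothesis, that two points which are each other's nearest neighbor belong in the same cluster, can be illustrated with a short sketch. This shows only how reciprocal nearest-neighbor pairs might be detected in one pass using Euclidean distance; it is not the paper's full hierarchical procedure, and the function names and sample points are hypothetical.

```python
import math


def nearest_neighbor(points, i):
    """Index of the point closest to points[i], excluding i itself.

    Returns None when there is no other point.
    """
    best_j, best_d = None, math.inf
    for j, q in enumerate(points):
        if j == i:
            continue
        d = math.dist(points[i], q)
        if d < best_d:
            best_j, best_d = j, d
    return best_j


def reciprocal_pairs(points):
    """All pairs (i, j), i < j, where i and j are each other's nearest neighbor."""
    nn = [nearest_neighbor(points, i) for i in range(len(points))]
    return sorted(
        (i, j)
        for i, j in enumerate(nn)
        if j is not None and i < j and nn[j] == i
    )


# Hypothetical 2-D data: two tight pairs and one isolated point.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.2, 5.0), (2.0, 2.0)]
pairs = reciprocal_pairs(points)
print(pairs)  # [(0, 1), (2, 3)]
```

The isolated point at index 4 has point 1 as its nearest neighbor, but point 1's nearest neighbor is point 0, so the relation is not reciprocal and point 4 is left unpaired in this pass. A hierarchical scheme would presumably repeat such merging on cluster representatives at coarser levels, though the abstract does not give those details.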