LG, AI, GN | Oct 13, 2024

Lower-dimensional projections of cellular expression improves cell type classification from single-cell RNA sequencing

arXiv:2410.09964v2 | 1 citation | h-index: 2 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses the need for accurate and efficient cell type classification in biomedical research, offering a simple and resource-friendly solution, though it appears incremental as it builds on existing projection and deep learning techniques.

The researchers tackled the problem of cell type classification from single-cell RNA sequencing data by proposing EnProCell, a reference-based method that uses an ensemble of PCA and MDA for lower-dimensional projections and a deep neural network for classification, achieving high accuracy (up to 99.52%) and F1 scores (up to 99.07%) across multiple datasets.

Single-cell RNA sequencing (scRNA-seq) enables the study of cellular diversity at the single-cell level. It provides a global view of cell-type specification during the onset of biological mechanisms such as developmental processes and human organogenesis. Various statistical, machine learning, and deep learning-based methods have been proposed for cell-type classification. Most of these methods use unsupervised lower-dimensional projections obtained from large reference data. In this work, we propose a reference-based method for cell type classification, called EnProCell. In the first phase, EnProCell computes lower-dimensional projections that capture both high variance and class separability through an ensemble of principal component analysis (PCA) and multiple discriminant analysis (MDA). In the second phase, EnProCell trains a deep neural network on the lower-dimensional representation of the data to classify cell types. The proposed method outperformed existing state-of-the-art methods when tested on four different datasets produced by different single-cell sequencing technologies. EnProCell showed higher accuracy (98.91%) and F1 score (98.64%) than other methods when predicting reference cell types from reference datasets. Similarly, EnProCell performed better than existing methods when predicting cell types for data with unknown cell types (query) from reference datasets (accuracy: 99.52%; F1 score: 99.07%). In addition to improved performance, the proposed methodology is simple and does not require additional computational resources or time. EnProCell is available at https://github.com/umar1196/EnProCell.
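The two-phase idea described in the abstract can be illustrated with a minimal sketch in scikit-learn. This is not the authors' implementation (see their repository for that). It assumes that the "ensemble" of PCA and MDA means concatenating the two projections, uses `LinearDiscriminantAnalysis` as a multi-class stand-in for MDA, uses a small `MLPClassifier` in place of the paper's deep network, and runs on synthetic data rather than real scRNA-seq counts. The component counts and layer sizes are arbitrary illustrative choices.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a cells x genes matrix (assumption: not real scRNA-seq data).
X, y = make_classification(
    n_samples=800, n_features=200, n_informative=20,
    n_classes=4, n_clusters_per_class=1, class_sep=2.0, random_state=0,
)
# Treat the held-out split as the "query" cells with hidden labels.
X_ref, X_query, y_ref, y_query = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

n_classes = len(np.unique(y_ref))
n_pca = 10  # hypothetical number of principal components

# Phase 1: concatenate a variance-preserving projection (PCA) with a
# class-separating one (LDA yields at most n_classes - 1 discriminant axes).
projection = FeatureUnion([
    ("pca", PCA(n_components=n_pca, random_state=0)),
    ("mda", LinearDiscriminantAnalysis(n_components=n_classes - 1)),
])

# Phase 2: train a neural network classifier on the projected features.
model = Pipeline([
    ("scale", StandardScaler()),
    ("project", projection),
    ("classify", MLPClassifier(hidden_layer_sizes=(64, 32),
                               max_iter=500, random_state=0)),
])
model.fit(X_ref, y_ref)

# Shape of the low-dimensional representation and accuracy on the query split.
projected_shape = model[:-1].transform(X_query).shape
query_accuracy = model.score(X_query, y_query)
print(projected_shape, round(query_accuracy, 3))
```

Concatenating the projections keeps the supervised and unsupervised views side by side, so the classifier sees `n_pca + (n_classes - 1)` features per cell instead of the full gene dimension.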

Code Implementations: 1 repo
