SIMLR: A Tool for Large-Scale Genomic Analyses by Multi-Kernel Learning
This tool addresses the challenge of interpreting single-cell genomic data for researchers in bioinformatics, though it appears incremental as it builds on existing multi-kernel learning methods.
The authors tackled the problem of analyzing heterogeneous genomic data by developing SIMLR, a tool that learns a sample-to-sample similarity measure using multi-kernel learning, which improved clustering performance and provided better visualization on public datasets.
Here we present SIMLR (Single-cell Interpretation via Multi-kernel LeaRning), an open-source tool that implements a novel framework to learn a sample-to-sample similarity measure from expression data observed for heterogeneous samples. SIMLR can be used for dimension reduction, clustering, and visualization of heterogeneous populations of samples. SIMLR was benchmarked against state-of-the-art methods for these three tasks on several public datasets; the benchmarks showed it to be scalable and capable of greatly improving clustering performance, and its improved visualizations make the data more interpretable.
Availability and Implementation: SIMLR is available on GitHub in both R and MATLAB implementations. It is also available as an R package on http://bioconductor.org.
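To make the idea of a multi-kernel sample-to-sample similarity concrete, the following is a minimal sketch in plain Python. It is not SIMLR's algorithm or API: it only averages several Gaussian kernels with fixed, uniform weights, whereas SIMLR learns the similarity from the data. The function names, the bandwidths in `sigmas`, and the toy `expression` matrix are all assumptions made for illustration.

```python
import math


def squared_distances(samples):
    """Pairwise squared Euclidean distances between rows of `samples`."""
    n = len(samples)
    return [
        [sum((a - b) ** 2 for a, b in zip(samples[i], samples[j])) for j in range(n)]
        for i in range(n)
    ]


def gaussian_kernel(sq_dists, sigma):
    """Gaussian kernel matrix for one bandwidth `sigma`."""
    return [[math.exp(-d / (2.0 * sigma ** 2)) for d in row] for row in sq_dists]


def combined_similarity(samples, sigmas=(0.5, 1.0, 2.0), weights=None):
    """Weighted sum of Gaussian kernels, one per bandwidth.

    Assumption: weights are fixed (uniform by default). A multi-kernel
    learning method such as SIMLR would learn them instead.
    """
    if weights is None:
        weights = [1.0 / len(sigmas)] * len(sigmas)
    sq_dists = squared_distances(samples)
    kernels = [gaussian_kernel(sq_dists, sigma) for sigma in sigmas]
    n = len(samples)
    return [
        [sum(w * k[i][j] for w, k in zip(weights, kernels)) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical toy data: 4 samples (rows) by 3 genes (columns),
# built so that samples 0-1 and samples 2-3 form two groups.
expression = [
    [1.0, 0.9, 0.1],
    [1.1, 1.0, 0.0],
    [0.0, 0.2, 1.0],
    [0.1, 0.1, 1.2],
]
similarity = combined_similarity(expression)
```

In this toy matrix, samples within the same group receive a higher combined similarity than samples from different groups, which is the property a downstream clustering or visualization step would rely on.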