GN · LG · Nov 15, 2021

Machine Learning for Genomic Data

arXiv:2111.08507v1
Originality: Synthesis-oriented
AI Analysis

This paper addresses a challenge faced by genomics researchers: analyzing gene expression data with only a few timepoints. It appears incremental, however, because it combines existing methods without a clear new breakthrough.

The paper tackles the problem of applying machine learning to short time-series gene expression data, where standard algorithms often fail. It explores model-based clustering techniques, combining K-Means, Gaussian Mixture Models, Bayesian Networks, and Hidden Markov Models with Expectation Maximization.

This report explores the application of machine learning techniques to short time-series gene expression data. Although standard machine learning algorithms work well on longer time series, they often fail to find meaningful insights from fewer timepoints. In this report, we explore model-based clustering techniques. We combine popular unsupervised learning techniques such as K-Means, Gaussian Mixture Models, Bayesian Networks, and Hidden Markov Models with the well-known Expectation Maximization algorithm. K-Means and Gaussian Mixture Models are fairly standard, while Hidden Markov Model and Bayesian Network clustering are more novel ideas that suit time-series gene expression data.
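
To make the idea of model-based clustering with Expectation Maximization concrete, here is a minimal sketch of EM for a Gaussian Mixture Model applied to short expression profiles. It is not the report's implementation. The function name `fit_gmm_em`, the diagonal covariance, the farthest-point initialisation, and the synthetic five-timepoint data are all assumptions made for illustration.

```python
import numpy as np

def fit_gmm_em(X, k, n_iter=100, seed=0, var_floor=1e-6):
    """Fit a diagonal-covariance Gaussian mixture to the rows of X with EM.

    Hypothetical helper: X is (genes x timepoints), k is the cluster count.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape

    # Farthest-point initialisation (an assumption, chosen for stability):
    # start from a random profile, then repeatedly add the profile that is
    # farthest from every mean picked so far.
    means = [X[rng.integers(n)]]
    for _ in range(k - 1):
        d2 = np.min([np.sum((X - m) ** 2, axis=1) for m in means], axis=0)
        means.append(X[np.argmax(d2)])
    means = np.array(means, dtype=float)
    variances = np.tile(X.var(axis=0) + var_floor, (k, 1))
    weights = np.full(k, 1.0 / k)

    for _ in range(n_iter):
        # E-step: log(weight * N(x | mean, diag(var))) per profile and cluster.
        diff = X[:, None, :] - means[None, :, :]              # (n, k, d)
        log_pdf = -0.5 * (
            np.sum(diff ** 2 / variances[None], axis=2)
            + np.sum(np.log(2 * np.pi * variances), axis=1)[None, :]
        )
        log_joint = np.log(weights)[None, :] + log_pdf
        # Normalise in log space to avoid underflow.
        log_norm = np.logaddexp.reduce(log_joint, axis=1, keepdims=True)
        resp = np.exp(log_joint - log_norm)                   # (n, k)

        # M-step: re-estimate weights, means, and variances from the
        # soft assignments.
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / nk.sum()
        means = (resp.T @ X) / nk[:, None]
        diff = X[:, None, :] - means[None, :, :]
        variances = np.einsum("nk,nkd->kd", resp, diff ** 2) / nk[:, None]
        variances += var_floor

    return weights, means, variances, resp

# Synthetic stand-in for short time-series expression data: 30 "rising"
# and 30 "falling" profiles over 5 timepoints (not real measurements).
data_rng = np.random.default_rng(1)
t = np.arange(5)
rising = 1.0 * t + data_rng.normal(0.0, 0.3, size=(30, 5))
falling = 4.0 - 1.0 * t + data_rng.normal(0.0, 0.3, size=(30, 5))
X = np.vstack([rising, falling])

weights, means, variances, resp = fit_gmm_em(X, k=2)
labels = resp.argmax(axis=1)  # hard cluster label per profile
```

K-Means can be read as the hard-assignment limit of the same loop: replace the soft responsibilities with a one-hot nearest-mean assignment and keep the variances fixed and equal. Note that a plain mixture like this treats each timepoint as an independent feature, so it ignores temporal ordering, which is presumably part of what motivates the Hidden Markov Model and Bayesian Network variants mentioned above.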

