LG IR ML Feb 24, 2021

Similarity measure for sparse time course data based on Gaussian processes

arXiv:2102.12342v1 1.6 Has Code

Originality Synthesis-oriented

AI Analysis

This work addresses a domain-specific problem for researchers analyzing sparse biological time series, but it is incremental as it builds on existing Gaussian process and clustering methods.

The authors tackle the problem of measuring similarity in sparsely sampled time course data, such as gene transcriptomics data. They propose a log-likelihood ratio measure based on Gaussian processes that is more robust to noise, and report improved performance in clustering experiments on synthetic and real data.

We propose a similarity measure for sparsely sampled time course data in the form of a log-likelihood ratio of Gaussian processes (GP). The proposed GP similarity is similar to a Bayes factor and provides enhanced robustness to noise in sparse time series, such as those found in various biological settings, e.g., gene transcriptomics. We show that the GP measure is equivalent to the Euclidean distance when the noise variance in the GP is negligible compared to the noise variance of the signal. Our numerical experiments on both synthetic and real data show improved performance of the GP similarity when used in conjunction with two distance-based clustering methods.
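The abstract does not state the formula, so the following is only a minimal sketch of what a GP log-likelihood ratio between two series could look like, not the authors' definition. It assumes both series are observed at the same time points, a zero-mean GP with an RBF kernel, and a ratio that compares "both series are noisy observations of one shared latent function" against "each series comes from its own independent GP". The function names and the hyperparameters `length_scale`, `signal_var`, and `noise_var` are hypothetical choices made for illustration.

```python
import numpy as np


def rbf_kernel(t1, t2, length_scale=1.0, signal_var=1.0):
    """Squared-exponential covariance between two sets of time points."""
    d = t1[:, None] - t2[None, :]
    return signal_var * np.exp(-0.5 * (d / length_scale) ** 2)


def gaussian_log_density(y, cov):
    """Log density of y under a zero-mean Gaussian with covariance cov."""
    chol = np.linalg.cholesky(cov)
    alpha = np.linalg.solve(chol, y)
    log_det_half = np.log(np.diag(chol)).sum()
    return -0.5 * alpha @ alpha - log_det_half - 0.5 * len(y) * np.log(2 * np.pi)


def gp_similarity(t, x, y, length_scale=1.0, signal_var=1.0, noise_var=0.1):
    """Assumed log-likelihood ratio: shared latent function vs. independent GPs."""
    K = rbf_kernel(t, t, length_scale, signal_var)
    K_single = K + noise_var * np.eye(len(t))
    # Shared-function hypothesis: x and y are correlated through the same
    # latent GP draw, with independent observation noise on each series.
    K_joint = np.block([[K_single, K], [K, K_single]])
    log_joint = gaussian_log_density(np.concatenate([x, y]), K_joint)
    # Independence hypothesis: each series has its own latent GP draw.
    log_separate = gaussian_log_density(x, K_single) + gaussian_log_density(y, K_single)
    return log_joint - log_separate


# Usage: six sparse samples of one period of a sine wave.
t = np.linspace(0.0, 1.0, 6)
x = np.sin(2 * np.pi * t)
sim_same = gp_similarity(t, x, x)       # identical series
sim_flipped = gp_similarity(t, x, -x)   # sign-flipped series
```

Under these assumptions a larger value means the data favour a shared underlying function, which is what makes the ratio resemble a Bayes factor. To use it with a distance-based clustering method, the score would still have to be mapped to a dissimilarity (for example by negating it), and that mapping is another choice the abstract does not specify.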
