MLLGSep 13, 2018

MSc Dissertation: Exclusive Row Biclustering for Gene Expression Using a Combinatorial Auction Approach

arXiv:1809.05077v25 citations
AI Analysis

This work addresses a specific clustering need in bioinformatics, such as grouping cancer patients by single cancer types, but is incremental as it builds on existing methods.

The paper tackles the exclusive row biclustering problem for gene expression data, where each row belongs to only one bicluster, by combining existing biclustering algorithms with combinatorial auction techniques and using a Gap statistic approach for threshold tuning; it demonstrates the method on synthetic and real-world data, showing success in identifying large span non-overlapping row submatrices.

The availability of large microarray data has led to a growing interest in biclustering methods in the past decade. Several algorithms have been proposed to identify subsets of genes and conditions according to different similarity measures and under varying constraints. In this paper we focus on the exclusive row biclustering problem for gene expression data sets, in which each row can only be a member of a single bicluster while columns can participate in multiple ones. This type of biclustering may be adequate, for example, for clustering groups of cancer patients where each patient (row) is expected to be carrying only a single type of cancer, while each cancer type is associated with multiple (and possibly overlapping) genes (columns). We present a novel method to identify these exclusive row biclusters through a combination of existing biclustering algorithms and combinatorial auction techniques. We devise an approach for tuning the threshold for our algorithm based on comparison to a null model in the spirit of the Gap statistic approach. We demonstrate our approach on both synthetic and real-world gene expression data and show its power in identifying large span non-overlapping rows sub matrices, while considering their unique nature. The Gap statistic approach succeeds in identifying appropriate thresholds in all our examples.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes