LG · Jun 2, 2015

On bicluster aggregation and its benefits for enumerative solutions

arXiv:1506.01077v1 · 7 citations
Originality: Incremental advance
AI Analysis

This paper addresses a specific challenge in biclustering analysis of noisy data and represents an incremental improvement over existing methods.

The paper tackles the problem of bicluster fragmentation and overlapping in noisy datasets by proposing two aggregation approaches, which significantly reduce the number of biclusters and consistently increase solution quality.

Biclustering involves the simultaneous clustering of objects and their attributes, thus defining local two-way clustering models. Recently, efficient algorithms were conceived to enumerate all biclusters in real-valued datasets. In this case, the solution comprises a complete set of maximal and non-redundant biclusters. However, the ability to enumerate biclusters revealed a challenging scenario: in noisy datasets, each true bicluster may become highly fragmented, and the fragments may overlap heavily. This prevents a direct analysis of the obtained results. To reverse the fragmentation, we propose two approaches for aggregating the whole set of enumerated biclusters: one based on single linkage and the other directly exploiting the rate of overlap. Both proposals were compared with each other and with the current state of the art in several experiments, and they not only significantly reduced the number of biclusters but also consistently increased the quality of the solution.
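
To make the idea of aggregation concrete, here is a minimal Python sketch. It is not the paper's algorithm: the abstract does not define the overlap measure or the merge rule, so the `overlap_rate` formula (shared cells divided by the size of the smaller bicluster), the `threshold` parameter, and the choice to merge by taking the union of rows and columns are all illustrative assumptions. It only shows how linking fragments whenever a pair overlaps enough, and following those links transitively (the single-linkage behaviour), can collapse many fragments into a few biclusters.

```python
from itertools import combinations


def overlap_rate(a, b):
    """Shared cells relative to the smaller bicluster (assumed definition)."""
    (rows_a, cols_a), (rows_b, cols_b) = a, b
    shared = len(rows_a & rows_b) * len(cols_a & cols_b)
    smaller = min(len(rows_a) * len(cols_a), len(rows_b) * len(cols_b))
    return shared / smaller if smaller else 0.0


def aggregate(biclusters, threshold=0.5):
    """Merge biclusters linked, directly or transitively, by overlap >= threshold.

    Each bicluster is a (rows, cols) pair of frozensets. The merged bicluster
    is the union of rows and the union of columns of its group, which is an
    assumption made for this sketch.
    """
    parent = list(range(len(biclusters)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Link every pair whose overlap rate reaches the threshold.
    for i, j in combinations(range(len(biclusters)), 2):
        if overlap_rate(biclusters[i], biclusters[j]) >= threshold:
            parent[find(i)] = find(j)

    # Collect the union of rows and columns for each connected group.
    groups = {}
    for i, (rows, cols) in enumerate(biclusters):
        root = find(i)
        merged_rows, merged_cols = groups.get(root, (frozenset(), frozenset()))
        groups[root] = (merged_rows | rows, merged_cols | cols)
    return list(groups.values())
```

With this sketch, three fragments that overlap in a chain end up in one aggregated bicluster even if the first and last fragments barely overlap each other, while an unrelated bicluster is left untouched. The all-pairs comparison is quadratic in the number of biclusters, which may be too slow for the large enumerations the abstract describes.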
