LG · ME · Jun 1, 2023

TriSig: Assessing the statistical significance of triclusters

arXiv:2306.00643v2 · 2 citations · h-index: 18 · Has Code
AI Analysis

This work addresses the issue of false positive discoveries in tensor data analysis for researchers in fields like disease progression and bioproduction, though it is incremental as it extends existing matrix methods to tensors.

The authors tackle spurious and redundant patterns in tensor data analysis by proposing a statistical framework for assessing the significance of triclusters, extending principles established for matrix data. They validate it on real-world biochemical and biotechnological case studies, which also reveal vulnerabilities in some triclustering algorithms.

Tensor data analysis allows researchers to uncover novel patterns and relationships that cannot be obtained from matrix data alone. The information inferred from these patterns provides valuable insights into disease progression, bioproduction processes, weather fluctuations, and group dynamics. However, spurious and redundant patterns hamper this process. This work proposes a statistical frame to assess the probability that patterns in tensor data deviate from null expectations, extending well-established principles for assessing the statistical significance of patterns in matrix data. A comprehensive discussion of binomial testing for false positive discoveries is provided in light of variable dependencies, temporal dependencies and misalignments, and p-value corrections under the Benjamini-Hochberg procedure. Results gathered from applying state-of-the-art triclustering algorithms to distinct real-world case studies in biochemical and biotechnological domains support the validity of the proposed statistical frame while revealing vulnerabilities of some triclustering searches. The proposed assessment can be incorporated into existing triclustering algorithms to mitigate false positive (spurious) discoveries and further prune the search space, reducing their computational complexity. Availability: The code is freely available at https://github.com/JupitersMight/TriSig under the MIT license.
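
To make the two statistical ingredients named in the abstract concrete, the following is a minimal sketch of a binomial tail test followed by a Benjamini-Hochberg adjustment. It is not the TriSig implementation and does not model the variable or temporal dependencies the paper discusses. The function names, the independence assumption, and the example numbers (trials, observed support, null pattern probability) are all hypothetical and chosen only for illustration.

```python
from math import comb


def binomial_tail(n, k, p):
    """Upper-tail probability P(X >= k) for X ~ Binomial(n, p).

    Illustrative reading: n is the number of observations, k the number
    that support a pattern, and p the probability of the pattern occurring
    in one observation under the null model (assumed independent trials).
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values stay monotone in the rank.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical candidate patterns: (observations, support, null probability).
candidates = [(100, 12, 0.05), (100, 7, 0.05), (100, 30, 0.25)]
raw = [binomial_tail(n, k, p) for n, k, p in candidates]
adjusted = benjamini_hochberg(raw)
# Keep the patterns whose adjusted p-value falls below a chosen level.
significant = [c for c, q in zip(candidates, adjusted) if q < 0.05]
```

Under these assumptions, a pattern is retained only when its observed support is unlikely under the null model even after correcting for the number of patterns tested.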
