ML, LG, PR, ST, AP · Jan 24, 2025

Statistical Verification of Linear Classifiers

arXiv:2501.14430v1 · h-index: 7 · Stat
Originality: Synthesis-oriented
AI Analysis

This work provides a statistical verification method for linear classifiers and applies it to gene expression analysis in breast cancer. The contribution is incremental: it focuses on refining bounds for a known test.

The authors address the problem of verifying whether a linear classifier meaningfully distinguishes between two classes. They propose a homogeneity test related to linear separability and apply it to confirm the significance of the IGFBP6 and ELOVL5 genes in detecting ER-positive breast cancer recurrence.

We propose a homogeneity test closely related to the concept of linear separability between two samples. Using the test, one can determine whether a linear classifier is merely "random" or effectively captures differences between two classes. We focus on establishing upper bounds for the test's p-value when applied to two-dimensional samples. Specifically, for normally distributed samples, we experimentally demonstrate that the upper bound is highly accurate. Using this bound, we evaluate classifiers designed to detect ER-positive breast cancer recurrence based on gene pair expression. Our findings confirm the significance of the IGFBP6 and ELOVL5 genes in this process.
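The question posed in the abstract, whether a linear classifier is merely "random" or captures a real difference between two samples, can be illustrated with a generic label-permutation test. The sketch below is not the authors' test and does not compute their upper bound, since the abstract does not specify either. It swaps in a standard permutation p-value for the training accuracy of a nearest-centroid rule, which is a linear classifier in two dimensions. The function names, sample sizes, and the synthetic normal samples are all assumptions made for illustration.

```python
import random


def centroid(points):
    """Mean of a list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def linear_accuracy(points, labels):
    """Training accuracy of a nearest-centroid rule (hypothetical helper).

    The rule is linear: its decision boundary is the perpendicular
    bisector of the segment joining the two class centroids.
    """
    ca = centroid([p for p, y in zip(points, labels) if y == 1])
    cb = centroid([p for p, y in zip(points, labels) if y == -1])
    # w.x + b >= 0 exactly when x is at least as close to ca as to cb.
    w = (ca[0] - cb[0], ca[1] - cb[1])
    b = -0.5 * (ca[0] ** 2 + ca[1] ** 2 - cb[0] ** 2 - cb[1] ** 2)
    correct = 0
    for p, y in zip(points, labels):
        predicted = 1 if w[0] * p[0] + w[1] * p[1] + b >= 0 else -1
        correct += predicted == y
    return correct / len(points)


def permutation_p_value(sample_a, sample_b, n_permutations=199, seed=0):
    """Permutation p-value for the observed linear training accuracy.

    Under the homogeneity hypothesis the labels are exchangeable, so the
    observed accuracy is compared with accuracies on shuffled labels.
    """
    rng = random.Random(seed)
    points = list(sample_a) + list(sample_b)
    labels = [1] * len(sample_a) + [-1] * len(sample_b)
    observed = linear_accuracy(points, labels)
    hits = 0
    for _ in range(n_permutations):
        shuffled = labels[:]
        rng.shuffle(shuffled)  # keeps class sizes, breaks any real signal
        if linear_accuracy(points, shuffled) >= observed:
            hits += 1
    return observed, (1 + hits) / (1 + n_permutations)


# Synthetic two-dimensional normal samples with clearly different means.
data_rng = random.Random(42)
sample_a = [(data_rng.gauss(0.0, 1.0), data_rng.gauss(0.0, 1.0)) for _ in range(40)]
sample_b = [(data_rng.gauss(3.0, 1.0), data_rng.gauss(3.0, 1.0)) for _ in range(40)]

accuracy, p_value = permutation_p_value(sample_a, sample_b)
print(f"training accuracy = {accuracy:.3f}, permutation p-value = {p_value:.3f}")
```

A small p-value here indicates that the observed accuracy is unlikely under exchangeable labels. The permutation approach is computationally heavier than an analytic bound of the kind the abstract describes, which is one reason such a bound is useful.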

Code Implementations: 1 repo