ML LG Apr 29, 2018

Simultaneous Parameter Learning and Bi-Clustering for Multi-Response Models

Ming Yu, Karthikeyan Natesan Ramamurthy, Addie Thompson, Aurélie Lozano

arXiv:1804.10961v12.72 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses the need for accurate parameter estimation and structure discovery in applications like multi-response Genome-Wide Association Studies, though it appears incremental as it builds on existing convex regularization methods.

The paper tackles the problem of estimating parameter matrices with unknown grouping structures in multi-response regression models, proposing two convex regularization formulations that simultaneously learn parameters and group structures, validated on simulations and real plant genotype-phenotype datasets.

We consider multi-response and multitask regression models, where the parameter matrix to be estimated is expected to have an unknown grouping structure. The groupings can be along tasks, or features, or both, the last one indicating a bi-cluster or "checkerboard" structure. Discovering this grouping structure along with parameter inference makes sense in several applications, such as multi-response Genome-Wide Association Studies (GWAS). This additional structure not only can be leveraged for more accurate parameter estimation, but also provides valuable information on the underlying data mechanisms (e.g., relationships among genotypes and phenotypes in GWAS). In this paper, we propose two formulations to simultaneously learn the parameter matrix and its group structures, based on convex regularization penalties. We present optimization approaches to solve the resulting problems and provide numerical convergence guarantees. Our approaches are validated on extensive simulations and real datasets concerning phenotypes and genotypes of plant varieties.
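
To make the "checkerboard" idea concrete, here is a minimal sketch, not the paper's exact formulation. It assumes a squared-error loss plus an unweighted pairwise fusion penalty on the rows (features) and columns (tasks) of the parameter matrix, in the style of convex clustering. The names `fusion_penalty`, `objective`, and `lam` are hypothetical and chosen for illustration only.

```python
import numpy as np


def fusion_penalty(theta):
    """Sum of Euclidean distances over all row pairs and all column pairs.

    Assumption: uniform pair weights. The penalty is zero only when all
    rows (and all columns) coincide, and it stays small when rows and
    columns fall into a few identical groups.
    """
    def pairwise(m):
        total = 0.0
        for i in range(m.shape[0]):
            for j in range(i + 1, m.shape[0]):
                total += np.linalg.norm(m[i] - m[j])
        return total

    return pairwise(theta) + pairwise(theta.T)


def objective(X, Y, theta, lam):
    """Squared-error loss for Y ~ X @ theta plus the fusion penalty (illustrative)."""
    residual = Y - X @ theta
    return 0.5 * np.linalg.norm(residual, "fro") ** 2 + lam * fusion_penalty(theta)


# A checkerboard parameter matrix: 2 feature groups x 2 task groups,
# each block constant (6 features, 4 tasks).
block_means = np.array([[1.0, -1.0],
                        [0.0, 2.0]])
theta_checker = np.kron(block_means, np.ones((3, 2)))

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 6))                             # synthetic design matrix
Y = X @ theta_checker + 0.1 * rng.normal(size=(50, 4))   # synthetic responses

print(theta_checker)
print(objective(X, Y, theta_checker, lam=0.1))
```

Only rows or columns from different groups contribute to the penalty here, which is why a fusion-type regularizer favors bi-clustered solutions. An actual solver would minimize `objective` over `theta`; that step is omitted.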
