Tensor clustering with algebraic constraints gives interpretable groups of crosstalk mechanisms in breast cancer
This work addresses the challenge of interpreting complex biological datasets for cancer researchers. Its contribution is incremental, since it builds on existing tensor and clustering methods.
The authors tackle the problem of extracting interpretable clusters from high-dimensional, multi-indexed biological data. They introduce a tensor-based clustering method with algebraic constraints and apply it to the responses of breast cancer cell lines to ligands, which yields a quantification of heterogeneity across subtypes and hypotheses about signalling mechanisms.
We introduce a tensor-based clustering method to extract sparse, low-dimensional structure from high-dimensional, multi-indexed datasets. The framework is designed to detect clusters in data subject to structural requirements, which we encode as algebraic constraints in a linear program. Our clustering method is general and can be tailored to a variety of applications in science and industry. We illustrate our method on a collection of experiments measuring the response of genetically diverse breast cancer cell lines to an array of ligands. Each experiment consists of a cell line-ligand combination, and contains time-course measurements of the early-signalling kinases MAPK and AKT at two different ligand dose levels. By imposing appropriate structural constraints and respecting the multi-indexed structure of the data, the analysis of clusters can be optimized for biological interpretation and therapeutic understanding. We then perform a systematic, large-scale exploration of mechanistic models of MAPK-AKT crosstalk for each cluster. This analysis allows us to quantify the heterogeneity of breast cancer cell subtypes, and leads to hypotheses about the signalling mechanisms that mediate the response of the cell lines to ligands.
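To make the idea of clustering under linear constraints concrete, the following is a minimal sketch and not the authors' formulation. It assumes a toy dataset indexed by (cell line, ligand), a cost that is linear in binary assignment variables (squared distance to a fixed representative experiment per cluster), and one hypothetical structural constraint: every cluster must contain at least `min_size` experiments. Brute-force enumeration stands in for a linear-programming solver so that the example needs only the standard library. All names and numbers are invented for illustration.

```python
from itertools import product

# Toy multi-indexed data: (cell_line, ligand) -> flattened time-course values.
# Names and numbers are hypothetical, not taken from the paper.
experiments = {
    ("line_A", "ligand_1"): [0.9, 0.8, 0.1, 0.1],
    ("line_A", "ligand_2"): [0.8, 0.9, 0.2, 0.1],
    ("line_B", "ligand_1"): [0.7, 0.7, 0.2, 0.2],
    ("line_B", "ligand_2"): [0.1, 0.2, 0.9, 0.8],
}


def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def constrained_assignment(data, medoids, min_size):
    """Assign each experiment to one cluster, minimising a linear cost.

    cost[i][c] plays the role of the coefficient of a binary variable
    x[i][c] ("experiment i belongs to cluster c"). The assumed constraint
    sum_i x[i][c] >= min_size is linear in those variables.
    """
    keys = sorted(data)
    k = len(medoids)
    cost = [[sq_dist(data[key], data[m]) for m in medoids] for key in keys]

    best_labels, best_cost = None, float("inf")
    # Enumerate every labelling; each experiment gets exactly one cluster.
    for labels in product(range(k), repeat=len(keys)):
        sizes = [labels.count(c) for c in range(k)]
        if min(sizes) < min_size:
            continue  # violates the assumed minimum-size constraint
        total = sum(cost[i][c] for i, c in enumerate(labels))
        if total < best_cost:
            best_labels, best_cost = labels, total

    if best_labels is None:
        raise ValueError("no assignment satisfies the constraint")
    return dict(zip(keys, best_labels)), best_cost


medoids = [("line_A", "ligand_1"), ("line_B", "ligand_2")]
free, free_cost = constrained_assignment(experiments, medoids, min_size=1)
balanced, balanced_cost = constrained_assignment(experiments, medoids, min_size=2)
```

In this toy example the unconstrained solution places three experiments in the first cluster, while the minimum-size constraint moves `("line_B", "ligand_1")` into the second cluster at a higher total cost. Enumeration grows exponentially with the number of experiments, so a realistic instance would need a proper linear or integer programming solver.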