Anna Seigal

ML
h-index: 16
Papers: 11
Citations: 243
Novelty: 47%
AI Score: 51

11 Papers

NA · May 20
Multi-subspace power method for decomposing partially symmetric tensors

Kexin Wang, João M. Pereira, Joe Kileel et al.

We present an algorithm for low rank decomposition of tensors of any symmetry type, from fully asymmetric to fully symmetric. It recovers the decomposition one summand at a time via the higher-order power method. This approach is known to fail in general: there need not be a relationship between the summands of a decomposition and the (partially symmetric) singular vector tuples (pSVTs) of the tensor. Our approach overcomes this problem by transforming the input to a tensor with orthonormal slices, via orthogonalization of a flattening. The summands of the decomposition of the original tensor can be recovered from the pSVTs of this new transformed tensor. We introduce a shifted power method for computing pSVTs and prove its global convergence. Numerical experiments demonstrate that our algorithm achieves higher accuracy and faster runtime than existing methods.
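As a point of reference for the power-method idea mentioned above, here is a minimal sketch of the classical (unshifted) symmetric higher-order power iteration. It is not the multi-subspace or shifted method of the paper. It assumes an order-3 symmetric tensor that is orthogonally decomposable with positive weights, a special case in which the summands do coincide with the fixed points of the iteration. The names `power_method`, `weights`, and `Q` are illustrative.

```python
import numpy as np

def power_method(T, x0, iters=100):
    """Classical symmetric power iteration x <- T(I, x, x) / ||T(I, x, x)||."""
    x = x0 / np.linalg.norm(x0)
    for _ in range(iters):
        y = np.einsum("ijk,j,k->i", T, x, x)  # contract T with x in two modes
        x = y / np.linalg.norm(y)
    lam = np.einsum("ijk,i,j,k->", T, x, x, x)  # value T(x, x, x) at the limit
    return lam, x

# Assumed test case: T = sum_r weights[r] * q_r (x) q_r (x) q_r with orthonormal q_r.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))
weights = np.array([3.0, 2.0, 1.5, 1.0])
T = np.einsum("r,ir,jr,kr->ijk", weights, Q, Q, Q)

lam, x = power_method(T, rng.standard_normal(4))
```

In this orthogonal setting the iterate aligns with one column of `Q` and `lam` approaches the matching weight. For general tensors this correspondence breaks down, which is the failure the abstract describes.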

ML · Nov 29, 2022
Linear Causal Disentanglement via Interventions

Chandler Squires, Anna Seigal, Salil Bhate et al.

Causal disentanglement seeks a representation of data involving latent variables that relate to one another via a causal model. A representation is identifiable if both the latent model and the transformation from latent to observed variables are unique. In this paper, we study observed variables that are a linear transformation of a linear latent causal model. Data from interventions are necessary for identifiability: if one latent variable is missing an intervention, we show that there exist distinct models that cannot be distinguished. Conversely, we show that a single intervention on each latent variable is sufficient for identifiability. Our proof uses a generalization of the RQ decomposition of a matrix that replaces the usual orthogonal and upper triangular conditions with analogues depending on a partial order on the rows of the matrix, with partial order determined by a latent causal model. We corroborate our theoretical results with a method for causal disentanglement that accurately recovers a latent causal model.
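For orientation, the classical RQ decomposition that the proof generalizes can be sketched as follows. This shows only the standard version (upper triangular times orthogonal), not the partial-order variant from the paper. The helper name `rq` is illustrative, and the construction via a QR factorization of the row-reversed transpose is one common approach.

```python
import numpy as np

def rq(A):
    """Return (R, Q) with A = R @ Q, R upper triangular, Q orthogonal (square A)."""
    # QR of the transpose of the row-reversed matrix: (P A)^T = Q1 R1.
    Q1, R1 = np.linalg.qr(A[::-1].T)
    # Then A = (P R1^T P)(P Q1^T), where P reverses the order of rows.
    R = R1.T[::-1, ::-1]
    Q = Q1.T[::-1]
    return R, Q

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
R, Q = rq(A)
```

Reversing both the rows and the columns of the lower triangular matrix `R1.T` is what makes `R` upper triangular.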

ME · May 6
Causal discovery under mean independence and linearity

Geert Mesters, Alvaro Ribot, Anna Seigal et al.

Causal discovery methods such as LiNGAM identify causal structure from observational data by assuming mutually independent disturbances. This assumption is fragile: shared volatility, common scale effects, or other forms of dependence can cause the methods to recover the wrong causal order, even with infinite data. We introduce the Linear Mean-Independent Acyclic Model (LiMIAM), which replaces full independence with weaker one-sided mean-independence restrictions on the disturbances. Under finite-order consequences of these restrictions, source nodes are generically identifiable, and hence a compatible causal order can be recovered recursively. Our proof is constructive and leads to DirectLiMIAM, a sequential residual-based algorithm for causal discovery under dependent noise. In simulations with mean-independent but dependent disturbances, DirectLiMIAM outperforms LiNGAM methods. A large-scale empirical application to the oil market highlights the implausibility of the independence assumption and the ability of DirectLiMIAM to recover a realistic causal ordering, from policy to production and from prices to inflation.

ML · Jul 5, 2024
Linear causal disentanglement via higher-order cumulants

Paula Leyes Carreno, Chiara Meroni, Anna Seigal

Linear causal disentanglement is a recent method in causal representation learning to describe a collection of observed variables via latent variables with causal dependencies between them. It can be viewed as a generalization of both independent component analysis and linear structural equation models. We study the identifiability of linear causal disentanglement, assuming access to data under multiple contexts, each given by an intervention on a latent variable. We show that one perfect intervention on each latent variable is sufficient and in the worst case necessary to recover parameters under perfect interventions, generalizing previous work to allow more latent than observed variables. We give a constructive proof that computes parameters via a coupled tensor decomposition. For soft interventions, we find the equivalence class of latent graphs and parameters that are consistent with observed data, via the study of a system of polynomial equations. Our results hold assuming the existence of non-zero higher-order cumulants, which implies non-Gaussianity of variables.

ML · Jan 21
Multi-context principal component analysis

Kexin Wang, Salil Bhate, João M. Pereira et al.

Principal component analysis (PCA) is a tool to capture factors that explain variation in data. Across domains, data are now collected across multiple contexts (for example, individuals with different diseases, cells of different types, or words across texts). While the factors explaining variation in data are undoubtedly shared across subsets of contexts, no tools currently exist to systematically recover such factors. We develop multi-context principal component analysis (MCPCA), a theoretical and algorithmic framework that decomposes data into factors shared across subsets of contexts. Applied to gene expression, MCPCA reveals axes of variation shared across subsets of cancer types and an axis whose variability in tumor cells, but not mean, is associated with lung cancer progression. Applied to contextualized word embeddings from language models, MCPCA maps stages of a debate on human nature, revealing a discussion between science and fiction over decades. These axes are not found by combining data across contexts or by restricting to individual contexts. MCPCA is a principled generalization of PCA to address the challenge of understanding factors underlying data across contexts.

ST · Oct 8, 2025
Beyond independent component analysis: identifiability and algorithms

Alvaro Ribot, Anna Seigal, Piotr Zwiernik

Independent Component Analysis (ICA) is a classical method for recovering latent variables with useful identifiability properties. For independent variables, cumulant tensors are diagonal; relaxing independence yields tensors whose zero structure generalizes diagonality. These models have been the subject of recent work in non-independent component analysis. We show that pairwise mean independence answers the question of how much one can relax independence: it is identifiable, any weaker notion is non-identifiable, and it contains the models previously studied as special cases. Our results apply to distributions with the required zero pattern at any cumulant tensor. We propose an algebraic recovery algorithm based on least-squares optimization over the orthogonal group. Simulations highlight robustness: enforcing full independence can harm estimation, while pairwise mean independence enables more stable recovery. These findings extend the classical ICA framework and provide a rigorous basis for blind source separation beyond independence.
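The statement that cumulant tensors of independent variables are diagonal can be checked empirically. The sketch below estimates a third-order cumulant tensor from samples. The exponential sources, the sample size, and the mixing matrix `A` are arbitrary choices made for illustration and are not taken from the paper.

```python
import numpy as np

def third_cumulant(X):
    """Empirical third-order cumulant (third central moment) of the rows of X."""
    Xc = X - X.mean(axis=0)
    return np.einsum("ni,nj,nk->ijk", Xc, Xc, Xc) / X.shape[0]

rng = np.random.default_rng(0)
S = rng.exponential(size=(200_000, 3))  # independent, skewed sources
K_indep = third_cumulant(S)             # approximately diagonal

# Illustrative mixing matrix; the mixed variables are no longer independent.
A = np.array([[1.0, 0.5, 0.0],
              [0.0, 1.0, 0.5],
              [0.5, 0.0, 1.0]])
K_mixed = third_cumulant(S @ A.T)       # off-diagonal entries appear
```

Cumulants transform multilinearly under linear maps, so `K_mixed` equals `K_indep` with `A` applied along each mode. Recovering `A` from such a tensor is the kind of problem that ICA-style methods address.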

CO · May 24, 2023
Supermodular Rank: Set Function Decomposition and Optimization

Rishi Sonthalia, Anna Seigal, Guido Montufar

We define the supermodular rank of a function on a lattice. This is the smallest number of terms needed to decompose it into a sum of supermodular functions. The supermodular summands are defined with respect to different partial orders. We characterize the maximum possible value of the supermodular rank and describe the functions with fixed supermodular rank. We analogously define the submodular rank. We use submodular decompositions to optimize set functions. Given a bound on the submodular rank of a set function, we formulate an algorithm that splits an optimization problem into submodular subproblems. We show that this method improves the approximation ratio guarantees of several algorithms for monotone set function maximization and ratio of set functions minimization, at a computation overhead that depends on the submodular rank.
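To make the basic notion concrete, the following sketch checks supermodularity of a set function by brute force, using the usual inclusion order on subsets of a small ground set. The paper considers summands that are supermodular with respect to different partial orders, which this sketch does not cover. Subsets are encoded as bitmasks, and the names `is_supermodular`, `square`, and `root` are illustrative.

```python
from itertools import product
from math import sqrt

def is_supermodular(f, n, tol=1e-12):
    """Check f(A | B) + f(A & B) >= f(A) + f(B) for all subsets A, B of {0..n-1}."""
    for a, b in product(range(1 << n), repeat=2):
        if f(a | b) + f(a & b) < f(a) + f(b) - tol:
            return False
    return True

def size(mask):
    """Cardinality of the subset encoded by the bitmask."""
    return bin(mask).count("1")

square = lambda s: size(s) ** 2  # convex in |S|, hence supermodular
root = lambda s: sqrt(size(s))   # concave in |S|, hence submodular
```

The check costs on the order of 4^n function evaluations, so it is only practical for very small ground sets.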

NA · Sep 5, 2018
Learning Paths from Signature Tensors

Max Pfeffer, Anna Seigal, Bernd Sturmfels

Matrix congruence extends naturally to the setting of tensors. We apply methods from tensor decomposition, algebraic geometry and numerical optimization to this group action. Given a tensor in the orbit of another tensor, we compute a matrix which transforms one to the other. Our primary application is an inverse problem from stochastic analysis: the recovery of paths from their third order signature tensors. We establish identifiability results, both exact and numerical, for piecewise linear paths, polynomial paths, and generic dictionaries. Numerical optimization is applied for recovery from inexact data. We also compute the shortest path with a given signature tensor.
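For readers unfamiliar with signature tensors, the forward map for piecewise linear paths can be sketched with Chen's identity, truncated at order three. This computes the signature from the path; it is not the recovery algorithm of the paper. The function names are illustrative, and a path is assumed to be given by its list of segment increments.

```python
import numpy as np

def segment_signature(v):
    """Signature levels 1 to 3 of a straight segment with increment v: v^(k) / k!."""
    s1 = v
    s2 = np.einsum("i,j->ij", v, v) / 2.0
    s3 = np.einsum("i,j,k->ijk", v, v, v) / 6.0
    return s1, s2, s3

def chen_product(a, b):
    """Signature of a concatenated path from the signatures of its two pieces."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    c1 = a1 + b1
    c2 = a2 + np.einsum("i,j->ij", a1, b1) + b2
    c3 = (a3 + np.einsum("ij,k->ijk", a2, b1)
          + np.einsum("i,jk->ijk", a1, b2) + b3)
    return c1, c2, c3

def piecewise_linear_signature(increments):
    """Levels 1 to 3 of the signature of a piecewise linear path."""
    sig = segment_signature(increments[0])
    for v in increments[1:]:
        sig = chen_product(sig, segment_signature(v))
    return sig

v = np.array([1.0, -2.0])
w = np.array([0.5, 3.0])
sig = piecewise_linear_signature([v, w])
```

A quick sanity check is that two consecutive segments with the same increment `v` give the same signature as one segment with increment `2 * v`.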

ST · Oct 4, 2017
Duality of Graphical Models and Tensor Networks

Elina Robeva, Anna Seigal

In this article we show the duality between tensor networks and undirected graphical models with discrete variables. We study tensor networks on hypergraphs, which we call tensor hypernetworks. We show that the tensor hypernetwork on a hypergraph exactly corresponds to the graphical model given by the dual hypergraph. We translate various notions under duality. For example, marginalization in a graphical model is dual to contraction in the tensor network. Algorithms also translate under duality. We show that belief propagation corresponds to a known algorithm for tensor network contraction. This article is a reminder that the research areas of graphical models and tensor networks can benefit from interaction.
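The correspondence between marginalization and contraction can be illustrated on a small example. The sketch below assumes a chain of three binary variables with two pairwise potentials; the names `psi12` and `psi23` and the random values are illustrative. It compares a marginal computed from the full joint table with the same marginal computed by contracting the potentials directly.

```python
import numpy as np

rng = np.random.default_rng(1)
psi12 = rng.random((2, 2))  # potential on (x1, x2)
psi23 = rng.random((2, 2))  # potential on (x2, x3)

# Graphical-model view: build the joint table, normalize, sum out x2 and x3.
joint = np.einsum("ab,bc->abc", psi12, psi23)
joint /= joint.sum()
marg_brute = joint.sum(axis=(1, 2))

# Tensor-network view: contract the shared index b and the open index c directly.
marg_contract = np.einsum("ab,bc->a", psi12, psi23)
marg_contract /= marg_contract.sum()
```

The contraction never forms the full joint table, which is where the computational savings come from on larger networks.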

ML · Sep 15, 2017
Mixtures and products in two graphical models

Anna Seigal, Guido Montufar

We compare two statistical models of three binary random variables. One is a mixture model and the other is a product of mixtures model called a restricted Boltzmann machine. Although the two models we study look different from their parametrizations, we show that they represent the same set of distributions on the interior of the probability simplex, and are equal up to closure. We give a semi-algebraic description of the model in terms of six binomial inequalities and obtain closed form expressions for the maximum likelihood estimates. We briefly discuss extensions to larger models.

QM · Dec 24, 2016
Tensor clustering with algebraic constraints gives interpretable groups of crosstalk mechanisms in breast cancer

Anna Seigal, Mariano Beguerisse-Díaz, Birgit Schoeberl et al.

We introduce a tensor-based clustering method to extract sparse, low-dimensional structure from high-dimensional, multi-indexed datasets. This framework is designed to enable detection of clusters of data in the presence of structural requirements which we encode as algebraic constraints in a linear program. Our clustering method is general and can be tailored to a variety of applications in science and industry. We illustrate our method on a collection of experiments measuring the response of genetically diverse breast cancer cell lines to an array of ligands. Each experiment consists of a cell line-ligand combination, and contains time-course measurements of the early-signalling kinases MAPK and AKT at two different ligand dose levels. By imposing appropriate structural constraints and respecting the multi-indexed structure of the data, the analysis of clusters can be optimized for biological interpretation and therapeutic understanding. We then perform a systematic, large-scale exploration of mechanistic models of MAPK-AKT crosstalk for each cluster. This analysis allows us to quantify the heterogeneity of breast cancer cell subtypes, and leads to hypotheses about the signalling mechanisms that mediate the response of the cell lines to ligands.