Mariya Ishteva

LG
9 papers
75 citations
Novelty: 46%
AI Score: 46

9 Papers

OC | Jan 28, 2016
Weighted tensor decomposition for approximate decoupling of multivariate polynomials

Gabriel Hollander, Philippe Dreesen, Mariya Ishteva et al.

Multivariate polynomials arise in many different disciplines. Representing such a polynomial as a vector of univariate polynomials can offer useful insight, as well as more intuitive understanding. For this, techniques based on tensor methods are known, but these have only been studied in the exact case. In this paper, we generalize an existing method to the noisy case, by introducing a weight factor in the tensor decomposition. Finally, we apply the proposed weighted decoupling algorithm in the domain of system identification, and observe smaller model errors.
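
One way to picture the weight factor is as a change of norm in the tensor fitting problem. The notation below is assumed for illustration and is not taken from the paper: $\mathcal{T}$ is the tensor built from the noisy polynomial, $W, V, H$ are the factors of a rank-$r$ canonical polyadic decomposition, and $\Sigma$ is a hypothetical positive definite covariance matrix of $\mathrm{vec}(\mathcal{T})$.

```latex
% Residual between the data tensor and its rank-r CP model (assumed notation)
e(W,V,H) = \mathrm{vec}(\mathcal{T}) - \mathrm{vec}\Bigl(\sum_{i=1}^{r} w_i \circ v_i \circ h_i\Bigr)

% Unweighted fit, appropriate in the exact (noise-free) case
\min_{W,V,H} \; e(W,V,H)^{\top} e(W,V,H)

% Weighted fit: entries with larger uncertainty contribute less to the cost
\min_{W,V,H} \; e(W,V,H)^{\top} \Sigma^{-1} e(W,V,H)
```

With $\Sigma = I$ the weighted problem reduces to the unweighted one.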

NA | Jan 29, 2019
Decoupling multivariate polynomials: interconnections between tensorizations

Konstantin Usevich, Philippe Dreesen, Mariya Ishteva

Decoupling multivariate polynomials is useful for gaining insight into the workings of a nonlinear mapping, performing parameter reduction, or approximating nonlinear functions. Several different tensor-based approaches have been proposed independently for this task, involving different tensor representations of the functions, and ultimately leading to a canonical polyadic (CP) decomposition. We first show that the involved tensors are related by a linear transformation, and that their CP decompositions and uniqueness properties are closely related. This connection provides a way to better assess which of the methods should be favored in certain problem settings, and may be a starting point to unify the two approaches. Second, we show that taking into account the previously ignored intrinsic structure in the tensor decompositions improves the uniqueness properties of the decompositions and thus enlarges the applicability range of the methods.

NA | May 22, 2018
Decoupling multivariate functions using second-order information and tensors

Philippe Dreesen, Jeroen De Geeter, Mariya Ishteva

Multivariate functions are powerful because they can model a wide variety of phenomena, but they lack an intuitive or interpretable representation and often require a (very) large number of parameters. We study decoupled representations of multivariate vector functions, which are linear combinations of univariate functions in linear combinations of the input variables. This model structure provides a description with fewer parameters, and reveals the internal workings in a simpler way, as the nonlinearities are one-to-one functions. In earlier work, a tensor-based method was developed for performing this decomposition by using first-order derivative information. In this article, we generalize this method and study how second-order derivative information can be incorporated. By doing this, we are able to push the method towards more involved configurations, while preserving uniqueness of the underlying tensor decompositions. Furthermore, even for some non-identifiable structures, the method seems to return a valid decoupled representation. These results are a step towards a more general data-driven and noise-robust tensor-based framework for computing decoupled function representations.
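
The first-order idea mentioned in the abstract can be checked numerically. The sketch below is not the authors' code: it assumes a decoupled model f(u) = W g(V^T u) with cubic branch functions, and all names (`V`, `W`, `C`, `H`) and sizes are chosen here for illustration. By the chain rule, the Jacobian is W diag(g'(V^T u)) V^T, so stacking Jacobians over sampling points gives a third-order tensor with a rank-r CP structure.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, r, N = 3, 2, 4, 10  # inputs, outputs, branches, sampling points (assumed sizes)

V = rng.standard_normal((m, r))  # input mixing matrix
W = rng.standard_normal((n, r))  # output mixing matrix
C = rng.standard_normal((r, 4))  # cubic coefficients of each univariate branch g_i


def g(z):
    """Apply branch i to z[i]: g_i(z) = c0 + c1 z + c2 z^2 + c3 z^3."""
    return C[:, 0] + C[:, 1] * z + C[:, 2] * z**2 + C[:, 3] * z**3


def g_prime(z):
    """Derivative of each branch function at z[i]."""
    return C[:, 1] + 2 * C[:, 2] * z + 3 * C[:, 3] * z**2


def f(u):
    """Decoupled function f(u) = W g(V^T u)."""
    return W @ g(V.T @ u)


def jacobian(u):
    """Analytic Jacobian W diag(g'(V^T u)) V^T, shape (n, m)."""
    return W @ np.diag(g_prime(V.T @ u)) @ V.T


# Stack Jacobians at N sampling points into an (n, m, N) tensor.
U = rng.standard_normal((N, m))
J = np.stack([jacobian(u) for u in U], axis=2)

# Third CP factor: branch derivatives evaluated at each sampling point.
H = np.stack([g_prime(V.T @ u) for u in U], axis=0)  # shape (N, r)
J_cp = np.einsum("ir,jr,kr->ijk", W, V, H)
cp_error = np.max(np.abs(J - J_cp))

# Sanity check of the analytic Jacobian against central finite differences.
h = 1e-6
u0 = U[0]
J_fd = np.stack([(f(u0 + h * e) - f(u0 - h * e)) / (2 * h) for e in np.eye(m)], axis=1)
fd_error = np.max(np.abs(J_fd - jacobian(u0)))

print(f"CP reconstruction error: {cp_error:.2e}, finite-difference error: {fd_error:.2e}")
```

In a decoupling method the direction is reversed: the Jacobian tensor is computed from a given coupled function, and a CP decomposition is used to estimate W, V and H.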

32.5 | SY | Apr 12
Tensor-based Multi-layer Decoupling

Joppe De Jonghe, Konstantin Usevich, Philippe Dreesen et al.

The decoupling of multivariate functions is a powerful modeling paradigm for learning multivariate input-output relations from data. For the single-layer case, established CPD-based methods are available, but the multi-layer case has remained largely unexplored. This work introduces a tensor-based framework for multi-layer decoupling, which is based on ParaTuck-type tensor decompositions and constrained optimization. We provide theoretical justification for the considered tensor decompositions and parameterizations. Furthermore, we formulate a structured coupled matrix-tensor factorization that incorporates both Jacobian and function evaluations, together with a bilevel optimization approach for adaptively balancing first- and zeroth-order information. The feasibility of the proposed methodology is illustrated on synthetic systems, a nonlinear system identification benchmark, and neural network compression.

9.8 | ML | Mar 26
Adaptive Subspace Modeling With Functional Tucker Decomposition

Noah Steidle, Joppe De Jonghe, Mariya Ishteva

Tensors provide a structured representation for multidimensional data, yet discretization can obscure important information when such data originates from continuous processes. We address this limitation by introducing a functional Tucker decomposition (FTD) that embeds mode-wise continuity constraints directly into the decomposition. The FTD employs reproducing kernel Hilbert spaces (RKHS) to model continuous modes without requiring an a-priori basis, while preserving the multi-linear subspace structure of the Tucker model. Through RKHS-driven representation, the model yields adaptive and expressive factor descriptions that enable targeted modeling of subspaces. The value of this approach is demonstrated in domain-variant tensor classification. In particular, we illustrate its effectiveness with classification tasks in hyperspectral imaging and multivariate time series analysis, highlighting the benefits of combining structural decomposition with functional adaptability.

2.1 | LG | May 11
Robust Basis Spline Decoupling for the Compression of Transformer Models

Joppe De Jonghe, Van Tien Pham, Mariya Ishteva

Decoupling is a powerful modeling paradigm for representing multivariate functions as compositions of linear transformations and univariate nonlinear functions. A single-layer decoupling can be viewed as a fully connected neural network with a single hidden layer and flexible activation functions, providing a direct link with neural networks. Because of this, the use of decoupling methods has gained increasing attention in neural network domains, particularly compression, since it enables structured approximations with reduced parameter complexity. Existing tensor-based decoupling methods typically rely on polynomial or piecewise-linear parameterizations of the internal nonlinear functions, which can suffer from numerical instability or limited expressiveness. In this work, we introduce a B-spline-based decoupling framework that generalizes these existing approaches. By exploiting the local support and flexible smoothness control of B-splines, the proposed formulation yields a more numerically stable and expressive representation. We derive a constrained coupled matrix-tensor factorization and propose a robust alternating least-squares algorithm, called R-CMTF-BSD, incorporating normalization and Tikhonov regularization. The proposed method is validated through experiments on synthetic data and transformer model compression. Results on the Vision and Swin Transformer architectures demonstrate that B-spline decoupling enables substantial parameter reduction while maintaining competitive accuracy, making the R-CMTF-BSD algorithm a promising tool for structured neural network compression.
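
The local support property that the abstract relies on can be illustrated with a small NumPy sketch. This is not the R-CMTF-BSD algorithm; it only evaluates a B-spline basis with the standard Cox-de Boor recursion on an assumed clamped knot vector, and the helper name `bspline_basis` and the coefficient vector `c` are hypothetical.

```python
import numpy as np


def bspline_basis(x, knots, degree):
    """Evaluate all B-spline basis functions at x (Cox-de Boor recursion).

    Returns an array of shape (len(x), len(knots) - degree - 1).
    """
    x = np.asarray(x, dtype=float)
    t = np.asarray(knots, dtype=float)
    # Degree 0: indicator functions of the half-open knot spans [t_i, t_{i+1}).
    B = np.array([(t[i] <= x) & (x < t[i + 1]) for i in range(len(t) - 1)], dtype=float)
    for d in range(1, degree + 1):
        B_next = np.zeros((len(t) - 1 - d, len(x)))
        for i in range(len(t) - 1 - d):
            left_den = t[i + d] - t[i]
            right_den = t[i + d + 1] - t[i + 1]
            # Terms with a zero-length knot span are skipped (0/0 convention).
            if left_den > 0:
                B_next[i] += (x - t[i]) / left_den * B[i]
            if right_den > 0:
                B_next[i] += (t[i + d + 1] - x) / right_den * B[i + 1]
        B = B_next
    return B.T


degree = 3
# Clamped knot vector on [0, 1] with three interior knots (assumed for illustration).
knots = np.array([0, 0, 0, 0, 0.25, 0.5, 0.75, 1, 1, 1, 1], dtype=float)
x = np.linspace(0.0, 1.0, 50, endpoint=False)  # half-open interval [0, 1)

B = bspline_basis(x, knots, degree)          # shape (50, 7)
active = (B > 0).sum(axis=1)                 # basis functions active at each point

# A univariate nonlinearity parameterized by spline coefficients (hypothetical values).
c = np.array([0.0, 0.2, -0.5, 1.0, 0.3, -0.1, 0.4])
g_values = B @ c

print("max active basis functions per point:", active.max())
print("rows sum to one:", np.allclose(B.sum(axis=1), 1.0))
```

Because at most `degree + 1` basis functions are nonzero at any point, changing one coefficient only affects the function locally, which is one common reason splines tend to be better conditioned than high-degree monomial bases.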

NA | Sep 26, 2016
Modeling Parallel Wiener-Hammerstein Systems Using Tensor Decomposition of Volterra Kernels

Philippe Dreesen, David Westwick, Johan Schoukens et al.

Flexibility and user-interpretability in nonlinear system identification can be achieved by means of block-oriented methods. One such block-oriented system structure is the parallel Wiener-Hammerstein system, which is a sum of Wiener-Hammerstein branches, each consisting of a static nonlinearity sandwiched between linear dynamical blocks. Parallel Wiener-Hammerstein models have more descriptive power than their single-branch counterparts, but their identification is a non-trivial task that requires tailored system identification methods. In this work, we tackle the identification problem by performing a tensor decomposition of the Volterra kernels obtained from the nonlinear system. We illustrate how the parallel Wiener-Hammerstein block structure gives rise to a joint tensor decomposition of the Volterra kernels with block-circulant structured factors. The combination of Volterra kernels and tensor methods is a fruitful way to tackle the parallel Wiener-Hammerstein system identification task. In simulation experiments, we were able to accurately reconstruct the underlying blocks under noisy conditions.

LG | Oct 16, 2012
A Spectral Algorithm for Latent Junction Trees

Ankur P. Parikh, Le Song, Mariya Ishteva et al.

Latent variable models are an elegant framework for capturing rich probabilistic dependencies in many applications. However, current approaches typically parametrize these models using conditional probability tables, and learning relies predominantly on local search heuristics such as Expectation Maximization (EM). Using tensor algebra, we propose an alternative parameterization of latent variable models (where the model structures are junction trees) that still allows for computation of marginals among observed variables. While this novel representation leads to a moderate increase in the number of parameters for junction trees of low treewidth, it lets us design a local-minimum-free algorithm for learning this parameterization. The main computation of the algorithm involves only tensor operations and SVDs, which can be orders of magnitude faster than EM algorithms for large datasets. To our knowledge, this is the first provably consistent parameter learning technique for a large class of low-treewidth latent graphical models beyond trees. We demonstrate the advantages of our method on synthetic and real datasets.

LG | Oct 3, 2012
Unfolding Latent Tree Structures using 4th Order Tensors

Mariya Ishteva, Haesun Park, Le Song

Discovering the latent structure from many observed variables is an important yet challenging learning task. Existing approaches for discovering latent structures often require the unknown number of hidden states as an input. In this paper, we propose a quartet-based approach which is agnostic to this number. The key contribution is a novel rank characterization of the tensor associated with the marginal distribution of a quartet. This characterization allows us to design a nuclear-norm-based test for resolving quartet relations. We then use the quartet test as a subroutine in a divide-and-conquer algorithm for recovering the latent tree structure. Under mild conditions, the algorithm is consistent and its error probability decays exponentially with increasing sample size. We demonstrate that the proposed approach compares favorably to alternatives. On a real-world stock dataset, it also discovers meaningful groupings of variables, and produces a model that fits the data better.
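
The rank property behind a quartet test can be illustrated on synthetic data. The sketch below is an assumption-laden toy, not the paper's algorithm: it builds the exact joint distribution of four observed variables in which variables 1 and 2 share one hidden parent and variables 3 and 4 share another, then compares the three possible matrix unfoldings of the resulting fourth-order tensor. All names and sizes are chosen here. The nuclear norms are printed only for reference; the code makes no claim about their ordering and checks only the ranks.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 4, 2  # states per observed variable, states per hidden variable (assumed)


def random_cpt(rows, cols):
    """Random conditional probability table; each column sums to one."""
    M = rng.random((rows, cols))
    return M / M.sum(axis=0, keepdims=True)


# Observed variables 1, 2 depend on hidden h; variables 3, 4 depend on hidden g.
A1, A2, A3, A4 = (random_cpt(n, k) for _ in range(4))
P_hg = rng.random((k, k))
P_hg /= P_hg.sum()  # joint distribution of the two hidden variables

# Joint distribution of the quartet as a 4th-order tensor.
T = np.einsum("hg,ah,bh,cg,dg->abcd", P_hg, A1, A2, A3, A4)


def unfold(tensor, row_axes):
    """Matricize a 4th-order tensor with the given pair of axes as rows."""
    col_axes = [ax for ax in range(4) if ax not in row_axes]
    return np.transpose(tensor, list(row_axes) + col_axes).reshape(n * n, n * n)


pairings = {"12|34": (0, 1), "13|24": (0, 2), "14|23": (0, 3)}
ranks = {}
nuclear_norms = {}
for name, rows in pairings.items():
    M = unfold(T, rows)
    ranks[name] = int(np.linalg.matrix_rank(M, tol=1e-10))
    nuclear_norms[name] = float(np.linalg.norm(M, "nuc"))
    print(name, "rank:", ranks[name], "nuclear norm:", round(nuclear_norms[name], 4))
```

In this toy setting the unfolding that matches the true pairing has rank at most k, while the other two generically have rank k^2. A nuclear norm is a common convex surrogate for rank, which is presumably what makes a test of this kind usable without knowing k in advance.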