José Henrique de Morais Goulart

ML
h-index12
8papers
47citations
Novelty50%
AI Score37

8 Papers

LGJul 13, 2022
Majorization-minimization for Sparse Nonnegative Matrix Factorization with the $β$-divergence

Arthur Marmin, José Henrique de Morais Goulart, Cédric Févotte

This article introduces new multiplicative updates for nonnegative matrix factorization with the $β$-divergence and sparse regularization of one of the two factors (say, the activation matrix). It is well known that the norm of the other factor (the dictionary matrix) needs to be controlled in order to avoid an ill-posed formulation. Standard practice consists in constraining the columns of the dictionary to have unit norm, which leads to a nontrivial optimization problem. Our approach leverages a reparametrization of the original problem into the optimization of an equivalent scale-invariant objective function. From there, we derive block-descent majorization-minimization algorithms that result in simple multiplicative updates for either $\ell_{1}$-regularization or the more "aggressive" log-regularization. In contrast with other state-of-the-art methods, our algorithms are universal in the sense that they can be applied to any $β$-divergence (i.e., any value of $β$) and that they come with convergence guarantees. We report numerical comparisons with existing heuristic and Lagrangian methods using various datasets: face images, an audio spectrogram, hyperspectral data, and song play counts. We show that our methods obtain solutions of similar quality at convergence (similar objective values) but with significantly reduced CPU times.

MLApr 20, 2023
Hotelling Deflation on Large Symmetric Spiked Tensors

Mohamed El Amine Seddik, José Henrique de Morais Goulart, Maxime Guillaud

This paper studies the deflation algorithm when applied to estimate a low-rank symmetric spike contained in a large tensor corrupted by additive Gaussian noise. Specifically, we provide a precise characterization of the large-dimensional performance of deflation in terms of the alignments of the vectors obtained by successive rank-1 approximation and of their estimated weights, assuming non-trivial (fixed) correlations among spike components. Our analysis allows an understanding of the deflation mechanism in the presence of noise and can be exploited for designing more efficient signal estimation methods.

MLOct 28, 2023
On the Accuracy of Hotelling-Type Asymmetric Tensor Deflation: A Random Tensor Analysis

Mohamed El Amine Seddik, Maxime Guillaud, Alexis Decurninge et al.

This work introduces an asymptotic study of Hotelling-type tensor deflation in the presence of noise, in the regime of large tensor dimensions. Specifically, we consider a low-rank asymmetric tensor model of the form $\sum_{i=1}^r β_i{\mathcal{A}}_i + {\mathcal{W}}$ where $β_i\geq 0$ and the ${\mathcal{A}}_i$'s are unit-norm rank-one tensors such that $\left| \langle {\mathcal{A}}_i, {\mathcal{A}}_j \rangle \right| \in [0, 1]$ for $i\neq j$ and ${\mathcal{W}}$ is an additive noise term. Assuming that the dominant components are successively estimated from the noisy observation and subsequently subtracted, we leverage recent advances in random tensor theory in the regime of asymptotically large tensor dimensions to analytically characterize the estimated singular values and the alignment of estimated and true singular vectors at each step of the deflation procedure. Furthermore, this result can be used to construct estimators of the signal-to-noise ratios $β_i$ and the alignments between the estimated and true rank-1 signal components.

MLFeb 16, 2024
Performance Gaps in Multi-view Clustering under the Nested Matrix-Tensor Model

Hugo Lebeau, Mohamed El Amine Seddik, José Henrique de Morais Goulart

We study the estimation of a planted signal hidden in a recently introduced nested matrix-tensor model, which is an extension of the classical spiked rank-one tensor model, motivated by multi-view clustering. Prior work has theoretically examined the performance of a tensor-based approach, which relies on finding a best rank-one approximation, a problem known to be computationally hard. A tractable alternative approach consists in computing instead the best rank-one (matrix) approximation of an unfolding of the observed tensor data, but its performance was hitherto unknown. We quantify here the performance gap between these two approaches, in particular by deriving the precise algorithmic threshold of the unfolding approach and demonstrating that it exhibits a BBP-type transition behavior. This work is therefore in line with recent contributions which deepen our understanding of why tensor-based methods surpass matrix-based methods in handling structured tensor data.

MLSep 22, 2025
Whitening Spherical Gaussian Mixtures in the Large-Dimensional Regime

Mohammed Racim Moussa Boudjemaa, Alper Kalle, Xiaoyi Mai et al.

Whitening is a classical technique in unsupervised learning that can facilitate estimation tasks by standardizing data. An important application is the estimation of latent variable models via the decomposition of tensors built from high-order moments. In particular, whitening orthogonalizes the means of a spherical Gaussian mixture model (GMM), thereby making the corresponding moment tensor orthogonally decomposable, hence easier to decompose. However, in the large-dimensional regime (LDR) where data are high-dimensional and scarce, the standard whitening matrix built from the sample covariance becomes ineffective because the latter is spectrally distorted. Consequently, whitened means of a spherical GMM are no longer orthogonal. Using random matrix theory, we derive exact limits for their dot products, which are generally nonzero in the LDR. As our main contribution, we then construct a corrected whitening matrix that restores asymptotic orthogonality, allowing for performance gains in spherical GMM estimation.

MLAug 2, 2021
A Random Matrix Perspective on Random Tensors

José Henrique de Morais Goulart, Romain Couillet, Pierre Comon

Tensor models play an increasingly prominent role in many fields, notably in machine learning. In several applications, such as community detection, topic modeling and Gaussian mixture learning, one must estimate a low-rank signal from a noisy tensor. Hence, understanding the fundamental limits of estimators of that signal inevitably calls for the study of random tensors. Substantial progress has been recently achieved on this subject in the large-dimensional limit. Yet, some of the most significant among these results--in particular, a precise characterization of the abrupt phase transition (with respect to signal-to-noise ratio) that governs the performance of the maximum likelihood (ML) estimator of a symmetric rank-one model with Gaussian noise--were derived based of mean-field spin glass theory, which is not easily accessible to non-experts. In this work, we develop a sharply distinct and more elementary approach, relying on standard but powerful tools brought by years of advances in random matrix theory. The key idea is to study the spectra of random matrices arising from contractions of a given random tensor. We show how this gives access to spectral properties of the random tensor itself. For the aforementioned rank-one model, our technique yields a hitherto unknown fixed-point equation whose solution precisely matches the asymptotic performance of the ML estimator above the phase transition threshold in the third-order case. A numerical verification provides evidence that the same holds for orders 4 and 5, leading us to conjecture that, for any order, our fixed-point equation is equivalent to the known characterization of the ML estimation performance that had been obtained by relying on spin glasses. Moreover, our approach sheds light on certain properties of the ML problem landscape in large dimensions and can be extended to other models, such as asymmetric and non-Gaussian.

LGJun 29, 2021
Joint Majorization-Minimization for Nonnegative Matrix Factorization with the $β$-divergence

Arthur Marmin, José Henrique de Morais Goulart, Cédric Févotte

This article proposes new multiplicative updates for nonnegative matrix factorization (NMF) with the $β$-divergence objective function. Our new updates are derived from a joint majorization-minimization (MM) scheme, in which an auxiliary function (a tight upper bound of the objective function) is built for the two factors jointly and minimized at each iteration. This is in contrast with the classic approach in which a majorizer is derived for each factor separately. Like that classic approach, our joint MM algorithm also results in multiplicative updates that are simple to implement. They however yield a significant drop of computation time (for equally good solutions), in particular for some $β$-divergences of important applicative interest, such as the squared Euclidean distance and the Kullback-Leibler or Itakura-Saito divergences. We report experimental results using diverse datasets: face images, an audio spectrogram, hyperspectral data and song play counts. Depending on the value of $β$ and on the dataset, our joint MM approach can yield CPU time reductions from about $13\%$ to $78\%$ in comparison to the classic alternating scheme.

COJun 24, 2015
Statistical efficiency of structured cpd estimation applied to Wiener-Hammerstein modeling

José Henrique De Morais Goulart, Maxime Boizard, Rémy Boyer et al.

The computation of a structured canonical polyadic decomposition (CPD) is useful to address several important modeling problems in real-world applications. In this paper, we consider the identification of a nonlinear system by means of a Wiener-Hammerstein model, assuming a high-order Volterra kernel of that system has been previously estimated. Such a kernel, viewed as a tensor, admits a CPD with banded circulant factors which comprise the model parameters. To estimate them, we formulate specialized estimators based on recently proposed algorithms for the computation of structured CPDs. Then, considering the presence of additive white Gaussian noise, we derive a closed-form expression for the Cramer-Rao bound (CRB) associated with this estimation problem. Finally, we assess the statistical performance of the proposed estimators via Monte Carlo simulations, by comparing their mean-square error with the CRB.