Valentin Leplat

h-index: 4

12 papers

136 citations

Novelty: 47%

AI Score: 48

Ranked #28,871 of 194,257 authors (top 15%); #6,869 in LG (top 17%)

12 Papers

4.0 · OC · Sep 29, 2022 · Code

NAG-GS: Semi-Implicit, Accelerated and Robust Stochastic Optimizer

Valentin Leplat, Daniil Merkulov, Aleksandr Katrutsa et al.

Classical machine learning models such as deep neural networks are usually trained by using Stochastic Gradient Descent-based (SGD) algorithms. The classical SGD can be interpreted as a discretization of the stochastic gradient flow. In this paper we propose a novel, robust and accelerated stochastic optimizer that relies on two key elements: (1) an accelerated Nesterov-like Stochastic Differential Equation (SDE) and (2) its semi-implicit Gauss-Seidel type discretization. The convergence and stability of the obtained method, referred to as NAG-GS, are first studied extensively in the case of the minimization of a quadratic function. This analysis allows us to come up with an optimal learning rate in terms of the convergence rate while ensuring the stability of NAG-GS. This is achieved by the careful analysis of the spectral radius of the iteration matrix and the covariance matrix at stationarity with respect to all hyperparameters of our method. Further, we show that NAG-GS is competitive with state-of-the-art methods such as momentum SGD with weight decay and AdamW for the training of machine learning models such as the logistic regression model, the residual networks models on standard computer vision datasets, Transformers on the GLUE benchmark and the recent Vision Transformers.

3.8 · LG · Sep 15, 2023 · Code

Deep Nonnegative Matrix Factorization with Beta Divergences

Valentin Leplat, Le Thi Khanh Hien, Akwum Onwunta et al.

Deep Nonnegative Matrix Factorization (deep NMF) has recently emerged as a valuable technique for extracting multiple layers of features across different scales. However, all existing deep NMF models and algorithms have primarily centered their evaluation on the least squares error, which may not be the most appropriate metric for assessing the quality of approximations on diverse datasets. For instance, when dealing with data types such as audio signals and documents, it is widely acknowledged that $β$-divergences offer a more suitable alternative. In this paper, we develop new models and algorithms for deep NMF using some $β$-divergences, with a focus on the Kullback-Leibler divergence. Subsequently, we apply these techniques to the extraction of facial features, the identification of topics within document collections, and the identification of materials within hyperspectral images.

6.9 · NA · Mar 23

Pass-efficient Randomized Algorithms for Low-rank Approximation of Quaternion Matrices

Salman Ahmadi-Asl, Malihe Nobakht Kooshkghazi, Valentin Leplat

Randomized algorithms for low-rank approximation of quaternion matrices have gained increasing attention in recent years. However, existing methods overlook pass efficiency, that is, the ability to limit the number of passes over the input matrix, which is critical in modern computing environments dominated by communication costs. We address this gap by proposing a suite of pass-efficient randomized algorithms that let users directly trade pass budget for approximation accuracy. Our contributions include: (i) a family of arbitrary-pass randomized algorithms for low-rank approximation of quaternion matrices that operate under a user-specified number of matrix views, and (ii) a pass-efficient extension of block Krylov subspace methods that accelerates convergence for matrices with slowly decaying spectra. Furthermore, we establish spectral norm error bounds showing that the expected approximation error decays exponentially with the number of passes. Finally, we validate our framework through extensive numerical experiments and demonstrate its practical relevance across multiple applications, including quaternionic data compression, matrix completion, image super-resolution, and deep learning.

1.2 · NA · Nov 10, 2025

A Provably-Correct and Robust Convex Model for Smooth Separable NMF

Junjun Pan, Valentin Leplat, Michael Ng et al.

Nonnegative matrix factorization (NMF) is a linear dimensionality reduction technique for nonnegative data, with applications such as hyperspectral unmixing and topic modeling. NMF is a difficult problem in general (NP-hard), and its solutions are typically not unique. To address these two issues, additional constraints or assumptions are often used. In particular, separability assumes that the basis vectors in the NMF are equal to some columns of the input matrix. In that case, the problem is referred to as separable NMF (SNMF) and can be solved in polynomial time with robustness guarantees, while identifying a unique solution. However, in real-world scenarios, due to noise or variability, multiple data points may lie near the basis vectors, which SNMF does not leverage. In this work, we rely on the smooth separability assumption, which assumes that each basis vector is close to multiple data points. We explore the properties of the corresponding problem, referred to as smooth SNMF (SSNMF), and examine how it relates to SNMF and orthogonal NMF. We then propose a convex model for SSNMF and show that it provably recovers the sought-after factors, even in the presence of noise. We finally adapt an existing fast gradient method to solve this convex model for SSNMF, and show that it compares favorably with state-of-the-art methods on both synthetic and hyperspectral datasets.
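
To make the separability assumption concrete, the sketch below implements the successive projection algorithm (SPA), a classical greedy method for plain separable NMF. It is not the convex SSNMF model proposed in the paper, and it does not exploit smooth separability; the function name `spa` and the toy matrix are illustrative assumptions.

```python
import math

def spa(X, r):
    """Successive projection algorithm (illustrative sketch).

    X is a matrix given as a list of rows; returns the indices of r columns
    chosen as candidate basis vectors under the separability assumption.
    """
    cols = [list(c) for c in zip(*X)]  # residual columns, modified in place
    picked = []
    for _ in range(r):
        sq_norms = [sum(v * v for v in c) for c in cols]
        j = max(range(len(cols)), key=sq_norms.__getitem__)
        picked.append(j)
        norm_j = math.sqrt(sq_norms[j])
        u = [v / norm_j for v in cols[j]]  # unit vector along the picked column
        # Project every residual column onto the orthogonal complement of u.
        for c in cols:
            dot = sum(a * b for a, b in zip(u, c))
            for k in range(len(c)):
                c[k] -= dot * u[k]
    return picked

# Toy separable data: columns 0, 2 and 4 are the basis vectors,
# columns 1 and 3 are convex combinations of them.
X = [
    [2.0, 1.0, 0.0, 0.5, 0.0],
    [0.0, 0.75, 1.5, 0.375, 0.0],
    [0.0, 0.0, 0.0, 0.5, 1.0],
]
print(spa(X, 3))
```

On this noiseless toy matrix the selected indices are exactly the columns that were planted as basis vectors. With noise, several data points may sit near each basis vector, which is the situation the smooth separability assumption is meant to exploit.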

7.9 · LG · Mar 27, 2024 · Code

Efficient Algorithms for Regularized Nonnegative Scale-invariant Low-rank Approximation Models

Jeremy E. Cohen, Valentin Leplat

Regularized nonnegative low-rank approximations, such as sparse Nonnegative Matrix Factorization or sparse Nonnegative Tucker Decomposition, form an important branch of dimensionality reduction models known for their enhanced interpretability. From a practical perspective, however, selecting appropriate regularizers and regularization coefficients, as well as designing efficient algorithms, remains challenging due to the multifactor nature of these models and the limited theoretical guidance available. This paper addresses these challenges by studying a more general model, the Homogeneous Regularized Scale-Invariant model. We prove that the scale-invariance inherent to low-rank approximation models induces an implicit regularization effect that balances solutions. This insight provides a deeper understanding of the role of regularization functions in low-rank approximation models, informs the selection of regularization hyperparameters, and enables the design of balancing strategies to accelerate the empirical convergence of optimization algorithms. Additionally, we propose a generic Majorization-Minimization (MM) algorithm capable of handling $\ell_p^p$-regularized nonnegative low-rank approximations with non-Euclidean loss functions, with convergence guarantees. Our contributions are demonstrated on sparse Nonnegative Matrix Factorization, ridge-regularized Nonnegative Canonical Polyadic Decomposition, and sparse Nonnegative Tucker Decomposition.

6.4 · LG · Jan 12, 2024 · Code

Block Majorization Minimization with Extrapolation and Application to $β$-NMF

Le Thi Khanh Hien, Valentin Leplat, Nicolas Gillis

We propose a Block Majorization Minimization method with Extrapolation (BMMe) for solving a class of multi-convex optimization problems. The extrapolation parameters of BMMe are updated using a novel adaptive update rule. By showing that block majorization minimization can be reformulated as a block mirror descent method, with the Bregman divergence adaptively updated at each iteration, we establish subsequential convergence for BMMe. We use this method to design efficient algorithms to tackle nonnegative matrix factorization problems with the $β$-divergences ($β$-NMF) for $β\in [1,2]$. These algorithms, which are multiplicative updates with extrapolation, benefit from our novel results that offer convergence guarantees. We also empirically illustrate the significant acceleration of BMMe for $β$-NMF through extensive experiments.
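
As a rough illustration of the multiplicative updates mentioned above, the sketch below runs the standard, non-extrapolated multiplicative updates for NMF with the Kullback-Leibler divergence ($β = 1$). The adaptive extrapolation rule of BMMe is not described in the abstract and is deliberately left out; all function names and the toy matrix are assumptions made for this example.

```python
import math
import random

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def kl_divergence(V, WH):
    """Generalized KL divergence between two entrywise-positive matrices."""
    return sum(v * math.log(v / w) - v + w
               for rv, rw in zip(V, WH) for v, w in zip(rv, rw))

def update_H(V, W, H):
    """One multiplicative update of H for KL-NMF: H <- H * (W^T (V / WH)) / (W^T 1)."""
    WH = matmul(W, H)
    ratio = [[v / wh for v, wh in zip(rv, rwh)] for rv, rwh in zip(V, WH)]
    numer = matmul(transpose(W), ratio)
    col_sums = [sum(col) for col in zip(*W)]
    return [[h * n / s for h, n in zip(rh, rn)]
            for rh, rn, s in zip(H, numer, col_sums)]

def kl_nmf(V, rank, iters=50, seed=0):
    """Alternate the H and W updates; returns W, H and the objective history."""
    rng = random.Random(seed)
    m, n = len(V), len(V[0])
    W = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(m)]
    H = [[rng.uniform(0.1, 1.0) for _ in range(n)] for _ in range(rank)]
    history = [kl_divergence(V, matmul(W, H))]
    for _ in range(iters):
        H = update_H(V, W, H)
        # The W update is the H update applied to the transposed problem V^T ~ H^T W^T.
        W = transpose(update_H(transpose(V), transpose(H), transpose(W)))
        history.append(kl_divergence(V, matmul(W, H)))
    return W, H, history

V = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [1.0, 1.0, 1.0]]
W, H, history = kl_nmf(V, rank=2, iters=30)
print(history[0], history[-1])
```

The objective history should be non-increasing, which is the monotonicity property that extrapolated variants have to preserve or recover through their convergence analysis.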

7.1 · LG · Oct 2, 2025

Market-Driven Subset Selection for Budgeted Training

Ashish Jha, Valentin Leplat, AH Phan

Training large language models on massive datasets is computationally expensive, yet empirical evidence suggests that substantial portions of training examples contribute minimally to final performance. Data subset selection addresses this inefficiency by identifying small, high-utility subsets under resource constraints. However, example utility is inherently multi-faceted, encompassing uncertainty, distributional rarity, and diversity signals that are heterogeneous and typically combined through ad hoc weighted sums lacking theoretical grounding. We propose a market-based framework that treats each training example as a tradeable contract and employs the Logarithmic Market Scoring Rule to aggregate multiple utility signals into coherent prices. Heterogeneous signals act as traders, a single liquidity parameter controls concentration versus smoothing, and topic-wise normalization ensures calibrated aggregation. Token budgets are handled explicitly through a price-per-token decision rule with an interpretable length-bias parameter. We establish theoretical connections to maximum-entropy aggregation and provide utility recovery guarantees under noisy but monotone signals. On GSM8K mathematical reasoning under strict 60k-token budgets, our selector achieves parity with strong single-signal baselines while exhibiting lower variance and incurring less than 0.1 GPU-hour overhead. On AGNews classification at 5-25\% retention rates, the market formulation delivers competitive accuracy with improved stability. Our framework unifies multi-signal data curation under fixed computational budgets for prompt-level reasoning and classification tasks.
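
The pricing mechanism can be sketched as follows. Under the Logarithmic Market Scoring Rule with liquidity parameter `b`, the cost function is `C(q) = b * log(sum_i exp(q_i / b))` and the price of contract `i` is its partial derivative, a softmax of `q / b`. How signals are turned into share quantities, the exact form of the price-per-token rule, and the exponent `gamma` used for length bias are assumptions made for illustration; the abstract does not specify them, and topic-wise normalization is omitted.

```python
import math

def lmsr_prices(q, b):
    """LMSR prices: gradient of C(q) = b * log(sum_i exp(q_i / b))."""
    m = max(q)  # shift by the maximum for numerical stability
    weights = [math.exp((qi - m) / b) for qi in q]
    total = sum(weights)
    return [w / total for w in weights]

def select_under_budget(signals, lengths, budget, b=1.0, gamma=1.0):
    """Greedy selection by price per token (hypothetical decision rule).

    signals: list of per-signal score lists, one score per example.
    lengths: token length of each example.
    """
    n = len(lengths)
    # Assumption: each signal "buys" shares equal to its score for each example.
    q = [sum(s[i] for s in signals) for i in range(n)]
    prices = lmsr_prices(q, b)
    # Assumption: gamma is a stand-in for the length-bias parameter.
    order = sorted(range(n), key=lambda i: prices[i] / lengths[i] ** gamma, reverse=True)
    chosen, used = [], 0
    for i in order:
        if used + lengths[i] <= budget:
            chosen.append(i)
            used += lengths[i]
    return chosen, used

signals = [[0.9, 0.1, 0.5], [0.8, 0.2, 0.4]]  # two toy utility signals
lengths = [10, 10, 10]
chosen, used = select_under_budget(signals, lengths, budget=20)
print(chosen, used)
```

A small `b` concentrates the prices on the highest-scoring examples, while a large `b` smooths them toward uniform, which matches the "concentration versus smoothing" role the abstract assigns to the liquidity parameter.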

2.6 · LG · Nov 24, 2024

Ruppert-Polyak averaging for Stochastic Order Oracle

V. N. Smirnov, K. M. Kazistova, I. A. Sudakov et al.

Black-box optimization, a rapidly growing field, faces challenges due to limited knowledge of the objective function's internal mechanisms. One promising approach to address this is the Stochastic Order Oracle Concept. This concept, similar to other Order Oracle Concepts, relies solely on relative comparisons of function values without requiring access to the exact values. This paper presents a novel, improved estimation of the covariance matrix for the asymptotic convergence of the Stochastic Order Oracle Concept. Our work surpasses existing research in this domain by offering a more accurate estimation of asymptotic convergence rate. Finally, numerical experiments validate our theoretical findings, providing strong empirical support for our proposed approach.

4.3 · SD · Oct 27, 2021

Nonnegative Tucker Decomposition with Beta-divergence for Music Structure Analysis of Audio Signals

Axel Marmoret, Florian Voorwinden, Valentin Leplat et al.

Nonnegative Tucker decomposition (NTD), a tensor decomposition model, has received increased interest in recent years because of its ability to blindly extract meaningful patterns, in particular in Music Information Retrieval. Nevertheless, existing algorithms to compute NTD are mostly designed for the Euclidean loss. This work proposes a multiplicative updates algorithm to compute NTD with the beta-divergence loss, often considered a better loss for audio processing. We notably show how to implement the multiplicative rules efficiently using tensor algebra. Finally, we show on a music structure analysis task that unsupervised NTD fitted with beta-divergence loss outperforms earlier results obtained with the Euclidean loss.

2.3 · SP · Jul 8, 2020

Multi-Resolution Beta-Divergence NMF for Blind Spectral Unmixing

Valentin Leplat, Nicolas Gillis, Cédric Févotte

Many datasets are obtained as a resolution trade-off between two adversarial dimensions; for example between the frequency and the temporal resolutions for the spectrogram of an audio signal, and between the number of wavelengths and the spatial resolution for a hyper/multi-spectral image. To perform blind source separation using observations with different resolutions, a standard approach is to use coupled nonnegative matrix factorizations (NMF). Most previous works have focused on the least squares error measure, which is the $β$-divergence for $β= 2$. In this paper, we formulate this multi-resolution NMF problem for any $β$-divergence, and propose an algorithm based on multiplicative updates (MU). We show in numerical experiments that the MU are able to obtain high resolutions in both dimensions on two applications: (1) blind unmixing of audio spectrograms: to the best of our knowledge, this is the first time a coupled NMF model is used in this context, and (2) the fusion of hyperspectral and multispectral images: we show that the MU compete favorably with state-of-the-art algorithms, in particular in the presence of non-Gaussian noise.
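
For reference, the $β$-divergence family used throughout these abstracts can be written as a small function. The sketch below uses the standard scalar definition, with the Kullback-Leibler ($β = 1$) and Itakura-Saito ($β = 0$) divergences as limit cases and half the squared error at $β = 2$; the function names are assumptions made for this example.

```python
import math

def beta_divergence(x, y, beta):
    """Scalar beta-divergence d_beta(x | y) for x, y > 0."""
    if beta == 1:  # Kullback-Leibler divergence (limit beta -> 1)
        return x * math.log(x / y) - x + y
    if beta == 0:  # Itakura-Saito divergence (limit beta -> 0)
        return x / y - math.log(x / y) - 1
    return (x ** beta + (beta - 1) * y ** beta
            - beta * x * y ** (beta - 1)) / (beta * (beta - 1))

def matrix_beta_divergence(V, WH, beta):
    """Sum of entrywise beta-divergences between two same-shaped matrices."""
    return sum(beta_divergence(v, w, beta)
               for rv, rw in zip(V, WH) for v, w in zip(rv, rw))

# At beta = 2 the divergence is half the squared error: (3 - 1)^2 / 2.
print(beta_divergence(3.0, 1.0, 2))
```

An NMF objective is then `matrix_beta_divergence(V, WH, beta)`, where `WH` is the current low-rank approximation of the data matrix `V`.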

13.4 · SP · Jul 4, 2019

Blind Audio Source Separation with Minimum-Volume Beta-Divergence NMF

Valentin Leplat, Nicolas Gillis, Man Shun Ang

Considering a mixed signal composed of various audio sources and recorded with a single microphone, we consider in this paper the blind audio source separation problem, which consists in isolating and extracting each of the sources. To perform this task, nonnegative matrix factorization (NMF) based on the Kullback-Leibler and Itakura-Saito $β$-divergences is a standard and state-of-the-art technique that uses the time-frequency representation of the signal. We present a new NMF model better suited for this task. It is based on the minimization of $β$-divergences along with a penalty term that promotes the columns of the dictionary matrix to have a small volume. Under some mild assumptions and in noiseless conditions, we prove that this model is able to identify the sources. In order to solve this problem, we propose multiplicative updates whose derivations are based on the standard majorization-minimization framework. We show on several numerical experiments that our new model is able to obtain more interpretable results than standard NMF models. Moreover, we show that it is able to recover the sources even when the number of sources present in the mixed signal is overestimated. In fact, our model automatically sets sources to zero in this situation, and hence performs model order selection automatically.

5.4 · LG · Jan 30, 2019

Distributionally Robust and Multi-Objective Nonnegative Matrix Factorization

Nicolas Gillis, Le Thi Khanh Hien, Valentin Leplat et al.

Nonnegative matrix factorization (NMF) is a linear dimensionality reduction technique for analyzing nonnegative data. A key aspect of NMF is the choice of the objective function that depends on the noise model (or statistics of the noise) assumed on the data. In many applications, the noise model is unknown and difficult to estimate. In this paper, we define a multi-objective NMF (MO-NMF) problem, where several objectives are combined within the same NMF model. We propose to use Lagrange duality to judiciously optimize for a set of weights to be used within the framework of the weighted-sum approach, that is, we minimize a single objective function which is a weighted sum of all the objective functions. We design a simple algorithm based on multiplicative updates to minimize this weighted sum. We show how this can be used to find distributionally robust NMF (DR-NMF) solutions, that is, solutions that minimize the largest error among all objectives, using a dual approach solved via a heuristic inspired by the Frank-Wolfe algorithm. We illustrate the effectiveness of this approach on synthetic, document and audio data sets. The results show that DR-NMF is robust to our lack of knowledge of the noise model of the NMF problem.