LG Sep 29, 2024
KODA: A Data-Driven Recursive Model for Time Series Forecasting and Data Assimilation using Koopman Operators
Ashutosh Singh, Ashish Singh, Tales Imbiriba et al.
Approaches based on Koopman operators have shown great promise in forecasting time series data generated by complex nonlinear dynamical systems (NLDS). Although such approaches are able to capture the latent state representation of an NLDS, they still face difficulty in long-term forecasting when applied to real-world data. Specifically, many real-world NLDS exhibit time-varying behavior, leading to nonstationarity that is hard to capture with such models. Furthermore, they lack a systematic data-driven approach to perform data assimilation, that is, exploiting noisy measurements on the fly in the forecasting task. To alleviate the above issues, we propose a Koopman operator-based approach (named KODA - Koopman Operator with Data Assimilation) that integrates forecasting and data assimilation in NLDS. In particular, we use a Fourier domain filter to disentangle the data into a physical component, whose dynamics can be accurately represented by a Koopman operator, and residual dynamics, which represent the local or time-varying behavior and are captured by a flexible and learnable recursive model. We carefully design an architecture and training criterion that ensure this decomposition leads to stable and long-term forecasts. Moreover, we introduce a course correction strategy to perform data assimilation with new measurements at inference time. The proposed approach is completely data-driven and can be learned end-to-end. Through extensive experimental comparisons, we show that KODA outperforms existing state-of-the-art methods on multiple time series benchmarks such as electricity, temperature, weather, Lorenz 63, and Duffing oscillator, demonstrating its superior performance and efficacy across three tasks: a) forecasting, b) data assimilation, and c) state prediction.
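The Fourier-domain split described in this abstract can be illustrated with a small sketch. It is not the KODA architecture: it assumes a fixed low-pass mask with a hand-picked `cutoff` (the abstract does not specify the filter), uses a naive DFT from the standard library, and omits the Koopman and recursive models entirely. The function name `split_by_frequency` and the toy signal are hypothetical.

```python
import cmath
import math

def dft(x):
    """Naive O(n^2) discrete Fourier transform."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    """Naive inverse discrete Fourier transform."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def split_by_frequency(x, cutoff):
    """Split x into a low-frequency part and a residual (assumed fixed mask)."""
    n = len(x)
    X = dft(x)
    # Bin k and bin n-k carry the same frequency, so keep both or neither.
    kept = [X[k] if min(k, n - k) <= cutoff else 0j for k in range(n)]
    smooth = [v.real for v in idft(kept)]
    residual = [a - b for a, b in zip(x, smooth)]
    return smooth, residual

# Toy signal: a slow oscillation plus a faster, smaller one.
n = 64
slow = [math.sin(2 * math.pi * 2 * t / n) for t in range(n)]
fast = [0.5 * math.sin(2 * math.pi * 20 * t / n) for t in range(n)]
signal = [s + f for s, f in zip(slow, fast)]

smooth, residual = split_by_frequency(signal, cutoff=5)
```

In this toy case the low-frequency part recovers the slow oscillation and the residual recovers the fast one. In a KODA-style pipeline, the two parts would then be handed to separate models.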
LG Jun 20, 2025
Identifiability of Deep Polynomial Neural Networks
Konstantin Usevich, Ricardo Borsoi, Clara Dérand et al.
Polynomial Neural Networks (PNNs) possess a rich algebraic and geometric structure. However, their identifiability -- a key property for ensuring interpretability -- remains poorly understood. In this work, we present a comprehensive analysis of the identifiability of deep PNNs, including architectures with and without bias terms. Our results reveal an intricate interplay between activation degrees and layer widths in achieving identifiability. As special cases, we show that architectures with non-increasing layer widths are generically identifiable under mild conditions, while encoder-decoder networks are identifiable when the decoder widths do not grow too rapidly compared to the activation degrees. Our proofs are constructive and center on a connection between deep PNNs and low-rank tensor decompositions, together with Kruskal-type uniqueness theorems. We also settle an open conjecture on the dimension of PNNs' neurovarieties, and provide new bounds on the activation degrees required for them to reach the expected dimension.
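As background for what identifiability means here, the sketch below shows the trivial ambiguities that any such analysis has to set aside: reordering hidden neurons, or rescaling one neuron and compensating in the next layer, leaves the network function unchanged. The two-layer bias-free network with a degree-2 monomial activation, and all weights, are hypothetical illustrations and are not taken from the paper.

```python
def pnn(W1, W2, x, degree=2):
    """Two-layer PNN without bias: W2 @ (W1 @ x) ** degree (elementwise power)."""
    hidden = [sum(w * xi for w, xi in zip(row, x)) ** degree for row in W1]
    return [sum(w * h for w, h in zip(row, hidden)) for row in W2]

# Hypothetical weights: 2 inputs, 2 hidden neurons, 1 output.
W1 = [[1.0, 2.0], [0.5, -1.0]]
W2 = [[1.0, -2.0]]
x = [0.3, -0.7]

# Scaling ambiguity: multiply neuron 0's input weights by a and divide
# its output weight by a ** degree.
a = 2.0
W1_scaled = [[a * w for w in W1[0]], W1[1]]
W2_scaled = [[W2[0][0] / a ** 2, W2[0][1]]]

# Permutation ambiguity: swap the two hidden neurons consistently.
W1_perm = [W1[1], W1[0]]
W2_perm = [[W2[0][1], W2[0][0]]]

y = pnn(W1, W2, x)
y_scaled = pnn(W1_scaled, W2_scaled, x)
y_perm = pnn(W1_perm, W2_perm, x)
```

All three parameter sets compute the same function, so identifiability can only be expected up to such permutations and rescalings.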
LG Aug 25, 2025
Low-Rank Tensor Decompositions for the Theory of Neural Networks
Ricardo Borsoi, Konstantin Usevich, Marianne Clausel
The groundbreaking performance of deep neural networks (NNs) has prompted a surge of interest in providing a mathematical basis to deep learning theory. Low-rank tensor decompositions are especially well suited to this task due to their close connection to NNs and their rich theoretical results. Different tensor decompositions have strong uniqueness guarantees, which allow for a direct interpretation of their factors, and polynomial time algorithms have been proposed to compute them. Through the connections between tensors and NNs, such results supported many important advances in the theory of NNs. In this review, we show how low-rank tensor methods--which have been a core tool in the signal processing and machine learning communities--play a fundamental role in theoretically explaining different aspects of the performance of deep NNs, including their expressivity, algorithmic learnability and computational hardness, generalization, and identifiability. Our goal is to give an accessible overview of existing approaches (developed by different communities, ranging from computer science to mathematics) in a coherent and unified way, and to open a broader perspective on the use of low-rank tensor decompositions for the theory of deep NNs.
LG Aug 25, 2025
Riemannian Change Point Detection on Manifolds with Robust Centroid Estimation
Xiuheng Wang, Ricardo Borsoi, Arnaud Breloy et al.
Non-parametric change-point detection in streaming time series data is a long-standing challenge in signal processing. Recent advancements in statistics and machine learning have increasingly addressed this problem for data residing on Riemannian manifolds. One prominent strategy involves monitoring abrupt changes in the center of mass of the time series. Implemented in a streaming fashion, this strategy, however, requires careful step size tuning when computing the updates of the center of mass. In this paper, we propose to leverage robust centroids on manifolds from M-estimation theory to address this issue. Our proposal consists of comparing two centroid estimates: the classical Karcher mean (sensitive to change) versus one defined from Huber's function (robust to change). This comparison leads to the definition of a test statistic whose performance is less sensitive to the underlying estimation method. We propose a stochastic Riemannian optimization algorithm to estimate both robust centroids efficiently. Experiments conducted on both simulated and real-world data across two representative manifolds demonstrate the superior performance of our proposed method.
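The comparison between a change-sensitive and a change-robust centroid can be sketched on the unit sphere. This is a simplified batch illustration, not the paper's stochastic streaming algorithm: it assumes iteratively reweighted fixed-point updates with Huber weights `min(1, c / d)`, a hand-picked threshold `huber_c`, and synthetic points. All function names are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def geodesic(p, q):
    """Great-circle distance between unit vectors."""
    return math.acos(max(-1.0, min(1.0, dot(p, q))))

def log_map(p, q):
    """Tangent vector at p pointing toward q, with length geodesic(p, q)."""
    c = max(-1.0, min(1.0, dot(p, q)))
    v = [b - c * a for a, b in zip(p, q)]  # component of q orthogonal to p
    nv = norm(v)
    if nv < 1e-12:
        return [0.0] * len(p)
    theta = math.acos(c)
    return [theta * a / nv for a in v]

def exp_map(p, v):
    """Move from p along tangent vector v, staying on the sphere."""
    nv = norm(v)
    if nv < 1e-12:
        return list(p)
    q = [math.cos(nv) * a + math.sin(nv) * b / nv for a, b in zip(p, v)]
    n = norm(q)
    return [a / n for a in q]

def centroid(points, huber_c=None, iters=100):
    """Karcher mean if huber_c is None, else a Huber-type robust centroid."""
    m = list(points[0])
    for _ in range(iters):
        logs = [log_map(m, x) for x in points]
        dists = [norm(v) for v in logs]
        if huber_c is None:
            w = [1.0] * len(points)
        else:
            w = [1.0 if d <= huber_c else huber_c / d for d in dists]
        total = sum(w)
        step = [sum(wi * v[j] for wi, v in zip(w, logs)) / total
                for j in range(len(m))]
        m = exp_map(m, step)
    return m

# Synthetic data: 8 points around the north pole, then 3 near the x-axis.
north = [0.0, 0.0, 1.0]
before = [[math.sin(0.1) * math.cos(k * math.pi / 4),
           math.sin(0.1) * math.sin(k * math.pi / 4),
           math.cos(0.1)] for k in range(8)]
after = [[1.0 / norm([1.0, 0.05 * s, 0.0]), 0.05 * s / norm([1.0, 0.05 * s, 0.0]), 0.0]
         for s in (-1, 0, 1)]
points = before + after

karcher = centroid(points)
huber = centroid(points, huber_c=0.3)
gap = geodesic(karcher, huber)  # a simple stand-in for a test statistic
```

The Karcher mean is pulled toward the new cluster while the Huber-type centroid stays near the old one, so the distance between them grows once the data changes.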
AS Apr 19, 2021
Robust parameter design for Wiener-based binaural noise reduction methods in hearing aids
Diego M. Carmo, Ricardo Borsoi, Márcio H. Costa
This work presents a method for designing the weighting parameter required by Wiener-based binaural noise reduction methods. This parameter establishes the desired tradeoff between noise reduction and binaural cue preservation in hearing aid applications. The proposed strategy was specifically derived for the preservation of interaural level difference, interaural time difference and interaural coherence binaural cues. It is defined as a function of the average input noise power at the microphones, providing robustness against the influence of joint changes in noise and speech power (Lombard effect), as well as against signal-to-noise ratio (SNR) variations. A theoretical framework, based on the mathematical definition of the homogeneity degree, is presented and applied to a generic augmented Wiener-based cost function. The theoretical insights obtained are supported by computational simulations and psychoacoustic experiments using the multichannel Wiener filter with interaural transfer function preservation technique (MWF-ITF) as a case study. Statistical analysis indicates that the proposed dynamic structure for the weighting parameter and the design method of its fixed part provide significant robustness against changes in the original binaural cues of both speech and residual noise, at the cost of a small decrease in the noise reduction performance, as compared to the use of a purely fixed weighting parameter.