SP · Apr 6, 2024
Time topological analysis of EEG using signature theory
Stéphane Chrétien, Ben Gao, Astrid Thebault-Guiochon et al.
Anomaly detection in multivariate signals is a task of paramount importance in many disciplines (epidemiology, finance, cognitive sciences and neurosciences, oncology, etc.). In this perspective, Topological Data Analysis (TDA) offers a battery of "shape" invariants that can be exploited to implement an effective detection scheme. Our contribution extends the constructions presented in \cite{chretienleveraging}, which build simplicial complexes from the Signatures of signals and their predictive capacities rather than from a generic distance as in \cite{petri2014homological}. Signature theory is a new theme in Machine Learning (arXiv:1603.03788), stemming from recent work on Rough Paths developed by Terry Lyons and his team \cite{lyons2002system}, based on the formalism introduced by Chen \cite{chen1957integration}. We explore in particular the detection of changes in topology, based on tracking the evolution of homological persistence and the Betti numbers associated with the complex introduced in \cite{chretienleveraging}. We apply our tools to the analysis of brain signals such as EEG to detect precursor phenomena to epileptic seizures.
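To make the notion of a signature concrete, here is a minimal sketch of the first two signature levels of a piecewise-linear path, accumulated segment by segment with Chen's identity. The function name `signature_depth2` and the example path are illustrative assumptions; this is not the construction of simplicial complexes used in the paper, only the basic feature it starts from.

```python
from typing import List, Sequence, Tuple


def signature_depth2(
    path: Sequence[Sequence[float]],
) -> Tuple[List[float], List[List[float]]]:
    """Levels 1 and 2 of the signature of the piecewise-linear path through `path`.

    Hypothetical helper for illustration only.
    """
    d = len(path[0])
    level1 = [0.0] * d
    level2 = [[0.0] * d for _ in range(d)]
    for prev, curr in zip(path, path[1:]):
        inc = [c - p for p, c in zip(prev, curr)]
        # Chen's identity for appending a straight segment with increment `inc`:
        # new level 2 = old level 2 + (old level 1) ⊗ inc + (inc ⊗ inc) / 2.
        for i in range(d):
            for j in range(d):
                level2[i][j] += level1[i] * inc[j] + 0.5 * inc[i] * inc[j]
        # Level 1 is the total increment of the path.
        for i in range(d):
            level1[i] += inc[i]
    return level1, level2


# An L-shaped path in the plane: right by 1, then up by 1.
s1, s2 = signature_depth2([(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)])
# The antisymmetric part of level 2 is the Lévy area enclosed with the chord.
levy_area = 0.5 * (s2[0][1] - s2[1][0])
```

The level-2 terms are sensitive to the order in which the coordinates move, which is what makes them useful as features for multivariate time series.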
LG · Nov 16, 2025
Conformal Online Learning of Deep Koopman Linear Embeddings
Ben Gao, Jordan Patracone, Stéphane Chrétien et al.
We introduce Conformal Online Learning of Koopman embeddings (COLoKe), a novel framework for adaptively updating Koopman-invariant representations of nonlinear dynamical systems from streaming data. Our modeling approach combines deep feature learning with multistep prediction consistency in the lifted space, where the dynamics evolve linearly. To prevent overfitting, COLoKe employs a conformal-style mechanism that shifts the focus from evaluating the conformity of new states to assessing the consistency of the current Koopman model. Updates are triggered only when the current model's prediction error exceeds a dynamically calibrated threshold, allowing selective refinement of the Koopman operator and embedding. Empirical results on benchmark dynamical systems demonstrate the effectiveness of COLoKe in maintaining long-term predictive accuracy while significantly reducing unnecessary updates and avoiding overfitting.
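The update rule described above, refining the model only when its prediction error exceeds a calibrated threshold, can be sketched as follows. This is a generic conformal-style trigger written under our own assumptions (sliding window of past errors, empirical quantile at level `1 - alpha`, unconditional updates during a warm-up phase); the class name and parameters are hypothetical and are not taken from COLoKe.

```python
import math
from collections import deque
from typing import Deque


class ConformalUpdateTrigger:
    """Decide whether a streaming model should be updated (illustrative sketch)."""

    def __init__(self, alpha: float = 0.1, window: int = 50, warmup: int = 10) -> None:
        self.alpha = alpha
        self.warmup = warmup
        # Sliding window of recent prediction errors used for calibration.
        self.scores: Deque[float] = deque(maxlen=window)

    def threshold(self) -> float:
        """Conformal quantile of the stored errors (infinite if too few scores)."""
        n = len(self.scores)
        rank = math.ceil((n + 1) * (1.0 - self.alpha))
        if n == 0 or rank > n:
            return math.inf
        return sorted(self.scores)[rank - 1]

    def should_update(self, error: float) -> bool:
        """Return True if the model should be refined on the current sample."""
        if len(self.scores) < self.warmup:
            trigger = True  # assumption: always update until calibrated
        else:
            trigger = error > self.threshold()
        self.scores.append(error)
        return trigger


trigger = ConformalUpdateTrigger(alpha=0.5, window=10, warmup=4)
decisions = [trigger.should_update(e) for e in [1.0, 2.0, 3.0, 4.0, 2.5, 10.0]]
```

In an online loop, `error` would be the multistep prediction error of the current Koopman model on the newest data, and a `True` decision would start a round of gradient updates.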
CL · Jul 25, 2025
SigBERT: Combining Narrative Medical Reports and Rough Path Signature Theory for Survival Risk Estimation in Oncology
Paul Minchella, Loïc Verlingue, Stéphane Chrétien et al.
Electronic medical reports (EHR) contain a vast amount of information that can be leveraged for machine learning applications in healthcare. However, existing survival analysis methods often struggle to effectively handle the complexity of textual data, particularly in its sequential form. Here, we propose SigBERT, an innovative temporal survival analysis framework designed to efficiently process a large number of clinical reports per patient. SigBERT processes timestamped medical reports by extracting and averaging word embeddings into sentence embeddings. To capture temporal dynamics from the time series of sentence embedding coordinates, we apply signature extraction from rough path theory to derive geometric features for each patient, which significantly enhance survival model performance. These features are then integrated into a LASSO-penalized Cox model to estimate patient-specific risk scores. The model was trained and evaluated on a real-world oncology dataset from the Léon Bérard Center corpus, with a C-index score of 0.75 (sd 0.014) on the independent test cohort. SigBERT integrates sequential medical data to enhance risk estimation, advancing narrative-based survival analysis.
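For readers unfamiliar with the C-index reported above, the following sketch computes Harrell's concordance index for right-censored data: among comparable patient pairs, it is the fraction in which the patient with the earlier observed event has the higher predicted risk. The function name and the toy data are illustrative assumptions and do not come from the SigBERT evaluation.

```python
from typing import Sequence


def concordance_index(
    times: Sequence[float], events: Sequence[bool], risks: Sequence[float]
) -> float:
    """Harrell's C-index: share of comparable pairs ranked correctly by risk."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier event in a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0  # higher risk, earlier event: concordant
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied risk scores count for half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy cohort: the third patient is censored at time 6.
c_index = concordance_index(
    times=[2.0, 4.0, 6.0, 8.0],
    events=[True, True, False, True],
    risks=[0.9, 0.4, 0.5, 0.1],
)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking of patients by risk.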
QM · Jun 10, 2025
Detecting malignant dynamics on very few blood sample using signature coefficients
Rémi Vaucher, Stéphane Chrétien
Recent discoveries have suggested that the promising avenue of using circulating tumor DNA (ctDNA) levels in blood samples provides reasonable accuracy for cancer monitoring, with extremely low burden on the patient's side. It is known that the presence of ctDNA can result from various mechanisms leading to DNA release from cells, such as apoptosis, necrosis or active secretion. One key idea in recent cancer monitoring studies is that monitoring the dynamics of ctDNA levels might be sufficient for early multi-cancer detection. This interesting idea has been turned into commercial products, e.g. by the company GRAIL. In the present work, we propose to explore the use of Signature theory for detecting aggressive cancer tumors based on the analysis of blood samples. Our approach combines tools from continuous time Markov modelling for the dynamics of ctDNA levels in the blood, with Signature theory for building efficient testing procedures. Signature theory is a topic of growing interest in the Machine Learning community (see Chevyrev2016 and Fermanian2021), and is now recognised as a powerful feature extraction tool for irregularly sampled signals. The method proposed in the present paper is shown to correctly address the challenging problem of overcoming the inherent data scarcity due to the extremely small number of blood samples per patient. The relevance of our approach is illustrated with extensive numerical experiments that confirm the efficiency of the proposed pipeline.
OC · Jan 16, 2024
Registration of algebraic varieties using Riemannian optimization
Florentin Goyens, Coralia Cartis, Stéphane Chrétien
We consider the point cloud registration problem, the task of finding a transformation between two point clouds that represent the same object but are expressed in different coordinate systems. Our approach is not based on a point-to-point correspondence, matching every point in the source point cloud to a point in the target point cloud. Instead, we assume and leverage a low-dimensional nonlinear geometric structure of the data. Firstly, we approximate each point cloud by an algebraic variety (a set defined by finitely many polynomial equations). This is done by solving an optimization problem on the Grassmann manifold, using a connection between algebraic varieties and polynomial bases. Secondly, we solve an optimization problem on the orthogonal group to find the transformation (rotation $+$ translation) which makes the two algebraic varieties overlap. We use second-order Riemannian optimization methods for the solution of both steps. Numerical experiments on real and synthetic data are provided, with encouraging results. Our approach is particularly useful when the two point clouds describe different parts of an object (which may not even be overlapping), on the condition that the surface of the object may be well approximated by a set of polynomial equations. The first procedure -- the approximation -- is of independent interest, as it can be used for denoising data that belongs to an algebraic variety. We provide statistical guarantees for the estimation error of the denoising using Stein's unbiased estimator.
ML · May 13, 2023
Convergence and scaling of Boolean-weight optimization for hardware reservoirs
Louis Andreoli, Stéphane Chrétien, Xavier Porte et al.
Hardware implementations of neural networks are an essential step to implement next generation efficient and powerful artificial intelligence solutions. Besides the realization of a parallel, efficient and scalable hardware architecture, the optimization of the system's extremely large parameter space with sampling-efficient approaches is essential. Here, we analytically derive the scaling laws for highly efficient Coordinate Descent applied to optimizing the readout layer of a random recurrently connected neural network, a reservoir. We demonstrate that the convergence is exponential and scales linearly with the network's number of neurons. Our results perfectly reproduce the convergence and scaling of a large-scale photonic reservoir implemented in a proof-of-concept experiment. Our work therefore provides a solid foundation for such optimization in hardware networks, and identifies future directions that are promising for optimizing convergence speed during learning leveraging measures of a neural network's amplitude statistics and the weight update rule.
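The idea of coordinate descent over Boolean readout weights can be illustrated with a small simulation: flip one weight at a time and keep the flip only if the mean squared error decreases. The sketch below rests on our own assumptions (random synthetic reservoir states, a readout that sums the selected states, random visiting order) and is not the authors' experimental protocol; all names are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def boolean_coordinate_descent(
    states: Sequence[Sequence[float]],
    targets: Sequence[float],
    sweeps: int = 5,
    seed: int = 0,
) -> Tuple[List[bool], List[float]]:
    """Greedy one-weight-at-a-time optimisation of a Boolean readout (sketch)."""
    rng = random.Random(seed)
    n = len(states[0])
    weights = [rng.random() < 0.5 for _ in range(n)]

    def error(w: Sequence[bool]) -> float:
        # Mean squared error of the readout: sum of the states whose weight is True.
        total = 0.0
        for x, t in zip(states, targets):
            out = sum(xi for xi, wi in zip(x, w) if wi)
            total += (out - t) ** 2
        return total / len(states)

    best = error(weights)
    history = [best]
    for _ in range(sweeps):
        for k in rng.sample(range(n), n):  # visit every weight once per sweep
            weights[k] = not weights[k]  # tentative flip
            trial = error(weights)
            if trial < best:
                best = trial  # keep the flip
            else:
                weights[k] = not weights[k]  # revert the flip
            history.append(best)
    return weights, history


# Synthetic task: targets are produced by a hidden Boolean readout.
rng = random.Random(1)
true_w = [rng.random() < 0.5 for _ in range(8)]
states = [[rng.uniform(-1.0, 1.0) for _ in range(8)] for _ in range(40)]
targets = [sum(x for x, w in zip(s, true_w) if w) for s in states]
weights, history = boolean_coordinate_descent(states, targets)
```

Because a flip is accepted only when it lowers the error, `history` is non-increasing by construction; its decay rate is the kind of quantity the scaling laws in the abstract describe.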
NE · Mar 27, 2020
Boolean learning under noise-perturbations in hardware neural networks
Louis Andreoli, Xavier Porte, Stéphane Chrétien et al.
A high efficiency hardware integration of neural networks benefits from realizing nonlinearity, network connectivity and learning fully in a physical substrate. Multiple systems have recently implemented some or all of these operations, yet the focus was placed on addressing technological challenges. Fundamental questions regarding learning in hardware neural networks remain largely unexplored. Noise in particular is unavoidable in such architectures, and here we investigate its interaction with a learning algorithm using an opto-electronic recurrent neural network. We find that noise strongly modifies the system's path during convergence, and surprisingly fully decorrelates the final readout weight matrices. This highlights the importance of understanding architecture, noise and learning algorithm as interacting players, and therefore identifies the need for mathematical tools for noisy, analogue system optimization.
LG · Mar 11, 2019
Revisiting clustering as matrix factorisation on the Stiefel manifold
Stéphane Chrétien, Benjamin Guedj
This paper studies clustering for possibly high dimensional data (e.g. images, time series, gene expression data, and many other settings), and rephrases it as low rank matrix estimation in the PAC-Bayesian framework. Our approach leverages the well known Burer-Monteiro factorisation strategy from large scale optimisation, in the context of low rank estimation. Moreover, our Burer-Monteiro factors are shown to lie on a Stiefel manifold. We propose a new generalized Bayesian estimator for this problem and prove novel prediction bounds for clustering. We also devise a componentwise Langevin sampler on the Stiefel manifold to compute this estimator.
AI · Jun 25, 2017
Finding optimal finite biological sequences over finite alphabets: the OptiFin toolbox
Régis Garnier, Christophe Guyeux, Stéphane Chrétien
In this paper, we present a toolbox for a specific optimization problem that frequently arises in bioinformatics or genomics. In this specific optimisation problem, the state space is a set of words of specified length over a finite alphabet. To each word is associated a score. The overall objective is to find the words which have the lowest possible score. This type of general optimization problem is encountered in e.g. 3D conformation optimisation for protein structure prediction, or largest core genes subset discovery based on best supported phylogenetic tree for a set of species. In order to solve this problem, we propose a toolbox that can be easily launched using MPI and embeds 3 well-known metaheuristics. The toolbox is fully parametrized and well documented. It has been specifically designed to be easily modified and possibly improved by the user depending on the application, and does not require the user to be a computer scientist. We show that the toolbox performs very well on two difficult practical problems.
ST · Jun 29, 2016
Small coherence implies the weak Null Space Property
Stéphane Chrétien, Zhen Wai Olivier Ho
In the Compressed Sensing community, it is well known that given a matrix $X \in \mathbb R^{n\times p}$ with $\ell_2$ normalized columns, the Restricted Isometry Property (RIP) implies the Null Space Property (NSP). It is also well known that a small Coherence $\mu$ implies a weak RIP, i.e. the singular values of $X_T$ lie between $1-\delta$ and $1+\delta$ for "most" index subsets $T \subset \{1,\ldots,p\}$ with size governed by $\mu$ and $\delta$. In this short note, we show that a small Coherence implies a weak Null Space Property, i.e. $\Vert h_T\Vert_2 \le C \ \Vert h_{T^c}\Vert_1/\sqrt{s}$ for most $T \subset \{1,\ldots,p\}$ with cardinality $|T|\le s$. We moreover prove some singular value perturbation bounds that may also prove useful for other applications.
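The coherence referred to above is the largest absolute inner product between two distinct $\ell_2$-normalized columns of $X$. A minimal sketch of its computation follows, assuming the matrix is given as a list of columns; the function name and the example matrix are illustrative.

```python
import math
from typing import List, Sequence


def coherence(columns: Sequence[Sequence[float]]) -> float:
    """Largest absolute inner product between distinct normalized columns."""
    unit: List[List[float]] = []
    for col in columns:
        norm = math.sqrt(sum(v * v for v in col))
        unit.append([v / norm for v in col])  # l2-normalize each column
    mu = 0.0
    for j in range(len(unit)):
        for k in range(j + 1, len(unit)):
            inner = sum(a * b for a, b in zip(unit[j], unit[k]))
            mu = max(mu, abs(inner))
    return mu


# Three columns in the plane: e1, e2 and their diagonal.
mu = coherence([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
```

Here the diagonal column makes a 45 degree angle with each axis, so the coherence equals $1/\sqrt{2}$; an orthonormal set of columns would give 0.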
ML · Jun 29, 2016
A Semi-Definite Programming approach to low dimensional embedding for unsupervised clustering
Stéphane Chrétien, Clément Dombry, Adrien Faivre
This paper proposes a variant of the method of Guédon and Vershynin for estimating the cluster matrix in the Mixture of Gaussians framework via Semi-Definite Programming. A clustering oriented embedding is deduced from this estimate. The procedure is suitable for very high dimensional data because it is based on pairwise distances only. Theoretical guarantees are provided and an eigenvalue optimisation approach is proposed for computing the embedding. The performance of the method is illustrated via Monte Carlo experiments and comparisons with other embeddings from the literature.
ST · Oct 17, 2012
Mixture model for designs in high dimensional regression and the LASSO
Mohamed Ibrahim Assoweh, Emmanuel Caron, Stéphane Chrétien
The LASSO is a recent technique for variable selection in the regression model $y = X\beta + z$, where $X\in \mathbb R^{n\times p}$ and $z$ is a centered Gaussian i.i.d. noise vector $\mathcal N(0,\sigma^2 I)$. The LASSO has been proved to achieve remarkable properties such as exact support recovery of sparse vectors when the columns are sufficiently incoherent and low prediction error under even less stringent conditions. However, many matrices do not satisfy small coherence in practical applications and the LASSO estimator may thus suffer from what is known as the slow rate regime. The goal of the present paper is to study the LASSO from a slightly different perspective by proposing a mixture model for the design matrix which is able to capture in a natural way the potentially clustered nature of the columns in many practical situations. In this model, the columns of the design matrix are drawn from a Gaussian mixture model. Instead of requiring incoherence for the design matrix $X$, we only require incoherence of the much smaller matrix of the mixture's centers. Our main result states that $X\beta$ can be estimated with the same precision as for incoherent designs except for a correction term depending on the maximal variance in the mixture model.