Christine Sinoquet

LG
3papers
135citations
Novelty17%
AI Score16

3 Papers

LGMar 31, 2021
Time Series Analysis and Modeling to Forecast: a Survey

Fatoumata Dama, Christine Sinoquet

Time series modeling for predictive purpose has been an active research area of machine learning for many years. However, no sufficiently comprehensive and meanwhile substantive survey was offered so far. This survey strives to meet this need. A unified presentation has been adopted for entire parts of this compilation. A red thread guides the reader from time series preprocessing to forecasting. Time series decomposition is a major preprocessing task, to separate nonstationary effects (the deterministic components) from the remaining stochastic constituent, assumed to be stationary. The deterministic components are predictable and contribute to the prediction through estimations or extrapolation. Fitting the most appropriate model to the remaining stochastic component aims at capturing the relationship between past and future values, to allow prediction. We cover a sufficiently broad spectrum of models while nonetheless offering substantial methodological developments. We describe three major linear parametric models, together with two nonlinear extensions, and present five categories of nonlinear parametric models. Beyond conventional statistical models, we highlight six categories of deep neural networks appropriate for time series forecasting in nonlinear framework. Finally, we enlighten new avenues of research for time series modeling and forecasting. We also report software made publicly available for the models presented.

LGFeb 24, 2021
Partially Hidden Markov Chain Linear Autoregressive model: inference and forecasting

Fatoumata Dama, Christine Sinoquet

Time series subject to change in regime have attracted much interest in domains such as econometry, finance or meteorology. For discrete-valued regimes, some models such as the popular Hidden Markov Chain (HMC) describe time series whose state process is unknown at all time-steps. Sometimes, time series are firstly labelled thanks to some annotation function. Thus, another category of models handles the case with regimes observed at all time-steps. We present a novel model which addresses the intermediate case: (i) state processes associated to such time series are modelled by Partially Hidden Markov Chains (PHMCs); (ii) a linear autoregressive (LAR) model drives the dynamics of the time series, within each regime. We describe a variant of the expection maximization (EM) algorithm devoted to PHMC-LAR model learning. We propose a hidden state inference procedure and a forecasting function that take into account the observed states when existing. We assess inference and prediction performances, and analyze EM convergence times for the new model, using simulated data. We show the benefits of using partially observed states to decrease EM convergence times. A fully labelled scheme with unreliable labels also speeds up EM. This offers promising prospects to enhance PHMC-LAR model selection. We also point out the robustness of PHMC-LAR to labelling errors in inference task, when large training datasets and moderate labelling error rates are considered. Finally, we highlight the remarkable robustness to error labelling in the prediction task, over the whole range of error rates.

LGFeb 4, 2014
A Survey on Latent Tree Models and Applications

Raphaël Mourad, Christine Sinoquet, Nevin L. Zhang et al.

In data analysis, latent variables play a central role because they help provide powerful insights into a wide variety of phenomena, ranging from biological to human sciences. The latent tree model, a particular type of probabilistic graphical models, deserves attention. Its simple structure - a tree - allows simple and efficient inference, while its latent variables capture complex relationships. In the past decade, the latent tree model has been subject to significant theoretical and methodological developments. In this review, we propose a comprehensive study of this model. First we summarize key ideas underlying the model. Second we explain how it can be efficiently learned from data. Third we illustrate its use within three types of applications: latent structure discovery, multidimensional clustering, and probabilistic inference. Finally, we conclude and give promising directions for future researches in this field.