Hermanni Hälvä

h-index3

5papers

203citations

Novelty72%

AI Score37

Ranked #94,926 of 194,257 authors (top 49%)#1,324 in ML (top 39%)

5 Papers

4.3MLNov 28, 2023

Identifiable Feature Learning for Spatial Data with Nonlinear ICA

Hermanni Hälvä, Jonathan So, Richard E. Turner et al.

Recently, nonlinear ICA has surfaced as a popular alternative to the many heuristic models used in deep representation learning and disentanglement. An advantage of nonlinear ICA is that a sophisticated identifiability theory has been developed; in particular, it has been proven that the original components can be recovered under sufficiently strong latent dependencies. Despite this general theory, practical nonlinear ICA algorithms have so far been mainly limited to data with one-dimensional latent dependencies, especially time-series data. In this paper, we introduce a new nonlinear ICA framework that employs $t$-process (TP) latent components which apply naturally to data with higher-dimensional dependency structures, such as spatial and spatio-temporal data. In particular, we develop a new learning and inference algorithm that extends variational inference methods to handle the combination of a deep neural network mixing function with the TP prior, and employs the method of inducing points for computational efficacy. On the theoretical side, we show that such TP independent components are identifiable under very general conditions. Further, Gaussian Process (GP) nonlinear ICA is established as a limit of the TP Nonlinear ICA model, and we prove that the identifiability of the latent components at this GP limit is more restricted. Namely, those components are identifiable if and only if they have distinctly different covariance kernels. Our algorithm and identifiability theorems are explored on simulated spatial data and real world spatio-temporal data.

25.5MLJun 17, 2021Code

Disentangling Identifiable Features from Noisy Data with Structured Nonlinear ICA

Hermanni Hälvä, Sylvain Le Corff, Luc Lehéricy et al.

We introduce a new general identifiable framework for principled disentanglement referred to as Structured Nonlinear Independent Component Analysis (SNICA). Our contribution is to extend the identifiability theory of deep generative models for a very broad class of structured models. While previous works have shown identifiability for specific classes of time-series models, our theorems extend this to more general temporal structures as well as to models with more complex structures such as spatial dependencies. In particular, we establish the major result that identifiability for this framework holds even in the presence of noise of unknown distribution. Finally, as an example of our framework's flexibility, we introduce the first nonlinear ICA model for time-series that combines the following very useful properties: it accounts for both nonstationarity and autocorrelation in a fully unsupervised setting; performs dimensionality reduction; models hidden states; and enables principled estimation and inference by variational maximum-likelihood.

24.1MLJun 22, 2020Code

Hidden Markov Nonlinear ICA: Unsupervised Learning from Nonstationary Time Series

Hermanni Hälvä, Aapo Hyvärinen

Recent advances in nonlinear Independent Component Analysis (ICA) provide a principled framework for unsupervised feature learning and disentanglement. The central idea in such works is that the latent components are assumed to be independent conditional on some observed auxiliary variables, such as the time-segment index. This requires manual segmentation of data into non-stationary segments which is computationally expensive, inaccurate and often impossible. These models are thus not fully unsupervised. We remedy these limitations by combining nonlinear ICA with a Hidden Markov Model, resulting in a model where a latent state acts in place of the observed segment-index. We prove identifiability of the proposed model for a general mixing nonlinearity, such as a neural network. We also show how maximum likelihood estimation of the model can be done using the expectation-maximization algorithm. Thus, we achieve a new nonlinear ICA framework which is unsupervised, more efficient, as well as able to model underlying temporal dynamics.

12.0MLJun 19, 2020

Independent Innovation Analysis for Nonlinear Vector Autoregressive Process

Hiroshi Morioka, Hermanni Hälvä, Aapo Hyvärinen

The nonlinear vector autoregressive (NVAR) model provides an appealing framework to analyze multivariate time series obtained from a nonlinear dynamical system. However, the innovation (or error), which plays a key role by driving the dynamics, is almost always assumed to be additive. Additivity greatly limits the generality of the model, hindering analysis of general NVAR processes which have nonlinear interactions between the innovations. Here, we propose a new general framework called independent innovation analysis (IIA), which estimates the innovations from completely general NVAR. We assume mutual independence of the innovations as well as their modulation by an auxiliary variable (which is often taken as the time index and simply interpreted as nonstationarity). We show that IIA guarantees the identifiability of the innovations with arbitrary nonlinearities, up to a permutation and component-wise invertible nonlinearities. We also propose three estimation frameworks depending on the type of the auxiliary variable. We thus provide the first rigorous identifiability result for general NVAR, as well as very general tools for learning such models.

1.8NEDec 5, 2018

Computational Graph Approach for Detection of Composite Human Activities

Niko Reunanen, Ville Könönen, Hermanni Hälvä et al.

Existing work in human activity detection classifies physical activities using a single fixed-length subset of a sensor signal. However, temporally consecutive subsets of a sensor signal are not utilized. This is not optimal for classifying physical activities (composite activities) that are composed of a temporal series of simpler activities (atomic activities). A sport consists of physical activities combined in a fashion unique to that sport. The constituent physical activities and the sport are not fundamentally different. We propose a computational graph architecture for human activity detection based on the readings of a triaxial accelerometer. The resulting model learns 1) a representation of the atomic activities of a sport and 2) to classify physical activities as compositions of the atomic activities. The proposed model, alongside with a set of baseline models, was tested for a simultaneous classification of eight physical activities (walking, nordic walking, running, soccer, rowing, bicycling, exercise bicycling and lying down). The proposed model obtained an overall mean accuracy of 77.91% (population) and 95.28% (personalized). The corresponding accuracies of the best baseline model were 73.52% and 90.03%. However, without combining consecutive atomic activities, the corresponding accuracies of the proposed model were 71.52% and 91.22%. The results show that our proposed model is accurate, outperforms the baseline models and learns to combine simple activities into complex activities. Composite activities can be classified as combinations of atomic activities. Our proposed architecture is a basis for accurate models in human activity detection.