LGNov 28, 2022
You Can Have Better Graph Neural Networks by Not Training Weights at All: Finding Untrained GNNs TicketsTianjin Huang, Tianlong Chen, Meng Fang et al.
Recent works have impressively demonstrated that there exists a subnetwork in randomly initialized convolutional neural networks (CNNs) that can match the performance of the fully trained dense networks at initialization, without any optimization of the weights of the network (i.e., untrained networks). However, the presence of such untrained subnetworks in graph neural networks (GNNs) still remains mysterious. In this paper we carry out the first-of-its-kind exploration of discovering matching untrained GNNs. With sparsity as the core tool, we can find \textit{untrained sparse subnetworks} at the initialization, that can match the performance of \textit{fully trained dense} GNNs. Besides this already encouraging finding of comparable performance, we show that the found untrained subnetworks can substantially mitigate the GNN over-smoothing problem, hence becoming a powerful tool to enable deeper GNNs without bells and whistles. We also observe that such sparse untrained subnetworks have appealing performance in out-of-distribution detection and robustness of input perturbations. We evaluate our method across widely-used GNN architectures on various popular datasets including the Open Graph Benchmark (OGB).
LGAug 23, 2022
Lottery Pools: Winning More by Interpolating Tickets without Increasing Training or Inference CostLu Yin, Shiwei Liu, Meng Fang et al.
Lottery tickets (LTs) is able to discover accurate and sparse subnetworks that could be trained in isolation to match the performance of dense networks. Ensemble, in parallel, is one of the oldest time-proven tricks in machine learning to improve performance by combining the output of multiple independent models. However, the benefits of ensemble in the context of LTs will be diluted since ensemble does not directly lead to stronger sparse subnetworks, but leverages their predictions for a better decision. In this work, we first observe that directly averaging the weights of the adjacent learned subnetworks significantly boosts the performance of LTs. Encouraged by this observation, we further propose an alternative way to perform an 'ensemble' over the subnetworks identified by iterative magnitude pruning via a simple interpolating strategy. We call our method Lottery Pools. In contrast to the naive ensemble which brings no performance gains to each single subnetwork, Lottery Pools yields much stronger sparse subnetworks than the original LTs without requiring any extra training or inference cost. Across various modern architectures on CIFAR-10/100 and ImageNet, we show that our method achieves significant performance gains in both, in-distribution and out-of-distribution scenarios. Impressively, evaluated with VGG-16 and ResNet-18, the produced sparse subnetworks outperform the original LTs by up to 1.88% on CIFAR-100 and 2.36% on CIFAR-100-C; the resulting dense network surpasses the pre-trained dense-model up to 2.22% on CIFAR-100 and 2.38% on CIFAR-100-C.
LGMay 30, 2022
Superposing Many Tickets into One: A Performance Booster for Sparse Neural Network TrainingLu Yin, Vlado Menkovski, Meng Fang et al.
Recent works on sparse neural network training (sparse training) have shown that a compelling trade-off between performance and efficiency can be achieved by training intrinsically sparse neural networks from scratch. Existing sparse training methods usually strive to find the best sparse subnetwork possible in one single run, without involving any expensive dense or pre-training steps. For instance, dynamic sparse training (DST), is capable of reaching a competitive performance of dense training by iteratively evolving the sparse topology during the course of training. In this paper, we argue that it is better to allocate the limited resources to create multiple low-loss sparse subnetworks and superpose them into a stronger one, instead of allocating all resources entirely to find an individual subnetwork. To achieve this, two desiderata are required: (1) efficiently producing many low-loss subnetworks, the so-called cheap tickets, within one training process limited to the standard training time used in dense training; (2) effectively superposing these cheap tickets into one stronger subnetwork. To corroborate our conjecture, we present a novel sparse training approach, termed Sup-tickets, which can satisfy the above two desiderata concurrently in a single sparse-to-sparse training process. Across various modern architectures on CIFAR-10/100 and ImageNet, we show that Sup-tickets integrates seamlessly with the existing sparse training methods and demonstrates consistent performance improvement.
LGOct 26, 2022
Comparison of neural closure models for discretised PDEsHugo Melchers, Daan Crommelin, Barry Koren et al.
Neural closure models have recently been proposed as a method for efficiently approximating small scales in multiscale systems with neural networks. The choice of loss function and associated training procedure has a large effect on the accuracy and stability of the resulting neural closure model. In this work, we systematically compare three distinct procedures: "derivative fitting", "trajectory fitting" with discretise-then-optimise, and "trajectory fitting" with optimise-then-discretise. Derivative fitting is conceptually the simplest and computationally the most efficient approach and is found to perform reasonably well on one of the test problems (Kuramoto-Sivashinsky) but poorly on the other (Burgers). Trajectory fitting is computationally more expensive but is more robust and is therefore the preferred approach. Of the two trajectory fitting procedures, the discretise-then-optimise approach produces more accurate models than the optimise-then-discretise approach. While the optimise-then-discretise approach can still produce accurate models, care must be taken in choosing the length of the trajectories used for training, in order to train the models on long-term behaviour while still producing reasonably accurate gradients during training. Two existing theorems are interpreted in a novel way that gives insight into the long-term accuracy of a neural closure model based on how accurate it is in the short term.
LGApr 4, 2023
Equivariant Parameter Sharing for Porous Crystalline MaterialsMarko Petković, Pablo Romero-Marimon, Vlado Menkovski et al.
Efficiently predicting properties of porous crystalline materials has great potential to accelerate the high throughput screening process for developing new materials, as simulations carried out using first principles model are often computationally expensive. To effectively make use of Deep Learning methods to model these materials, we need to utilize the symmetries present in the crystals, which are defined by their space group. Existing methods for crystal property prediction either have symmetry constraints that are too restrictive or only incorporate symmetries between unit cells. In addition, these models do not explicitly model the porous structure of the crystal. In this paper, we develop a model which incorporates the symmetries of the unit cell of a crystal in its architecture and explicitly models the porous structure. We evaluate our model by predicting the heat of adsorption of CO$_2$ for different configurations of the mordenite zeolite. Our results confirm that our method performs better than existing methods for crystal property prediction and that the inclusion of pores results in a more efficient model.
CLOct 30, 2023
KeyGen2Vec: Learning Document Embedding via Multi-label Keyword Generation in Question-AnsweringIftitahu Ni'mah, Samaneh Khoshrou, Vlado Menkovski et al.
Representing documents into high dimensional embedding space while preserving the structural similarity between document sources has been an ultimate goal for many works on text representation learning. Current embedding models, however, mainly rely on the availability of label supervision to increase the expressiveness of the resulting embeddings. In contrast, unsupervised embeddings are cheap, but they often cannot capture implicit structure in target corpus, particularly for samples that come from different distribution with the pretraining source. Our study aims to loosen up the dependency on label supervision by learning document embeddings via Sequence-to-Sequence (Seq2Seq) text generator. Specifically, we reformulate keyphrase generation task into multi-label keyword generation in community-based Question Answering (cQA). Our empirical results show that KeyGen2Vec in general is superior than multi-label keyword classifier by up to 14.7% based on Purity, Normalized Mutual Information (NMI), and F1-Score metrics. Interestingly, although in general the absolute advantage of learning embeddings through label supervision is highly positive across evaluation datasets, KeyGen2Vec is shown to be competitive with classifier that exploits topic label supervision in Yahoo! cQA with larger number of latent topic labels.
LGNov 17, 2022
Neural Langevin Dynamics: towards interpretable Neural Stochastic Differential EquationsSimon M. Koop, Mark A. Peletier, Jacobus W. Portegies et al.
Neural Stochastic Differential Equations (NSDE) have been trained as both Variational Autoencoders, and as GANs. However, the resulting Stochastic Differential Equations can be hard to interpret or analyse due to the generic nature of the drift and diffusion fields. By restricting our NSDE to be of the form of Langevin dynamics, and training it as a VAE, we obtain NSDEs that lend themselves to more elaborate analysis and to a wider range of visualisation techniques than a generic NSDE. More specifically, we obtain an energy landscape, the minima of which are in one-to-one correspondence with latent states underlying the used data. This not only allows us to detect states underlying the data dynamics in an unsupervised manner, but also to infer the distribution of time spent in each state according to the learned SDE. More in general, restricting an NSDE to Langevin dynamics enables the use of a large set of tools from computational molecular dynamics for the analysis of the obtained results.
QMOct 2, 2022
Towards Learned Simulators for Cell MigrationKoen Minartz, Yoeri Poels, Vlado Menkovski
Simulators driven by deep learning are gaining popularity as a tool for efficiently emulating accurate but expensive numerical simulators. Successful applications of such neural simulators can be found in the domains of physics, chemistry, and structural biology, amongst others. Likewise, a neural simulator for cellular dynamics can augment lab experiments and traditional computational methods to enhance our understanding of a cell's interaction with its physical environment. In this work, we propose an autoregressive probabilistic model that can reproduce spatiotemporal dynamics of single cell migration, traditionally simulated with the Cellular Potts model. We observe that standard single-step training methods do not only lead to inconsistent rollout stability, but also fail to accurately capture the stochastic aspects of the dynamics, and we propose training strategies to mitigate these issues. Our evaluation on two proof-of-concept experimental scenarios shows that neural methods have the potential to faithfully simulate stochastic cellular dynamics at least an order of magnitude faster than a state-of-the-art implementation of the Cellular Potts model.
LGFeb 11
LOREN: Low Rank-Based Code-Rate Adaptation in Neural ReceiversBram Van Bolderik, Vlado Menkovski, Sonia Heemstra de Groot et al.
Neural network based receivers have recently demonstrated superior system-level performance compared to traditional receivers. However, their practicality is limited by high memory and power requirements, as separate weight sets must be stored for each code rate. To address this challenge, we propose LOREN, a Low Rank-Based Code-Rate Adaptation Neural Receiver that achieves adaptability with minimal overhead. LOREN integrates lightweight low rank adaptation adapters (LOREN adapters) into convolutional layers, freezing a shared base network while training only small adapters per code rate. An end-to-end training framework over 3GPP CDL channels ensures robustness across realistic wireless environments. LOREN achieves comparable or superior performance relative to fully retrained base neural receivers. The hardware implementation of LOREN in 22nm technology shows more than 65% savings in silicon area and up to 15% power reduction when supporting three code rates.
LGSep 2, 2024
Topological degree as a discrete diagnostic for disentanglement, with applications to the $Δ$VAEMahefa Ratsisetraina Ravelonanosy, Vlado Menkovski, Jacobus W. Portegies
We investigate the ability of Diffusion Variational Autoencoder ($Δ$VAE) with unit sphere $\mathcal{S}^2$ as latent space to capture topological and geometrical structure and disentangle latent factors in datasets. For this, we introduce a new diagnostic of disentanglement: namely the topological degree of the encoder, which is a map from the data manifold to the latent space. By using tools from homology theory, we derive and implement an algorithm that computes this degree. We use the algorithm to compute the degree of the encoder of models that result from the training procedure. Our experimental results show that the $Δ$VAE achieves relatively small LSBD scores, and that regardless of the degree after initialization, the degree of the encoder after training becomes $-1$ or $+1$, which implies that the resulting encoder is at least homotopic to a homeomorphism.
QMNov 29, 2023
Description Generation using Variational Auto-Encoders for precursor microRNAMarko Petković, Vlado Menkovski
Micro RNAs (miRNA) are a type of non-coding RNA, which are involved in gene regulation and can be associated with diseases such as cancer, cardiovascular and neurological diseases. As such, identifying the entire genome of miRNA can be of great relevance. Since experimental methods for novel precursor miRNA (pre-miRNA) detection are complex and expensive, computational detection using ML could be useful. Existing ML methods are often complex black boxes, which do not create an interpretable structural description of pre-miRNA. In this paper, we propose a novel framework, which makes use of generative modeling through Variational Auto-Encoders to uncover the generative factors of pre-miRNA. After training the VAE, the pre-miRNA description is developed using a decision tree on the lower dimensional latent space. Applying the framework to miRNA classification, we obtain a high reconstruction and classification performance, while also developing an accurate miRNA description.
LGNov 20, 2023
Node Classification in Random TreesWouter W. L. Nuijten, Vlado Menkovski
We propose a method for the classification of objects that are structured as random trees. Our aim is to model a distribution over the node label assignments in settings where the tree data structure is associated with node attributes (typically high dimensional embeddings). The tree topology is not predetermined and none of the label assignments are present during inference. Other methods that produce a distribution over node label assignment in trees (or more generally in graphs) either assume conditional independence of the label assignment, operate on a fixed graph topology, or require part of the node labels to be observed. Our method defines a Markov Network with the corresponding topology of the random tree and an associated Gibbs distribution. We parameterize the Gibbs distribution with a Graph Neural Network that operates on the random tree and the node embeddings. This allows us to estimate the likelihood of node assignments for a given random tree and use MCMC to sample from the distribution of node assignments. We evaluate our method on the tasks of node classification in trees on the Stanford Sentiment Treebank dataset. Our method outperforms the baselines on this dataset, demonstrating its effectiveness for modeling joint distributions of node labels in random trees.
LGMay 30, 2023Code
Dynamic Sparsity Is Channel-Level Sparsity LearnerLu Yin, Gen Li, Meng Fang et al.
Sparse training has received an upsurging interest in machine learning due to its tantalizing saving potential for the entire training process as well as inference. Dynamic sparse training (DST), as a leading sparse training approach, can train deep neural networks at high sparsity from scratch to match the performance of their dense counterparts. However, most if not all DST prior arts demonstrate their effectiveness on unstructured sparsity with highly irregular sparse patterns, which receives limited support in common hardware. This limitation hinders the usage of DST in practice. In this paper, we propose Channel-aware dynamic sparse (Chase), which for the first time seamlessly translates the promise of unstructured dynamic sparsity to GPU-friendly channel-level sparsity (not fine-grained N:M or group sparsity) during one end-to-end training process, without any ad-hoc operations. The resulting small sparse networks can be directly accelerated by commodity hardware, without using any particularly sparsity-aware hardware accelerators. This appealing outcome is partially motivated by a hidden phenomenon of dynamic sparsity: off-the-shelf unstructured DST implicitly involves biased parameter reallocation across channels, with a large fraction of channels (up to 60%) being sparser than others. By progressively identifying and removing these channels during training, our approach translates unstructured sparsity to channel-wise sparsity. Our experimental results demonstrate that Chase achieves 1.7 X inference throughput speedup on common GPU devices without compromising accuracy with ResNet-50 on ImageNet. We release our codes in https://github.com/luuyin/chase.
SOC-PHDec 2, 2024
Understanding complex crowd dynamics with generative neural simulatorsKoen Minartz, Fleur Hendriks, Simon Martinus Koop et al.
Understanding the dynamics of pedestrian crowds is an outstanding challenge crucial for designing efficient urban infrastructure and ensuring safe crowd management. To this end, both small-scale laboratory and large-scale real-world measurements have been used. However, these approaches respectively lack statistical resolution and parametric controllability, both essential to discovering physical relationships underlying the complex stochastic dynamics of crowds. Here, we establish an investigation paradigm that offers laboratory-like controllability, while ensuring the statistical resolution of large-scale real-world datasets. Using our data-driven Neural Crowd Simulator (NeCS), which we train on large-scale data and validate against key statistical features of crowd dynamics, we show that we can perform effective surrogate crowd dynamics experiments without training on specific scenarios. We not only reproduce known experimental results on pairwise avoidance, but also uncover the vision-guided and topological nature of N-body interactions. These findings show how virtual experiments based on neural simulation enable data-driven scientific discovery.
SOFTApr 26, 2024
Similarity Equivariant Graph Neural Networks for Homogenization of MetamaterialsFleur Hendriks, Vlado Menkovski, Martin Doškář et al.
Soft, porous mechanical metamaterials exhibit pattern transformations that may have important applications in soft robotics, sound reduction and biomedicine. To design these innovative materials, it is important to be able to simulate them accurately and quickly, in order to tune their mechanical properties. Since conventional simulations using the finite element method entail a high computational cost, in this article we aim to develop a machine learning-based approach that scales favorably to serve as a surrogate model. To ensure that the model is also able to handle various microstructures, including those not encountered during training, we include the microstructure as part of the network input. Therefore, we introduce a graph neural network that predicts global quantities (energy, stress stiffness) as well as the pattern transformations that occur (the kinematics). To make our model as accurate and data-efficient as possible, various symmetries are incorporated into the model. The starting point is an E(n)-equivariant graph neural network (which respects translation, rotation and reflection) that has periodic boundary conditions (i.e., it is in-/equivariant with respect to the choice of RVE), is scale in-/equivariant, can simulate large deformations, and can predict scalars, vectors as well as second and fourth order tensors (specifically energy, stress and stiffness). The incorporation of scale equivariance makes the model equivariant with respect to the similarities group, of which the Euclidean group E(n) is a subgroup. We show that this network is more accurate and data-efficient than graph neural networks with fewer symmetries. To create an efficient graph representation of the finite element discretization, we use only the internal geometrical hole boundaries from the finite element mesh to achieve a better speed-up and scaling with the mesh size.
PLASM-PHFeb 24, 2025
Robust Confinement State Classification with Uncertainty Quantification through Ensembled Data-Driven MethodsYoeri Poels, Cristina Venturini, Alessandro Pau et al.
Maximizing fusion performance in tokamaks relies on high energy confinement, often achieved through distinct operating regimes. The automated labeling of these confinement states is crucial to enable large-scale analyses or for real-time control applications. While this task becomes difficult to automate near state transitions or in marginal scenarios, much success has been achieved with data-driven models. However, these methods generally provide predictions as point estimates, and cannot adequately deal with missing and/or broken input signals. To enable wide-range applicability, we develop methods for confinement state classification with uncertainty quantification and model robustness. We focus on off-line analysis for TCV discharges, distinguishing L-mode, H-mode, and an in-between dithering phase (D). We propose ensembling data-driven methods on two axes: model formulations and feature sets. The former considers a dynamic formulation based on a recurrent Fourier Neural Operator-architecture and a static formulation based on gradient-boosted decision trees. These models are trained using multiple feature groupings categorized by diagnostic system or physical quantity. A dataset of 302 TCV discharges is fully labeled, and will be publicly released. We evaluate our method quantitatively using Cohen's kappa coefficient for predictive performance and the Expected Calibration Error for the uncertainty calibration. Furthermore, we discuss performance using a variety of common and alternative scenarios, the performance of individual components, out-of-distribution performance, cases of broken or missing signals, and evaluate conditionally-averaged behavior around different state transitions. Overall, the proposed method can distinguish L, D and H-mode with high performance, can cope with missing or broken signals, and provides meaningful uncertainty estimates.
LGMay 24, 2025
Flow Matching for Geometric Trajectory SimulationKiet Bennema ten Brinke, Koen Minartz, Vlado Menkovski
The simulation of N-body systems is a fundamental problem with applications in a wide range of fields, such as molecular dynamics, biochemistry, and pedestrian dynamics. Machine learning has become an invaluable tool for scaling physics-based simulators and developing models directly from experimental data. In particular, recent advances based on deep generative modeling and geometric deep learning have enabled probabilistic simulation by modeling complex distributions over trajectories while respecting the permutation symmetry that is fundamental to N-body systems. However, to generate realistic trajectories, existing methods must learn complex transformations starting from uninformed noise and do not allow for the exploitation of domain-informed priors. In this work, we propose STFlow to address this limitation. By leveraging flow matching and data-dependent couplings, STFlow facilitates physics-informed simulation of geometric trajectories without sacrificing model expressivity or scalability. Our evaluation on N-body dynamical systems, molecular dynamics, and pedestrian dynamics benchmarks shows that STFlow produces significantly lower prediction errors while enabling more efficient inference, highlighting the benefits of employing physics-informed prior distributions in probabilistic geometric trajectory modeling.
AISep 12, 2025
Towards Fully Automated Molecular Simulations: Multi-Agent Framework for Simulation Setup and Force Field ExtractionMarko Petković, Vlado Menkovski, Sofía Calero
Automated characterization of porous materials has the potential to accelerate materials discovery, but it remains limited by the complexity of simulation setup and force field selection. We propose a multi-agent framework in which LLM-based agents can autonomously understand a characterization task, plan appropriate simulations, assemble relevant force fields, execute them and interpret their results to guide subsequent steps. As a first step toward this vision, we present a multi-agent system for literature-informed force field extraction and automated RASPA simulation setup. Initial evaluations demonstrate high correctness and reproducibility, highlighting this approach's potential to enable fully autonomous, scalable materials characterization.
LGSep 3, 2025
Equivariant Flow Matching for Symmetry-Breaking Bifurcation ProblemsFleur Hendriks, Ondřej Rokoš, Martin Doškář et al.
Bifurcation phenomena in nonlinear dynamical systems often lead to multiple coexisting stable solutions, particularly in the presence of symmetry breaking. Deterministic machine learning models struggle to capture this multiplicity, averaging over solutions and failing to represent lower-symmetry outcomes. In this work, we propose a generative framework based on flow matching to model the full probability distribution over bifurcation outcomes. Our method enables direct sampling of multiple valid solutions while preserving system symmetries through equivariant modeling. We introduce a symmetric matching strategy that aligns predicted and target outputs under group actions, allowing accurate learning in equivariant settings. We validate our approach on a range of systems, from toy models to complex physical problems such as buckling beams and the Allen-Cahn equation. Our results demonstrate that flow matching significantly outperforms non-probabilistic and variational methods in capturing multimodal distributions and symmetry-breaking bifurcations, offering a principled and scalable solution for modeling multistability in high-dimensional systems.
COAug 23, 2025
Score Matching on Large Geometric Graphs for Cosmology GenerationDiana-Alexandra Onutu, Yue Zhao, Joaquin Vanschoren et al.
Generative models are a promising tool to produce cosmological simulations but face significant challenges in scalability, physical consistency, and adherence to domain symmetries, limiting their utility as alternatives to $N$-body simulations. To address these limitations, we introduce a score-based generative model with an equivariant graph neural network that simulates gravitational clustering of galaxies across cosmologies starting from an informed prior, respects periodic boundaries, and scales to full galaxy counts in simulations. A novel topology-aware noise schedule, crucial for large geometric graphs, is introduced. The proposed equivariant score-based model successfully generates full-scale cosmological point clouds of up to 600,000 halos, respects periodicity and a uniform prior, and outperforms existing diffusion models in capturing clustering statistics while offering significant computational advantages. This work advances cosmology by introducing a generative model designed to closely resemble the underlying gravitational clustering of structure formation, moving closer to physically realistic and efficient simulators for the evolution of large-scale structures in the universe.
PLASM-PHApr 24, 2025
Plasma State Monitoring and Disruption Characterization using Multimodal VAEsYoeri Poels, Alessandro Pau, Christian Donner et al.
When a plasma disrupts in a tokamak, significant heat and electromagnetic loads are deposited onto the surrounding device components. These forces scale with plasma current and magnetic field strength, making disruptions one of the key challenges for future devices. Unfortunately, disruptions are not fully understood, with many different underlying causes that are difficult to anticipate. Data-driven models have shown success in predicting them, but they only provide limited interpretability. On the other hand, large-scale statistical analyses have been a great asset to understanding disruptive patterns. In this paper, we leverage data-driven methods to find an interpretable representation of the plasma state for disruption characterization. Specifically, we use a latent variable model to represent diagnostic measurements as a low-dimensional, latent representation. We build upon the Variational Autoencoder (VAE) framework, and extend it for (1) continuous projections of plasma trajectories; (2) a multimodal structure to separate operating regimes; and (3) separation with respect to disruptive regimes. Subsequently, we can identify continuous indicators for the disruption rate and the disruptivity based on statistical properties of measurement data. The proposed method is demonstrated using a dataset of approximately 1600 TCV discharges, selecting for flat-top disruptions or regular terminations. We evaluate the method with respect to (1) the identified disruption risk and its correlation with other plasma properties; (2) the ability to distinguish different types of disruptions; and (3) downstream analyses. For the latter, we conduct a demonstrative study on identifying parameters connected to disruptions using counterfactual-like analysis. Overall, the method can adequately identify distinct operating regimes characterized by varying proximity to disruptions in an interpretable manner.
MTRL-SCIMar 26, 2025
Symmetry-Informed Graph Neural Networks for Carbon Dioxide Isotherm and Adsorption Prediction in Aluminum-Substituted ZeolitesMarko Petković, José-Manuel Vicent Luna, Elīza Beate Dinne et al.
Accurately predicting adsorption properties in nanoporous materials using Deep Learning models remains a challenging task. This challenge becomes even more pronounced when attempting to generalize to structures that were not part of the training data.. In this work, we introduce SymGNN, a graph neural network architecture that leverages material symmetries to improve adsorption property prediction. By incorporating symmetry operations into the message-passing mechanism, our model enhances parameter sharing across different zeolite topologies, leading to improved generalization. We evaluate SymGNN on both interpolation and generalization tasks, demonstrating that it successfully captures key adsorption trends, including the influence of both the framework and aluminium distribution on CO$_2$ adsorption. Furthermore, we apply our model to the characterization of experimental adsorption isotherms, using a genetic algorithm to infer likely aluminium distributions. Our results highlight the effectiveness of machine learning models trained on simulations for studying real materials and suggest promising directions for fine-tuning with experimental data and generative approaches for the inverse design of multifunctional nanomaterials.
LGFeb 4, 2025
Deep Neural Cellular Potts ModelsKoen Minartz, Tim d'Hondt, Leon Hillmann et al.
The cellular Potts model (CPM) is a powerful computational method for simulating collective spatiotemporal dynamics of biological cells. To drive the dynamics, CPMs rely on physics-inspired Hamiltonians. However, as first principles remain elusive in biology, these Hamiltonians only approximate the full complexity of real multicellular systems. To address this limitation, we propose NeuralCPM, a more expressive cellular Potts model that can be trained directly on observational data. At the core of NeuralCPM lies the Neural Hamiltonian, a neural network architecture that respects universal symmetries in collective cellular dynamics. Moreover, this approach enables seamless integration of domain knowledge by combining known biological mechanisms and the expressive Neural Hamiltonian into a hybrid model. Our evaluation with synthetic and real-world multicellular systems demonstrates that NeuralCPM is able to model cellular dynamics that cannot be accounted for by traditional analytical Hamiltonians.
CVDec 17, 2024
Aspect-Based Few-Shot LearningTim van Engeland, Lu Yin, Vlado Menkovski
We generalize the formulation of few-shot learning by introducing the concept of an aspect. In the traditional formulation of few-shot learning, there is an underlying assumption that a single "true" label defines the content of each data point. This label serves as a basis for the comparison between the query object and the objects in the support set. However, when a human expert is asked to execute the same task without a predefined set of labels, they typically consider the rest of the data points in the support set as context. This context specifies the level of abstraction and the aspect from which the comparison can be made. In this work, we introduce a novel architecture and training procedure that develops a context given the query and support set and implements aspect-based few-shot learning that is not limited to a predetermined set of classes. We demonstrate that our method is capable of forming and using an aspect for few-shot learning on the Geometric Shapes and Sprites dataset. The results validate the feasibility of our approach compared to traditional few-shot learning.
QMApr 12, 2024
VADA: a Data-Driven Simulator for Nanopore SequencingJonas Niederle, Simon Koop, Marc Pagès-Gallego et al.
Nanopore sequencing offers the ability for real-time analysis of long DNA sequences at a low cost, enabling new applications such as early detection of cancer. Due to the complex nature of nanopore measurements and the high cost of obtaining ground truth datasets, there is a need for nanopore simulators. Existing simulators rely on handcrafted rules and parameters and do not learn an internal representation that would allow for analysing underlying biological factors of interest. Instead, we propose VADA, a purely data-driven method for simulating nanopores based on an autoregressive latent variable model. We embed subsequences of DNA and introduce a conditional prior to address the challenge of a collapsing conditioning. We introduce an auxiliary regressor on the latent variable to encourage our model to learn an informative latent representation. We empirically demonstrate that our model achieves competitive simulation performance on experimental nanopore data. Moreover, we show we have learned an informative latent representation that is predictive of the DNA labels. We hypothesize that other biological factors of interest, beyond the DNA labels, can potentially be extracted from such a learned latent representation.
MTRL-SCIMar 19, 2024
Graph Neural Networks for Carbon Dioxide Adsorption Prediction in Aluminium-Exchanged ZeolitesMarko Petković, José Manuel Vicent-Luna, Vlado Menkovski et al.
The ability to efficiently predict adsorption properties of zeolites can be of large benefit in accelerating the design process of novel materials. The existing configuration space for these materials is wide, while existing molecular simulation methods are computationally expensive. In this work, we propose a model which is 4 to 5 orders of magnitude faster at adsorption properties compared to molecular simulations. To validate the model, we generated datasets containing various aluminium configurations for the MOR, MFI, RHO and ITW zeolites along with their heat of adsorptions and Henry coefficients for CO$_2$, obtained from Monte Carlo simulations. The predictions obtained from the Machine Learning model are in agreement with the values obtained from the Monte Carlo simulations, confirming that the model can be used for property prediction. Furthermore, we show that the model can be used for identifying adsorption sites. Finally, we evaluate the capability of our model for generating novel zeolite configurations by using it in combination with a genetic algorithm.
PLASM-PHMay 30, 2023
Fast Dynamic 1D Simulation of Divertor Plasmas with Neural PDE SurrogatesYoeri Poels, Gijs Derks, Egbert Westerhof et al.
Managing divertor plasmas is crucial for operating reactor scale tokamak devices due to heat and particle flux constraints on the divertor target. Simulation is an important tool to understand and control these plasmas, however, for real-time applications or exhaustive parameter scans only simple approximations are currently fast enough. We address this lack of fast simulators using neural PDE surrogates, data-driven neural network-based surrogate models trained using solutions generated with a classical numerical method. The surrogate approximates a time-stepping operator that evolves the full spatial solution of a reference physics-based model over time. We use DIV1D, a 1D dynamic model of the divertor plasma, as reference model to generate data. DIV1D's domain covers a 1D heat flux tube from the X-point (upstream) to the target. We simulate a realistic TCV divertor plasma with dynamics induced by upstream density ramps and provide an exploratory outlook towards fast transients. State-of-the-art neural PDE surrogates are evaluated in a common framework and extended for properties of the DIV1D data. We evaluate (1) the speed-accuracy trade-off; (2) recreating non-linear behavior; (3) data efficiency; and (4) parameter inter- and extrapolation. Once trained, neural PDE surrogates can faithfully approximate DIV1D's divertor plasma dynamics at sub real-time computation speeds: In the proposed configuration, 2ms of plasma dynamics can be computed in $\approx$0.63ms of wall-clock time, several orders of magnitude faster than DIV1D.
LGMay 23, 2023
Equivariant Neural Simulators for Stochastic Spatiotemporal DynamicsKoen Minartz, Yoeri Poels, Simon Koop et al.
Neural networks are emerging as a tool for scalable data-driven simulation of high-dimensional dynamical systems, especially in settings where numerical methods are infeasible or computationally expensive. Notably, it has been shown that incorporating domain symmetries in deterministic neural simulators can substantially improve their accuracy, sample efficiency, and parameter efficiency. However, to incorporate symmetries in probabilistic neural simulators that can simulate stochastic phenomena, we need a model that produces equivariant distributions over trajectories, rather than equivariant function approximations. In this paper, we propose Equivariant Probabilistic Neural Simulation (EPNS), a framework for autoregressive probabilistic modeling of equivariant distributions over system evolutions. We use EPNS to design models for a stochastic n-body system and stochastic cellular dynamics. Our results show that EPNS considerably outperforms existing neural network-based methods for probabilistic simulation. More specifically, we demonstrate that incorporating equivariance in EPNS improves simulation quality, data efficiency, rollout stability, and uncertainty quantification. We conclude that EPNS is a promising method for efficient and effective data-driven probabilistic simulation in a diverse range of domains.
CLMay 15, 2023
NLG Evaluation Metrics Beyond Correlation Analysis: An Empirical Metric Preference ChecklistIftitahu Ni'mah, Meng Fang, Vlado Menkovski et al.
In this study, we analyze automatic evaluation metrics for Natural Language Generation (NLG), specifically task-agnostic metrics and human-aligned metrics. Task-agnostic metrics, such as Perplexity, BLEU, BERTScore, are cost-effective and highly adaptable to diverse NLG tasks, yet they have a weak correlation with human. Human-aligned metrics (CTC, CtrlEval, UniEval) improves correlation level by incorporating desirable human-like qualities as training objective. However, their effectiveness at discerning system-level performance and quality of system outputs remain unclear. We present metric preference checklist as a framework to assess the effectiveness of automatic metrics in three NLG tasks: Text Summarization, Dialogue Response Generation, and Controlled Generation. Our proposed framework provides access: (i) for verifying whether automatic metrics are faithful to human preference, regardless of their correlation level to human; and (ii) for inspecting the strengths and limitations of NLG systems via pairwise evaluation. We show that automatic metrics provide a better guidance than human on discriminating system-level performance in Text Summarization and Controlled Generation tasks. We also show that multi-aspect human-aligned metric (UniEval) is not necessarily dominant over single-aspect human-aligned metrics (CTC, CtrlEval) and task-agnostic metrics (BLEU, BERTScore), particularly in Controlled Generation tasks.
CVDec 16, 2021
Semantic-Based Few-Shot Learning by Interactive Psychometric TestingLu Yin, Vlado Menkovski, Yulong Pei et al.
Few-shot classification tasks aim to classify images in query sets based on only a few labeled examples in support sets. Most studies usually assume that each image in a task has a single and unique class association. Under these assumptions, these algorithms may not be able to identify the proper class assignment when there is no exact matching between support and query classes. For example, given a few images of lions, bikes, and apples to classify a tiger. However, in a more general setting, we could consider the higher-level concept, the large carnivores, to match the tiger to the lion for semantic classification. Existing studies rarely considered this situation due to the incompatibility of label-based supervision with complex conception relationships. In this work, we advance the few-shot learning towards this more challenging scenario, the semantic-based few-shot learning, and propose a method to address the paradigm by capturing the inner semantic relationships using interactive psychometric learning. The experiment results on the CIFAR-100 dataset show the superiority of our method for the semantic-based few-shot learning compared to the baseline.
LGOct 1, 2021
Calibrated Adversarial TrainingTianjin Huang, Vlado Menkovski, Yulong Pei et al.
Adversarial training is an approach of increasing the robustness of models to adversarial attacks by including adversarial examples in the training set. One major challenge of producing adversarial examples is to contain sufficient perturbation in the example to flip the model's output while not making severe changes in the example's semantical content. Exuberant change in the semantical content could also change the true label of the example. Adding such examples to the training set results in adverse effects. In this paper, we present the Calibrated Adversarial Training, a method that reduces the adverse effects of semantic perturbations in adversarial training. The method produces pixel-level adaptations to the perturbations based on novel calibrated robust error. We provide theoretical analysis on the calibrated robust error and derive an upper bound for it. Our empirical results show a superior performance of the Calibrated Adversarial Training over a number of public datasets.
LGSep 13, 2021
Process Discovery Using Graph Neural NetworksDominique Sommers, Vlado Menkovski, Dirk Fahland
Automatically discovering a process model from an event log is the prime problem in process mining. This task is so far approached as an unsupervised learning problem through graph synthesis algorithms. Algorithmic design decisions and heuristics allow for efficiently finding models in a reduced search space. However, design decisions and heuristics are derived from assumptions about how a given behavioral description - an event log - translates into a process model and were not learned from actual models which introduce biases in the solutions. In this paper, we explore the problem of supervised learning of a process discovery technique D. We introduce a technique for training an ML-based model D using graph convolutional neural networks; D translates a given input event log into a sound Petri net. We show that training D on synthetically generated pairs of input logs and output models allows D to translate previously unseen synthetic and several real-life event logs into sound, arbitrarily structured models of comparable accuracy and simplicity as existing state of the art techniques for discovering imperative process models. We analyze the limitations of the proposed technique and outline alleys for future work.
CLAug 27, 2021
ProtoInfoMax: Prototypical Networks with Mutual Information Maximization for Out-of-Domain DetectionIftitahu Ni'mah, Meng Fang, Vlado Menkovski et al.
The ability to detect Out-of-Domain (OOD) inputs has been a critical requirement in many real-world NLP applications. For example, intent classification in dialogue systems. The reason is that the inclusion of unsupported OOD inputs may lead to catastrophic failure of systems. However, it remains an empirical question whether current methods can tackle such problems reliably in a realistic scenario where zero OOD training data is available. In this study, we propose ProtoInfoMax, a new architecture that extends Prototypical Networks to simultaneously process in-domain and OOD sentences via Mutual Information Maximization (InfoMax) objective. Experimental results show that our proposed method can substantially improve performance up to 20% for OOD detection in low resource settings of text classification. We also show that ProtoInfoMax is less prone to typical overconfidence errors of Neural Networks, leading to more reliable prediction results.
LGAug 20, 2021
VAE-CE: Visual Contrastive Explanation using Disentangled VAEsYoeri Poels, Vlado Menkovski
The goal of a classification model is to assign the correct labels to data. In most cases, this data is not fully described by the given set of labels. Often a rich set of meaningful concepts exist in the domain that can much more precisely describe each datapoint. Such concepts can also be highly useful for interpreting the model's classifications. In this paper we propose a model, denoted as Variational Autoencoder-based Contrastive Explanation (VAE-CE), that represents data with high-level concepts and uses this representation for both classification and generating explanations. The explanations are produced in a contrastive manner, conveying why a datapoint is assigned to one class rather than an alternative class. An explanation is specified as a set of transformations of the input datapoint, with each step depicting a concept changing towards the contrastive class. We build the model using a disentangled VAE, extended with a new supervised method for disentangling individual dimensions. An analysis on synthetic data and MNIST shows that the approaches to both disentanglement and explanation provide benefits over other methods.
CVJul 7, 2021
Hierarchical Semantic Segmentation using Psychometric LearningLu Yin, Vlado Menkovski, Shiwei Liu et al.
Assigning meaning to parts of image data is the goal of semantic image segmentation. Machine learning methods, specifically supervised learning is commonly used in a variety of tasks formulated as semantic segmentation. One of the major challenges in the supervised learning approaches is expressing and collecting the rich knowledge that experts have with respect to the meaning present in the image data. Towards this, typically a fixed set of labels is specified and experts are tasked with annotating the pixels, patches or segments in the images with the given labels. In general, however, the set of classes does not fully capture the rich semantic information present in the images. For example, in medical imaging such as histology images, the different parts of cells could be grouped and sub-grouped based on the expertise of the pathologist. To achieve such a precise semantic representation of the concepts in the image, we need access to the full depth of knowledge of the annotator. In this work, we develop a novel approach to collect segmentation annotations from experts based on psychometric testing. Our method consists of the psychometric testing procedure, active query selection, query enhancement, and a deep metric learning model to achieve a patch-level image embedding that allows for semantic segmentation of images. We show the merits of our method with evaluation on the synthetically generated image, aerial image and histology image.
LGJul 6, 2021
On Generalization of Graph Autoencoders with Adversarial TrainingTianjin Huang, Yulong Pei, Vlado Menkovski et al.
Adversarial training is an approach for increasing model's resilience against adversarial perturbations. Such approaches have been demonstrated to result in models with feature representations that generalize better. However, limited works have been done on adversarial training of models on graph data. In this paper, we raise such a question { does adversarial training improve the generalization of graph representations. We formulate L2 and L1 versions of adversarial training in two powerful node embedding methods: graph autoencoder (GAE) and variational graph autoencoder (VGAE). We conduct extensive experiments on three main applications, i.e. link prediction, node clustering, graph anomaly detection of GAE and VGAE, and demonstrate that both L2 and L1 adversarial training boost the generalization of GAE and VGAE.
LGApr 19, 2021
Direction-Aggregated Attack for Transferable Adversarial ExamplesTianjin Huang, Vlado Menkovski, Yulong Pei et al.
Deep neural networks are vulnerable to adversarial examples that are crafted by imposing imperceptible changes to the inputs. However, these adversarial examples are most successful in white-box settings where the model and its parameters are available. Finding adversarial examples that are transferable to other models or developed in a black-box setting is significantly more difficult. In this paper, we propose the Direction-Aggregated adversarial attacks that deliver transferable adversarial examples. Our method utilizes aggregated direction during the attack process for avoiding the generated adversarial examples overfitting to the white-box model. Extensive experiments on ImageNet show that our proposed method improves the transferability of adversarial examples significantly and outperforms state-of-the-art attacks, especially against adversarial robust models. The best averaged attack success rates of our proposed method reaches 94.6\% against three adversarial trained models and 94.8\% against five defense methods. It also reveals that current defense approaches do not prevent transferable adversarial attacks.
SIApr 16, 2021
Hop-Count Based Self-Supervised Anomaly Detection on Attributed NetworksTianjin Huang, Yulong Pei, Vlado Menkovski et al.
Recent years have witnessed an upsurge of interest in the problem of anomaly detection on attributed networks due to its importance in both research and practice. Although various approaches have been proposed to solve this problem, two major limitations exist: (1) unsupervised approaches usually work much less efficiently due to the lack of supervisory signal, and (2) existing anomaly detection methods only use local contextual information to detect anomalous nodes, e.g., one- or two-hop information, but ignore the global contextual information. Since anomalous nodes differ from normal nodes in structures and attributes, it is intuitive that the distance between anomalous nodes and their neighbors should be larger than that between normal nodes and their neighbors if we remove the edges connecting anomalous and normal nodes. Thus, hop counts based on both global and local contextual information can be served as the indicators of anomaly. Motivated by this intuition, we propose a hop-count based model (HCM) to detect anomalies by modeling both local and global contextual information. To make better use of hop counts for anomaly identification, we propose to use hop counts prediction as a self-supervised task. We design two anomaly scores based on the hop counts prediction via HCM model to identify anomalies. Besides, we employ Bayesian learning to train HCM model for capturing uncertainty in learned parameters and avoiding overfitting. Extensive experiments on real-world attributed networks demonstrate that our proposed model is effective in anomaly detection.
LGNov 26, 2020
A Metric for Linear Symmetry-Based DisentanglementLuis A. Pérez Rey, Loek Tonnaer, Vlado Menkovski et al.
The definition of Linear Symmetry-Based Disentanglement (LSBD) proposed by (Higgins et al., 2018) outlines the properties that should characterize a disentangled representation that captures the symmetries of data. However, it is not clear how to measure the degree to which a data representation fulfills these properties. We propose a metric for the evaluation of the level of LSBD that a data representation achieves. We provide a practical method to evaluate this metric and use it to evaluate the disentanglement of the data representations obtained for three datasets with underlying $SO(2)$ symmetries.
LGNov 11, 2020
Quantifying and Learning Linear Symmetry-Based DisentanglementLoek Tonnaer, Luis A. Pérez Rey, Vlado Menkovski et al.
The definition of Linear Symmetry-Based Disentanglement (LSBD) formalizes the notion of linearly disentangled representations, but there is currently no metric to quantify LSBD. Such a metric is crucial to evaluate LSBD methods and to compare to previous understandings of disentanglement. We propose $\mathcal{D}_\mathrm{LSBD}$, a mathematically sound metric to quantify LSBD, and provide a practical implementation for $\mathrm{SO}(2)$ groups. Furthermore, from this metric we derive LSBD-VAE, a semi-supervised method to learn LSBD representations. We demonstrate the utility of our metric by showing that (1) common VAE-based disentanglement methods don't learn LSBD representations, (2) LSBD-VAE as well as other recent methods can learn LSBD representations, needing only limited supervision on transformations, and (3) various desirable properties expressed by existing disentanglement metrics are also achieved by LSBD representations.
CRNov 7, 2020
Bridging the Performance Gap between FGSM and PGD Adversarial TrainingTianjin Huang, Vlado Menkovski, Yulong Pei et al.
Deep learning achieves state-of-the-art performance in many tasks but exposes to the underlying vulnerability against adversarial examples. Across existing defense techniques, adversarial training with the projected gradient decent attack (adv.PGD) is considered as one of the most effective ways to achieve moderate adversarial robustness. However, adv.PGD requires too much training time since the projected gradient attack (PGD) takes multiple iterations to generate perturbations. On the other hand, adversarial training with the fast gradient sign method (adv.FGSM) takes much less training time since the fast gradient sign method (FGSM) takes one step to generate perturbations but fails to increase adversarial robustness. In this work, we extend adv.FGSM to make it achieve the adversarial robustness of adv.PGD. We demonstrate that the large curvature along FGSM perturbed direction leads to a large difference in performance of adversarial robustness between adv.FGSM and adv.PGD, and therefore propose combining adv.FGSM with a curvature regularization (adv.FGSMR) in order to bridge the performance gap between adv.FGSM and adv.PGD. The experiments show that adv.FGSMR has higher training efficiency than adv.PGD. In addition, it achieves comparable performance of adversarial robustness on MNIST dataset under white-box attack, and it achieves better performance than adv.PGD under white-box attack and effectively defends the transferable adversarial attack on CIFAR-10 dataset.
NESep 22, 2020
Complex Vehicle Routing with Memory Augmented Neural NetworksMarijn van Knippenberg, Mike Holenderski, Vlado Menkovski
Complex real-life routing challenges can be modeled as variations of well-known combinatorial optimization problems. These routing problems have long been studied and are difficult to solve at scale. The particular setting may also make exact formulation difficult. Deep Learning offers an increasingly attractive alternative to traditional solutions, which mainly revolve around the use of various heuristics. Deep Learning may provide solutions which are less time-consuming and of higher quality at large scales, as it generally does not need to generate solutions in an iterative manner, and Deep Learning models have shown a surprising capacity for solving complex tasks in recent years. Here we consider a particular variation of the Capacitated Vehicle Routing (CVRP) problem and investigate the use of Deep Learning models with explicit memory components. Such memory components may help in gaining insight into the model's decisions as the memory and operations on it can be directly inspected at any time, and may assist in scaling the method to such a size that it becomes viable for industry settings.
LGJun 14, 2020
Explaining Predictions by Approximating the Local Decision BoundaryGeorgios Vlassopoulos, Tim van Erven, Henry Brighton et al.
Constructing accurate model-agnostic explanations for opaque machine learning models remains a challenging task. Classification models for high-dimensional data, like images, are often inherently complex. To reduce this complexity, individual predictions may be explained locally, either in terms of a simpler local surrogate model or by communicating how the predictions contrast with those of another class. However, existing approaches still fall short in the following ways: a) they measure locality using a (Euclidean) metric that is not meaningful for non-linear high-dimensional data; or b) they do not attempt to explain the decision boundary, which is the most relevant characteristic of classifiers that are optimized for classification accuracy; or c) they do not give the user any freedom in specifying attributes that are meaningful to them. We address these issues in a new procedure for local decision boundary approximation (DBA). To construct a meaningful metric, we train a variational autoencoder to learn a Euclidean latent space of encoded data representations. We impose interpretability by exploiting attribute annotations to map the latent space to attributes that are meaningful to the user. A difficulty in evaluating explainability approaches is the lack of a ground truth. We address this by introducing a new benchmark data set with artificially generated Iris images, and showing that we can recover the latent attributes that locally determine the class. We further evaluate our approach on tabular data and on the CelebA image data set.
LGApr 14, 2020
Knowledge Elicitation using Deep Metric Learning and Psychometric TestingLu Yin, Vlado Menkovski, Mykola Pechenizkiy
Knowledge present in a domain is well expressed as relationships between corresponding concepts. For example, in zoology, animal species form complex hierarchies; in genomics, the different (parts of) molecules are organized in groups and subgroups based on their functions; plants, molecules, and astronomical objects all form complex taxonomies. Nevertheless, when applying supervised machine learning (ML) in such domains, we commonly reduce the complex and rich knowledge to a fixed set of labels, and induce a model shows good generalization performance with respect to these labels. The main reason for such a reductionist approach is the difficulty in eliciting the domain knowledge from the experts. Developing a label structure with sufficient fidelity and providing comprehensive multi-label annotation can be exceedingly labor-intensive in many real-world applications. In this paper, we provide a method for efficient hierarchical knowledge elicitation (HKE) from experts working with high-dimensional data such as images or videos. Our method is based on psychometric testing and active deep metric learning. The developed models embed the high-dimensional data in a metric space where distances are semantically meaningful, and the data can be organized in a hierarchical structure. We provide empirical evidence with a series of experiments on a synthetically generated dataset of simple shapes, and Cifar 10 and Fashion-MNIST benchmarks that our method is indeed successful in uncovering hierarchical structures.
LGJan 15, 2020
Causal Discovery from Incomplete Data: A Deep Learning ApproachYuhao Wang, Vlado Menkovski, Hao Wang et al.
As systems are getting more autonomous with the development of artificial intelligence, it is important to discover the causal knowledge from observational sensory inputs. By encoding a series of cause-effect relations between events, causal networks can facilitate the prediction of effects from a given action and analyze their underlying data generation mechanism. However, missing data are ubiquitous in practical scenarios. Directly performing existing casual discovery algorithms on partially observed data may lead to the incorrect inference. To alleviate this issue, we proposed a deep learning framework, dubbed Imputated Causal Learning (ICL), to perform iterative missing data imputation and causal structure discovery. Through extensive simulations on both synthetic and real data, we show that ICL can outperform state-of-the-art methods under different missing data mechanisms.
SOC-PHJan 14, 2020
Pedestrian orientation dynamics from high-fidelity measurementsJoris Willems, Alessandro Corbetta, Vlado Menkovski et al.
We investigate in real-life conditions and with very high accuracy the dynamics of body rotation, or yawing, of walking pedestrians - an highly complex task due to the wide variety in shapes, postures and walking gestures. We propose a novel measurement method based on a deep neural architecture that we train on the basis of generic physical properties of the motion of pedestrians. Specifically, we leverage on the strong statistical correlation between individual velocity and body orientation: the velocity direction is typically orthogonal with respect to the shoulder line. We make the reasonable assumption that this approximation, although instantaneously slightly imperfect, is correct on average. This enables us to use velocity data as training labels for a highly-accurate point-estimator of individual orientation, that we can train with no dedicated annotation labor. We discuss the measurement accuracy and show the error scaling, both on synthetic and real-life data: we show that our method is capable of estimating orientation with an error as low as 7.5 degrees. This tool opens up new possibilities in the studies of human crowd dynamics where orientation is key. By analyzing the dynamics of body rotation in real-life conditions, we show that the instantaneous velocity direction can be described by the combination of orientation and a random delay, where randomness is provided by an Ornstein-Uhlenbeck process centered on an average delay of 100ms. Quantifying these dynamics could have only been possible thanks to a tool as precise as that proposed.
FLU-DYNNov 13, 2019
Deep learning velocity signals allows to quantify turbulence intensityAlessandro Corbetta, Vlado Menkovski, Roberto Benzi et al.
Turbulence, the ubiquitous and chaotic state of fluid motions, is characterized by strong and statistically non-trivial fluctuations of the velocity field, over a wide range of length- and time-scales, and it can be quantitatively described only in terms of statistical averages. Strong non-stationarities hinder the possibility to achieve statistical convergence, making it impossible to define the turbulence intensity and, in particular, its basic dimensionless estimator, the Reynolds number. Here we show that by employing Deep Neural Networks (DNN) we can accurately estimate the Reynolds number within $15\%$ accuracy, from a statistical sample as small as two large-scale eddy-turnover times. In contrast, physics-based statistical estimators are limited by the rate of convergence of the central limit theorem, and provide, for the same statistical sample, an error at least $100$ times larger. Our findings open up new perspectives in the possibility to quantitatively define and, therefore, study highly non-stationary turbulent flows as ordinarily found in nature as well as in industrial processes.
CLSep 17, 2019
BSDAR: Beam Search Decoding with Attention Reward in Neural Keyphrase GenerationIftitahu Ni'mah, Vlado Menkovski, Mykola Pechenizkiy
This study mainly investigates two common decoding problems in neural keyphrase generation: sequence length bias and beam diversity. To tackle the problems, we introduce a beam search decoding strategy based on word-level and ngram-level reward function to constrain and refine Seq2Seq inference at test time. Results show that our simple proposal can overcome the algorithm bias to shorter and nearly identical sequences, resulting in a significant improvement of the decoding performance on generating keyphrases that are present and absent in source text.
LGMay 23, 2019
Hierarchical Annotation of Images with Two-Alternative-Forced-Choice Metric LearningNiels Hellinga, Vlado Menkovski
Many tasks such as retrieval and recommendations can significantly benefit from structuring the data, commonly in a hierarchical way. To achieve this through annotations of high dimensional data such as images or natural text can be significantly labor intensive. We propose an approach for uncovering the hierarchical structure of data based on efficient discriminative testing rather than annotations of individual datapoints. Using two-alternative-forced-choice (2AFC) testing and deep metric learning we achieve embedding of the data in semantic space where we are able to successfully hierarchically cluster. We actively select triplets for the 2AFC test such that the modeling process is highly efficient with respect to the number of tests presented to the annotator. We empirically demonstrate the feasibility of the method by confirming the shape bias on synthetic data and extract hierarchical structure on the Fashion-MNIST dataset to a finer granularity than the original labels.
CVMar 26, 2019
Micro-expression detection in long videos using optical flow and recurrent neural networksMichiel Verburg, Vlado Menkovski
Facial micro-expressions are subtle and involuntary expressions that can reveal concealed emotions. Micro-expressions are an invaluable source of information in application domains such as lie detection, mental health, sentiment analysis and more. One of the biggest challenges in this field of research is the small amount of available spontaneous micro-expression data. However, spontaneous data collection is burdened by time-consuming and expensive annotation. Hence, methods are needed which can reduce the amount of data that annotators have to review. This paper presents a novel micro-expression spotting method using a recurrent neural network (RNN) on optical flow features. We extract Histogram of Oriented Optical Flow (HOOF) features to encode the temporal changes in selected face regions. Finally, the RNN spots short intervals which are likely to contain occurrences of relevant facial micro-movements. The proposed method is evaluated on the SAMM database. Any chance of subject bias is eliminated by training the RNN using Leave-One-Subject-Out cross-validation. Comparing the spotted intervals with the labeled data shows that the method produced 1569 false positives while obtaining a recall of 0.4654. The initial results show that the proposed method would reduce the video length by a factor of 3.5, while still retaining almost half of the relevant micro-movements. Lastly, as the model gets more data, it becomes better at detecting intervals, which makes the proposed method suitable for supporting the annotation process.