Nino Antulov-Fantulin

LG
16 papers
551 citations
Novelty 42%
AI Score 44

16 Papers

STAT-MECH Jan 12, 2023
Stretched and measured neural predictions of complex network dynamics

Vaiva Vasiliauskaite, Nino Antulov-Fantulin

Differential equations are a ubiquitous tool to study dynamics, ranging from physical systems to complex systems, where a large number of agents interact through a graph with non-trivial topological features. Data-driven approximations of differential equations present a promising alternative to traditional methods for uncovering a model of dynamical systems, especially in complex systems that lack explicit first principles. A recently employed machine learning tool for studying dynamics is neural networks, which can be used for data-driven solution finding or discovery of differential equations. Specifically for the latter task, however, deploying deep learning models in unfamiliar settings - such as predicting dynamics in unobserved state space regions or on novel graphs - can lead to spurious results. Focusing on complex systems whose dynamics are described with a system of first-order differential equations coupled through a graph, we show that extending the model's generalizability beyond traditional statistical learning theory limits is feasible. However, achieving this advanced level of generalization requires neural network models to conform to fundamental assumptions about the dynamical model. Additionally, we propose a statistical significance test to assess prediction quality during inference, enabling the identification of a neural network's confidence level in its predictions.

24.7 LG May 8
Does Your Neural Network Extrapolate? Feature Engineering as Identifiability Bias for OOD Generalization

Leonel Aguilar, Jan Nagler, Christoph Hoelscher et al.

Successful deep neural networks discover salient features of data. We show when and why they fail to learn out-of-distribution (OOD)-relevant representations from an in-distribution (ID) training window. This requires decoupling feature learning from data-generating-process (DGP) identifiability. From a single training window, OOD extrapolation is non-identifiable: infinitely many DGPs are $\varepsilon$-observationally equivalent on the training data but diverge arbitrarily outside it, and no in-distribution criterion alone reliably breaks the tie. A structural commitment, the feature map, label map, and model class $(φ, ψ, \mathcal{M})$, dictates the assumed DGP and governs OOD generalization while leaving ID performance essentially unchanged. When architecture, pretraining, augmentation, input formats, or domain knowledge implicitly inject the missing commitment, the model succeeds. When it cannot infer OOD-relevant structure from ID evidence, it fails. Changing only the representation can make the same architecture, at the same in-distribution loss, differ by ${\sim}520\times$ out of distribution. When the commitment is correct and identifiable, OOD error vanishes. For example, Fourier coordinates turn periodic extrapolation into interpolation on $\mathbb{S}^1$. The same mechanism predicts outcomes in three natural-science settings (mass-action chemistry; Kepler's-third-law exoplanet prediction, $n=2{,}362$; and cross-species coding-DNA detection) and in a 264-run positional-encoding study across Transformer, Mamba, and S4D. Finally, a controlled study shows: correct features are necessary but not sufficient. The model class must express the target, and the transformed training data must cover the relevant representation space.
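The Fourier-coordinate remark above can be illustrated with a toy regression. This is a minimal sketch, not the paper's experimental setup: the target function, the training and test windows, and the helper names (`raw_features`, `fourier_features`, `ood_rmse`) are all assumptions chosen for illustration. The same linear model class is fitted twice, once on the raw coordinate and once on the coordinates $(\sin t, \cos t)$ of the circle.

```python
import numpy as np


def fit_linear(features, targets):
    """Least-squares fit with a bias column; returns the weight vector."""
    design = np.column_stack([features, np.ones(len(features))])
    weights, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return weights


def predict(features, weights):
    """Apply the fitted linear model (bias column appended again)."""
    design = np.column_stack([features, np.ones(len(features))])
    return design @ weights


def raw_features(t):
    """Feature map that keeps the raw coordinate t."""
    return t.reshape(-1, 1)


def fourier_features(t):
    """Feature map onto the unit circle: t -> (sin t, cos t)."""
    return np.column_stack([np.sin(t), np.cos(t)])


def target(t):
    """Hypothetical periodic data-generating process (an assumption)."""
    return np.sin(t + 0.5)


# In-distribution window: one period. Out-of-distribution window: five periods later.
t_train = np.linspace(0.0, 2.0 * np.pi, 200)
t_test = np.linspace(10.0 * np.pi, 12.0 * np.pi, 200)


def ood_rmse(feature_map):
    """Fit on the training window, report RMSE on the shifted window."""
    weights = fit_linear(feature_map(t_train), target(t_train))
    residual = predict(feature_map(t_test), weights) - target(t_test)
    return float(np.sqrt(np.mean(residual ** 2)))


raw_error = ood_rmse(raw_features)
fourier_error = ood_rmse(fourier_features)
print(f"raw coordinate OOD RMSE:     {raw_error:.3e}")
print(f"Fourier coordinate OOD RMSE: {fourier_error:.3e}")
```

In Fourier coordinates the test window maps onto the same circle the training data already covers, so the shifted window is no longer outside the training support for this toy target.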

LG Dec 5, 2025
Towards agent-based-model informed neural networks

Nino Antulov-Fantulin

In this article, we present a framework for designing neural networks that remain consistent with the underlying principles of agent-based models. We begin by highlighting the limitations of standard neural differential equations in modeling complex systems, where physical invariants (like energy) are often absent but other constraints (like mass conservation, information locality, bounded rationality) must be enforced. To address this, we introduce Agent-Based-Model informed Neural Networks (ABM-NNs), which leverage restricted graph neural networks and hierarchical decomposition to learn interpretable, structure-preserving dynamics. We validate the framework across three case studies of increasing complexity: (i) a generalized Lotka-Volterra system, where we recover ground-truth parameters from short trajectories in the presence of interventions; (ii) a graph-based SIR contagion model, where our method outperforms state-of-the-art graph learning baselines (GCN, GraphSAGE, Graph Transformer) in out-of-sample forecasting and noise robustness; and (iii) a real-world macroeconomic model of the ten largest economies, where we learn coupled GDP dynamics from empirical data and demonstrate counterfactual analysis for policy interventions.
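For readers unfamiliar with the first case study, the standard generalized Lotka-Volterra form is $\dot{x}_i = x_i \left(r_i + \sum_j A_{ij} x_j\right)$. The sketch below simulates that textbook system only; it is not the ABM-NN method, and the two-species parameters, step size, and function names (`glv_rhs`, `simulate`) are assumptions made for illustration.

```python
import numpy as np


def glv_rhs(x, growth, interaction):
    """Right-hand side dx_i/dt = x_i * (growth_i + sum_j interaction_ij * x_j)."""
    return x * (growth + interaction @ x)


def simulate(x0, growth, interaction, dt=0.01, steps=1000):
    """Integrate the system with the classical fourth-order Runge-Kutta scheme."""
    x = np.asarray(x0, dtype=float)
    trajectory = [x.copy()]
    for _ in range(steps):
        k1 = glv_rhs(x, growth, interaction)
        k2 = glv_rhs(x + 0.5 * dt * k1, growth, interaction)
        k3 = glv_rhs(x + 0.5 * dt * k2, growth, interaction)
        k4 = glv_rhs(x + dt * k3, growth, interaction)
        x = x + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        trajectory.append(x.copy())
    return np.array(trajectory)


# Hypothetical predator-prey parameters (assumed, not taken from the paper).
growth = np.array([1.0, -1.0])
interaction = np.array([[0.0, -1.0],
                        [1.0, 0.0]])

trajectory = simulate([1.5, 0.8], growth, interaction)


def invariant(x):
    """Quantity conserved by this particular two-species system."""
    return float(np.sum(x - np.log(x)))


print("start:", trajectory[0], "end:", trajectory[-1])
print("invariant drift:", abs(invariant(trajectory[-1]) - invariant(trajectory[0])))
```

Because the per-capita growth rate $\dot{x}_i / x_i$ is linear in $r$ and $A$, short trajectories of such a system can in principle constrain the parameters, which is the setting the abstract describes.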

ST Oct 27, 2021
Ask "Who", Not "What": Bitcoin Volatility Forecasting with Twitter Data

M. Eren Akbiyik, Mert Erkul, Killian Kaempf et al.

Understanding the variations in trading price (volatility), and its response to exogenous information, is a well-researched topic in finance. In this study, we focus on finding stable and accurate volatility predictors for a relatively new asset class of cryptocurrencies, in particular Bitcoin, using deep learning representations of public social media data obtained from Twitter. For our experiments, we extracted semantic information and user statistics from over 30 million Bitcoin-related tweets, in conjunction with 15-minute frequency price data over a horizon of 144 days. Using this data, we built several deep learning architectures that utilized different combinations of the gathered information. For each model, we conducted ablation studies to assess the influence of different components and feature sets on prediction accuracy. We found statistical evidence for the hypotheses that: (i) temporal convolutional networks perform significantly better than both classical autoregressive models and other deep learning-based architectures in the literature, and (ii) tweet author meta-information, even detached from the tweet itself, is a better predictor of volatility than the semantic content and tweet volume statistics. We demonstrate how different information sets gathered from social media can be utilized in different architectures and how they affect the prediction results. As an additional contribution, we make our dataset public for future research.

LG Mar 11, 2021
Implicit energy regularization of neural ordinary-differential-equation control

Lucas Böttcher, Nino Antulov-Fantulin, Thomas Asikis

Although optimal control problems of dynamical systems can be formulated within the framework of variational calculus, their solution for complex systems is often analytically and computationally intractable. In this Letter we present a versatile neural ordinary-differential-equation control (NODEC) framework with implicit energy regularization and use it to obtain neural-network-generated control signals that can steer dynamical systems towards a desired target state within a predefined amount of time. We demonstrate the ability of NODEC to learn control signals that closely resemble those found by corresponding optimal control frameworks in terms of control energy and deviation from the desired target state. Our results suggest that NODEC is capable of solving a wide range of control and optimization problems, including those that are analytically intractable.

ST Oct 22, 2020
On the impact of publicly available news and information transfer to financial markets

Metod Jazbec, Barna Pásztor, Felix Faltings et al.

We quantify the propagation and absorption of large-scale publicly available news articles from the World Wide Web to financial markets. To extract publicly available information, we use the news archives from the Common Crawl, a nonprofit organization that crawls a large part of the web. We develop a processing pipeline to identify news articles associated with the constituent companies in the S&P 500 index, an equity market index that measures the stock performance of U.S. companies. Using machine learning techniques, we extract sentiment scores from the Common Crawl News data and employ tools from information theory to quantify the information transfer from public news articles to the U.S. stock market. Furthermore, we analyze and quantify the economic significance of the news-based information with a simple sentiment-based portfolio trading strategy. Our findings support the claim that information in publicly available news on the World Wide Web has a statistically and economically significant impact on events in financial markets.

LG Jun 17, 2020
Neural Ordinary Differential Equation Control of Dynamics on Graphs

Thomas Asikis, Lucas Böttcher, Nino Antulov-Fantulin

We study the ability of neural networks to calculate feedback control signals that steer trajectories of continuous time non-linear dynamical systems on graphs, which we represent with neural ordinary differential equations (neural ODEs). To do so, we present a neural-ODE control (NODEC) framework and find that it can learn feedback control signals that drive graph dynamical systems into desired target states. While we use loss functions that do not constrain the control energy, our results show, in accordance with related work, that NODEC produces low energy control signals. Finally, we evaluate the performance and versatility of NODEC against well-known feedback controllers and deep reinforcement learning. We use NODEC to generate feedback controls for systems of more than one thousand coupled, non-linear ODEs that represent epidemic processes and coupled oscillators.

LG May 28, 2019
Exploring Interpretable LSTM Neural Networks over Multi-Variable Data

Tian Guo, Tao Lin, Nino Antulov-Fantulin

For recurrent neural networks trained on time series with target and exogenous variables, in addition to accurate prediction, it is also desirable to provide interpretable insights into the data. In this paper, we explore the structure of LSTM recurrent neural networks to learn variable-wise hidden states, with the aim of capturing different dynamics in multi-variable time series and distinguishing the contribution of variables to the prediction. With these variable-wise hidden states, a mixture attention mechanism is proposed to model the generative process of the target. Then we develop associated training methods to jointly learn network parameters and variable and temporal importance with respect to the prediction of the target variable. Extensive experiments on real datasets demonstrate enhanced prediction performance by capturing the dynamics of different variables. Meanwhile, we evaluate the interpretation results both qualitatively and quantitatively. The approach shows promise as an end-to-end framework for both forecasting and knowledge extraction over multi-variable data.

LG May 24, 2019
Low-dimensional statistical manifold embedding of directed graphs

Thorben Funke, Tian Guo, Alen Lancic et al.

We propose a novel node embedding of directed graphs to statistical manifolds, which is based on a global minimization of pairwise relative entropy and graph geodesics in a non-linear way. Each node is encoded with a probability density function over a measurable space. Furthermore, we analyze the connection between the geometrical properties of such an embedding and its efficient learning procedure. Extensive experiments show that our proposed embedding is better at preserving the global geodesic information of graphs, and that it outperforms existing embedding models on directed graphs in a variety of evaluation metrics, in an unsupervised setting.

SI Mar 27, 2019
Sensing Social Media Signals for Cryptocurrency News

Johannes Beck, Roberta Huang, David Lindner et al.

The ability to track and monitor relevant and important news in real-time is of crucial interest in multiple industrial sectors. In this work, we focus on the set of cryptocurrency news, which has recently become of interest to general and financial audiences. In order to track relevant news in real-time, we (i) match news from the web with tweets from social media, (ii) track their intraday tweet activity and (iii) explore different machine learning models for predicting the number of article mentions on Twitter within the first 24 hours after publication. We compare several machine learning models, such as linear extrapolation, linear and random forest autoregressive models, and a sequence-to-sequence neural network. We find that the random forest autoregressive model behaves comparably to more complex models in the majority of tasks.

ST Sep 19, 2018
Inferring short-term volatility indicators from Bitcoin blockchain

Nino Antulov-Fantulin, Dijana Tolic, Matija Piskorec et al.

In this paper, we study the possibility of inferring early warning indicators (EWIs) for periods of extreme Bitcoin price volatility using features obtained from Bitcoin daily transaction graphs. We infer the low-dimensional representations of transaction graphs in the time period from 2012 to 2017 using the Bitcoin blockchain, and demonstrate how these representations can be used to predict extreme price volatility events. Our EWI, which is obtained with a non-negative decomposition, contains more predictive information than those obtained with singular value decomposition or the scalar value of the total Bitcoin transaction volume.
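As a rough illustration of what a non-negative decomposition computes, the sketch below factorizes a non-negative matrix with the well-known Lee-Seung multiplicative updates for the Frobenius objective. The abstract does not say which algorithm the paper uses, so this choice is an assumption, and the matrix here is synthetic random data standing in for a hypothetical days-by-features table, not blockchain data.

```python
import numpy as np


def nmf(matrix, rank, iterations=200, eps=1e-9, seed=0):
    """Approximate matrix ~ basis @ coeffs with non-negative factors.

    Uses Lee-Seung multiplicative updates; eps guards against division by zero.
    """
    rng = np.random.default_rng(seed)
    n_rows, n_cols = matrix.shape
    basis = rng.random((n_rows, rank)) + eps
    coeffs = rng.random((rank, n_cols)) + eps
    for _ in range(iterations):
        # Update the coefficients with the basis held fixed, then the reverse.
        coeffs *= (basis.T @ matrix) / (basis.T @ basis @ coeffs + eps)
        basis *= (matrix @ coeffs.T) / (basis @ coeffs @ coeffs.T + eps)
    return basis, coeffs


def relative_error(matrix, basis, coeffs):
    """Frobenius reconstruction error relative to the norm of the input."""
    return float(np.linalg.norm(matrix - basis @ coeffs) / np.linalg.norm(matrix))


# Synthetic stand-in: 30 hypothetical days, 12 hypothetical graph features.
rng = np.random.default_rng(1)
daily_features = rng.random((30, 3)) @ rng.random((3, 12))

basis, coeffs = nmf(daily_features, rank=3)
print("relative reconstruction error:", relative_error(daily_features, basis, coeffs))
```

Each row of `basis` is then a low-dimensional, non-negative description of one day, which is the kind of representation that could feed a downstream volatility predictor.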

ML Feb 12, 2018
Bitcoin Volatility Forecasting with a Glimpse into Buy and Sell Orders

Tian Guo, Albert Bifet, Nino Antulov-Fantulin

In this paper, we study the short-term prediction of fluctuations in the Bitcoin exchange price against the United States dollar. We use realized volatility data, together with order information, collected in 2016 and 2017 from one of the largest Bitcoin digital trading offices. Experiments are performed to evaluate a variety of statistical and machine learning approaches.

LG Oct 16, 2017
Is Simple Better? Revisiting Non-linear Matrix Factorization for Learning Incomplete Ratings

Vaibhav Krishna, Tian Guo, Nino Antulov-Fantulin

Matrix factorization techniques have been widely used as a method for collaborative filtering for recommender systems. In recent times, different variants of deep learning algorithms have been explored in this setting to improve the task of making a personalized recommendation with user-item interaction data. The idea that the mapping between the latent user or item factors and the original features is highly nonlinear suggests that classical matrix factorization techniques are no longer sufficient. In this paper, we propose a multilayer nonlinear semi-nonnegative matrix factorization method, with the motivation that user-item interactions can be modeled more accurately using a linear combination of non-linear item features. Firstly, we learn latent factors for representations of users and items from the designed multilayer nonlinear Semi-NMF approach using explicit ratings. Secondly, the architecture built is compared with deep-learning algorithms like Restricted Boltzmann Machine and state-of-the-art Deep Matrix factorization techniques. By using both a supervised rating prediction task and unsupervised clustering in latent item space, we demonstrate that our proposed approach achieves better generalization ability in prediction as well as comparable representation ability to deep matrix factorization in the clustering task.

SI Oct 10, 2017
Underestimated cost of targeted attacks on complex networks

Xiao-Long Ren, Niels Gleinig, Dijana Tolic et al.

The robustness of complex networks under targeted attacks is deeply connected to the resilience of complex systems, i.e., the ability to make appropriate responses to the attacks. In this article, we investigate the state-of-the-art targeted node attack algorithms and demonstrate that they become very inefficient when the cost of the attack is taken into consideration. We make the explicit assumption that the cost of removing a node is proportional to the number of adjacent links that are removed, i.e., higher degree nodes have higher cost. Finally, for the case when it is possible to attack links, we propose a simple and efficient edge removal strategy named Hierarchical Power Iterative Normalized cut (HPI-Ncut). The results on real and artificial networks show that the HPI-Ncut algorithm outperforms all the node removal and link removal attack algorithms when the cost of the attack is taken into consideration. In addition, we show that on sparse networks, the complexity of this hierarchical power iteration edge removal algorithm is only $O(n\log^{2+\varepsilon}(n))$.
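The cost assumption stated above (removing a node costs as many units as links disappear with it) can be sketched in a few lines. This illustrates the cost accounting only, not HPI-Ncut; the dict-of-sets graph representation, the function name `removal_cost`, and the star-graph example are assumptions made for illustration.

```python
def removal_cost(adjacency, nodes_to_remove):
    """Count the links deleted when the given nodes are removed in order.

    adjacency maps each node to the set of its neighbours (undirected graph).
    A link is charged once, to whichever endpoint is removed first.
    """
    remaining = {node: set(neighbours) for node, neighbours in adjacency.items()}
    removed_links = 0
    for node in nodes_to_remove:
        neighbours = remaining.pop(node, set())
        removed_links += len(neighbours)
        for neighbour in neighbours:
            remaining[neighbour].discard(node)
    return removed_links


# Hypothetical star graph: node 0 is the hub, nodes 1 to 4 are leaves.
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}

hub_cost = removal_cost(star, [0])
leaf_cost = removal_cost(star, [1])
print("cost of removing the hub:", hub_cost)
print("cost of removing one leaf:", leaf_cost)
```

Under this accounting, a strategy that always targets the highest-degree nodes pays the most per removal, which is the effect the abstract points to when it says node attacks become inefficient once cost is considered.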

ML Sep 29, 2017
A Nonlinear Orthogonal Non-Negative Matrix Factorization Approach to Subspace Clustering

Dijana Tolic, Nino Antulov-Fantulin, Ivica Kopriva

A recent theoretical analysis shows the equivalence between non-negative matrix factorization (NMF) and the spectral-clustering-based approach to subspace clustering. As NMF and many of its variants are essentially linear, we introduce a nonlinear NMF with explicit orthogonality and derive general kernel-based orthogonal multiplicative update rules to solve the subspace clustering problem. In the nonlinear orthogonal NMF framework, we propose two subspace clustering algorithms, named kernel-based non-negative subspace clustering KNSC-Ncut and KNSC-Rcut, and establish their connection with spectral normalized cut and ratio cut clustering. We further extend the nonlinear orthogonal NMF framework and introduce a graph regularization to obtain a factorization that respects a local geometric structure of the data after the nonlinear mapping. The proposed NMF-based approach to subspace clustering takes into account the nonlinear nature of the manifold, as well as its intrinsic local geometry, which considerably improves the clustering performance compared to several recently proposed state-of-the-art methods.

IR Jan 30, 2012
Synthetic sequence generator for recommender systems - memory biased random walk on sequence multilayer network

Nino Antulov-Fantulin, Matko Bosnjak, Vinko Zlatic et al.

Personalized recommender systems rely on each user's personal usage data in the system, in order to assist in decision making. However, privacy policies protecting users' rights prevent these highly personal data from being publicly available to a wider research audience. In this work, we propose a memory biased random walk model on a multilayer sequence network, as a generator of synthetic sequential data for recommender systems. We demonstrate the applicability of the synthetic data in training recommender system models for cases when privacy policies restrict clickstream publishing.