Dionissios T. Hristopulos

ML
h-index: 28
7 papers
77 citations
Novelty: 41%
AI Score: 28

7 Papers

ML · Feb 19, 2023
Non-separable Covariance Kernels for Spatiotemporal Gaussian Processes based on a Hybrid Spectral Method and the Harmonic Oscillator

Dionissios T. Hristopulos

Gaussian processes provide a flexible, non-parametric framework for the approximation of functions in high-dimensional spaces. The covariance kernel is the main engine of Gaussian processes, incorporating correlations that underpin the predictive distribution. For applications with spatiotemporal datasets, suitable kernels should model joint spatial and temporal dependence. Separable space-time covariance kernels offer simplicity and computational efficiency. However, non-separable kernels include space-time interactions that better capture observed correlations. Most non-separable kernels that admit explicit expressions are based on mathematical considerations (admissibility conditions) rather than first-principles derivations. We present a hybrid spectral approach for generating covariance kernels which is based on physical arguments. We use this approach to derive a new class of physically motivated, non-separable covariance kernels which have their roots in the stochastic, linear, damped, harmonic oscillator (LDHO). The new kernels incorporate functions with both monotonic and oscillatory decay of space-time correlations. The LDHO covariance kernels involve space-time interactions which are introduced by dispersion relations that modulate the oscillator coefficients. We derive explicit relations for the spatiotemporal covariance kernels in the three oscillator regimes (underdamping, critical damping, overdamping) and investigate their properties. We further illustrate the hybrid spectral method by deriving covariance kernels that are based on the Ornstein-Uhlenbeck model.
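
The distinction between separable and non-separable space-time kernels can be illustrated with a minimal sketch. The two kernels below are generic textbook choices picked for illustration (a product of exponentials, and an exponential of a combined space-time distance); they are not the LDHO kernels derived in the paper, and the function names and the length scales `a` and `b` are assumptions. A separable kernel satisfies C(h, τ) C(0, 0) = C(h, 0) C(0, τ), so the difference between the two sides measures the space-time interaction.

```python
import math

def separable_kernel(h, tau, a=1.0, b=1.0):
    # Product of a purely spatial and a purely temporal exponential kernel.
    return math.exp(-abs(h) / a) * math.exp(-abs(tau) / b)

def nonseparable_kernel(h, tau, a=1.0, b=1.0):
    # Exponential kernel of a combined (anisotropic) space-time distance.
    return math.exp(-math.sqrt((h / a) ** 2 + (tau / b) ** 2))

def separability_gap(kernel, h, tau):
    # Zero for a separable kernel, non-zero when space and time interact.
    return kernel(h, tau) * kernel(0.0, 0.0) - kernel(h, 0.0) * kernel(0.0, tau)

gap_sep = separability_gap(separable_kernel, 0.7, 1.3)
gap_nonsep = separability_gap(nonseparable_kernel, 0.7, 1.3)
print(gap_sep, gap_nonsep)
```

The gap vanishes (up to rounding) for the product kernel and is strictly positive for the combined-distance kernel, because the Euclidean space-time distance is shorter than the sum of the two separate lags.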

ML · Sep 28, 2023
A parsimonious, computationally efficient machine learning method for spatial regression

Milan Žukovič, Dionissios T. Hristopulos

We introduce the modified planar rotator method (MPRS), a physically inspired machine learning method for spatial/temporal regression. MPRS is a non-parametric model which incorporates spatial or temporal correlations via short-range, distance-dependent "interactions" without assuming a specific form for the underlying probability distribution. Predictions are obtained by means of a fully autonomous learning algorithm which employs equilibrium conditional Monte Carlo simulations. MPRS is able to handle scattered data and arbitrary spatial dimensions. We report tests on various synthetic and real-world data in one, two and three dimensions which demonstrate that the MPRS prediction performance (without parameter tuning) is competitive with standard interpolation methods such as ordinary kriging and inverse distance weighting. In particular, MPRS is an effective gap-filling method for rough and non-Gaussian data (e.g., daily precipitation time series). MPRS shows superior computational efficiency and scalability for large samples. Massive data sets involving millions of nodes can be processed in a few seconds on a standard personal computer.
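
For orientation, inverse distance weighting, one of the baseline interpolators named in the abstract, can be sketched in a few lines. This is the generic textbook form, not the MPRS algorithm; the function name `idw_predict` and the default `power=2.0` are assumptions made for the illustration.

```python
import math

def idw_predict(points, values, query, power=2.0):
    """Inverse distance weighting in any dimension.

    points: list of coordinate tuples, values: observed values,
    query: coordinate tuple where a prediction is wanted.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for point, value in zip(points, values):
        distance = math.dist(point, query)
        if distance == 0.0:
            # The query coincides with a data point: return the datum itself.
            return value
        weight = distance ** (-power)
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total

# Two observations at equal distance from the query contribute equally.
midpoint_value = idw_predict([(0.0, 0.0), (2.0, 0.0)], [1.0, 3.0], (1.0, 0.0))
print(midpoint_value)
```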

DATA-AN · Jan 10, 2024
Information Flow Rate for Cross-Correlated Stochastic Processes

Dionissios T. Hristopulos

Causal inference seeks to identify cause-and-effect interactions in coupled systems. A recently proposed method by Liang detects causal relations by quantifying the direction and magnitude of information flow between time series. The theoretical formulation of information flow for stochastic dynamical systems provides a general expression and a data-driven statistic for the rate of entropy transfer between different system units. To advance understanding of information flow rate in terms of intuitive concepts and physically meaningful parameters, we investigate statistical properties of the data-driven information flow rate between coupled stochastic processes. We derive relations between the expectation of the information flow rate statistic and properties of the auto- and cross-correlation functions. Thus, we elucidate the dependence of the information flow rate on the analytical properties and characteristic times of the correlation functions. Our analysis provides insight into the influence of the sampling step, the strength of cross-correlations, and the temporal delay of correlations on information flow rate. We support the theoretical results with numerical simulations of correlated Gaussian processes.
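
As a rough illustration of the data-driven statistic, the sketch below computes the bivariate information flow rate estimator in the form usually attributed to Liang (2014): T(2→1) = (C11 C12 C2,d1 − C12² C1,d1) / (C11² C22 − C11 C12²), where C_i,d1 is the sample covariance between series i and the forward difference of series 1. The formula is quoted from memory of that estimator rather than from this abstract, and the test system (a pair of coupled AR(1) processes in which series 2 drives series 1), the coefficients, and all function names are assumptions.

```python
import random

def covariance(x, y):
    # Unbiased sample covariance of two equally long sequences.
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)

def information_flow_rate(source, target, dt=1.0):
    """Estimate the information flow rate from `source` to `target`."""
    # Euler forward difference of the target series.
    d_target = [(target[i + 1] - target[i]) / dt for i in range(len(target) - 1)]
    s, t = source[:-1], target[:-1]
    c_tt = covariance(t, t)
    c_ss = covariance(s, s)
    c_ts = covariance(t, s)
    c_t_dt = covariance(t, d_target)
    c_s_dt = covariance(s, d_target)
    numerator = c_tt * c_ts * c_s_dt - c_ts ** 2 * c_t_dt
    denominator = c_tt ** 2 * c_ss - c_tt * c_ts ** 2
    return numerator / denominator

# Hypothetical test system: x2 evolves on its own and drives x1.
rng = random.Random(0)
x1, x2 = [0.0], [0.0]
for _ in range(20000):
    x1_next = 0.5 * x1[-1] + 0.4 * x2[-1] + rng.gauss(0.0, 1.0)
    x2_next = 0.7 * x2[-1] + rng.gauss(0.0, 1.0)
    x1.append(x1_next)
    x2.append(x2_next)

flow_2_to_1 = information_flow_rate(x2, x1)
flow_1_to_2 = information_flow_rate(x1, x2)
print(flow_2_to_1, flow_1_to_2)
```

For this one-way coupling the estimate from series 2 to series 1 should be clearly non-zero, while the reverse estimate should fluctuate around zero.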

ME · May 17, 2025
Stochastic Processes with Modified Lognormal Distribution Featuring Flexible Upper Tail

Dionissios T. Hristopulos, Anastassia Baxevani, Giorgio Kaniadakis

Asymmetric, non-Gaussian probability distributions are often observed in the analysis of natural and engineering datasets. The lognormal distribution is a standard model for data with skewed frequency histograms and fat tails. However, the lognormal law severely restricts the asymptotic dependence of the probability density and the hazard function for high values. Herein we present a family of three-parameter non-Gaussian probability density functions that are based on generalized kappa-exponential and kappa-logarithm functions and investigate its mathematical properties. These kappa-lognormal densities represent continuous deformations of the lognormal with lighter right tails, controlled by the parameter kappa. In addition, bimodal distributions are obtained for certain parameter combinations. We derive closed-form analytic expressions for the main statistical functions of the kappa-lognormal distribution. For the moments, we derive bounds that are based on hypergeometric functions as well as series expansions. Explicit expressions for the gradient and Hessian of the negative log-likelihood are obtained to facilitate numerical maximum-likelihood estimates of the kappa-lognormal parameters from data. We also formulate a joint probability density function for kappa-lognormal stochastic processes by applying Jacobi's multivariate theorem to a latent Gaussian process. Estimation of the kappa-lognormal distribution based on synthetic and real data is explored. Furthermore, we investigate applications of kappa-lognormal processes with different covariance kernels in time series forecasting and spatial interpolation using warped Gaussian process regression. Our results are of practical interest for modeling skewed distributions in various scientific and engineering fields.
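
The kappa-exponential and kappa-logarithm mentioned above are commonly written (following Kaniadakis) as exp_κ(x) = (√(1 + κ²x²) + κx)^(1/κ) and ln_κ(x) = (x^κ − x^(−κ)) / (2κ). The sketch below implements these standard definitions and checks that they are mutual inverses and reduce to the ordinary exponential and logarithm as κ → 0. It does not reproduce the kappa-lognormal density itself, and the function names are assumptions.

```python
import math

def kappa_exp(x, kappa):
    # Deformed exponential; the ordinary exponential is recovered at kappa = 0.
    if kappa == 0.0:
        return math.exp(x)
    return (math.sqrt(1.0 + (kappa * x) ** 2) + kappa * x) ** (1.0 / kappa)

def kappa_log(x, kappa):
    # Deformed logarithm (x > 0); the natural logarithm is recovered at kappa = 0.
    if kappa == 0.0:
        return math.log(x)
    return (x ** kappa - x ** (-kappa)) / (2.0 * kappa)

round_trip = kappa_log(kappa_exp(1.5, 0.3), 0.3)
print(round_trip)
# For large x the deformed exponential grows like a power law,
# so it stays below the ordinary exponential.
print(kappa_exp(5.0, 0.5), math.exp(5.0))
```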

AP · Sep 21, 2021
Non-parametric Kernel-Based Estimation of Probability Distributions for Precipitation Modeling

Andrew Pavlides, Vasiliki Agou, Dionissios T. Hristopulos

The probability distribution of precipitation amount strongly depends on geography, climate zone, and the time scale considered. Closed-form parametric probability distributions are not sufficiently flexible to provide accurate and universal models for precipitation amount over different time scales. In this paper we derive non-parametric estimates of the cumulative distribution function (CDF) of precipitation amount for wet periods. The CDF estimates are obtained by integrating the kernel density estimator, leading to semi-explicit CDF expressions for different kernel functions. We investigate kernel-based CDF estimation with an adaptive plug-in bandwidth (KCDE), using both synthetic data sets and reanalysis precipitation data from the Mediterranean island of Crete (Greece). We show that KCDE provides better estimates of the probability distribution than the standard empirical (staircase) estimate and kernel-based estimates that use the normal reference bandwidth. We also demonstrate that KCDE enables the simulation of non-parametric precipitation amount distributions by means of the inverse transform sampling method.
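
A minimal sketch of the two ingredients described above, assuming a Gaussian kernel: integrating the kernel density estimator gives a CDF estimate equal to the average of normal CDFs centred on the data, and inverse transform sampling inverts that estimate numerically. The bandwidth here is a fixed, user-supplied value rather than the adaptive plug-in bandwidth studied in the paper, the sample values are made up, and a Gaussian kernel places some mass below zero, which a real precipitation model would need to handle.

```python
import math
import random

def kernel_cdf(x, sample, bandwidth):
    # Integrated Gaussian KDE: mean of normal CDFs centred on each datum.
    scale = bandwidth * math.sqrt(2.0)
    return sum(0.5 * (1.0 + math.erf((x - xi) / scale)) for xi in sample) / len(sample)

def kernel_quantile(u, sample, bandwidth, tol=1e-10):
    # Invert the monotone CDF estimate by bisection.
    lo = min(sample) - 10.0 * bandwidth
    hi = max(sample) + 10.0 * bandwidth
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if kernel_cdf(mid, sample, bandwidth) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical wet-period amounts and an arbitrary fixed bandwidth.
sample = [0.4, 1.2, 2.5, 3.1, 7.8]
bandwidth = 0.8
median_estimate = kernel_quantile(0.5, sample, bandwidth)

# Inverse transform sampling: push uniform variates through the quantile function.
rng = random.Random(1)
simulated = [kernel_quantile(rng.random(), sample, bandwidth) for _ in range(5)]
print(median_estimate, simulated)
```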

ST · Jan 7, 2020
Stochastic Local Interaction Model: Geostatistics without Kriging

Dionissios T. Hristopulos, Andreas Pavlides, Vasiliki D. Agou et al.

Classical geostatistical methods face serious computational challenges when confronted with large sets of spatially distributed data. We present a simplified stochastic local interaction (SLI) model for computationally efficient spatial prediction that can handle large data sets. The SLI method constructs a spatial interaction matrix (precision matrix) that accounts for the data values, their locations, and the sampling density variations without user input. We show that this precision matrix is strictly positive definite. The SLI approach does not require matrix inversion for parameter estimation, spatial prediction, and uncertainty estimation, leading to computational procedures that are significantly less intensive than kriging. The precision matrix involves compact kernel functions (spherical, quadratic, etc.) which enable the application of sparse matrix methods, thus improving computational time and memory requirements. We investigate the proposed SLI method with a data set that includes approximately 11,500 drill-hole measurements of coal thickness from Campbell County (Wyoming, USA). We also compare SLI with ordinary kriging (OK) in terms of estimation performance, using cross-validation analysis, and computational time. According to the cross-validation measures used, SLI performs slightly better than OK in estimating seam thickness. In terms of computation time, SLI prediction is 3 to 25 times faster than OK for the same grid size, depending on the size of the kriging neighborhood.

LG · Jan 16, 2015
Stochastic Local Interaction (SLI) Model: Interfacing Machine Learning and Geostatistics

Dionissios T. Hristopulos

Machine learning and geostatistics are powerful mathematical frameworks for modeling spatial data. Both approaches, however, suffer from poor scaling of the required computational resources for large data applications. We present the Stochastic Local Interaction (SLI) model, which employs a local representation to improve computational efficiency. SLI combines geostatistics and machine learning with ideas from statistical physics and computational geometry. It is based on a joint probability density function defined by an energy functional which involves local interactions implemented by means of kernel functions with adaptive local kernel bandwidths. SLI is expressed in terms of an explicit, typically sparse, precision (inverse covariance) matrix. This representation leads to a semi-analytical expression for interpolation (prediction), which is valid in any number of dimensions and avoids the computationally costly covariance matrix inversion.
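
Why a precision-matrix representation avoids covariance inversion can be seen from a generic Gaussian identity: if J is the precision matrix and μ the mean, the conditional mean at a single site p given all other sites is μ − (1/J_pp) Σ_{i≠p} J_pi (x_i − μ), and the conditional variance is 1/J_pp. The sketch below applies this identity to one sparse row of J. It illustrates the general principle only; the SLI model builds its precision matrix from kernel functions with adaptive bandwidths, which is not reproduced here, and the function name and the toy chain example are assumptions.

```python
def predict_from_precision(precision_row, known_values, target, mean=0.0):
    """Conditional mean and variance at `target` from one sparse precision row.

    precision_row: dict {site index: J[target][index]} holding non-zero entries.
    known_values: dict {site index: observed value} for the coupled sites.
    """
    j_pp = precision_row[target]
    coupling = sum(
        j_pi * (known_values[i] - mean)
        for i, j_pi in precision_row.items()
        if i != target
    )
    # Only a division by the diagonal entry is needed, no matrix inversion.
    return mean - coupling / j_pp, 1.0 / j_pp

# Toy example: middle site of a three-site chain with nearest-neighbour coupling.
row_for_site_1 = {0: -1.0, 1: 2.0, 2: -1.0}
prediction, variance = predict_from_precision(row_for_site_1, {0: 1.0, 2: 3.0}, target=1)
print(prediction, variance)
```

In this toy chain the prediction is the average of the two neighbouring values, and the cost of a prediction depends only on the number of non-zero entries in the row, which is the property that sparse precision matrices exploit.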