Václav Šmídl

LG
h-index33
16papers
79citations
Novelty46%
AI Score41

16 Papers

LGAug 14, 2024
Sum-Product-Set Networks: Deep Tractable Models for Tree-Structured Graphs

Milan Papež, Martin Rektoris, Tomáš Pevný et al.

Daily internet communication relies heavily on tree-structured graphs, embodied by popular data formats such as XML and JSON. However, many recent generative (probabilistic) models utilize neural networks to learn a probability distribution over undirected cyclic graphs. This assumption of a generic graph structure brings various computational challenges, and, more importantly, the presence of non-linearities in neural networks does not permit tractable probabilistic inference. We address these problems by proposing sum-product-set networks, an extension of probabilistic circuits from unstructured tensor data to tree-structured graph data. To this end, we use random finite sets to reflect a variable number of nodes and edges in the graph and to allow for exact and efficient inference. We demonstrate that our tractable model performs comparably to various intractable models based on neural networks.

LGAug 18, 2024
GraphSPNs: Sum-Product Networks Benefit From Canonical Orderings

Milan Papež, Martin Rektoris, Václav Šmídl et al.

Deep generative models have recently made a remarkable progress in capturing complex probability distributions over graphs. However, they are intractable and thus unable to answer even the most basic probabilistic inference queries without resorting to approximations. Therefore, we propose graph sum-product networks (GraphSPNs), a tractable deep generative model which provides exact and efficient inference over (arbitrary parts of) graphs. We investigate different principles to make SPNs permutation invariant. We demonstrate that GraphSPNs are able to (conditionally) generate novel and chemically valid molecular graphs, being competitive to, and sometimes even better than, existing intractable models. We find out that (Graph)SPNs benefit from ensuring the permutation invariance via canonical ordering.

LGFeb 9Code
GEMSS: A Variational Bayesian Method for Discovering Multiple Sparse Solutions in Classification and Regression Problems

Kateřina Henclová, Václav Šmídl

Selecting interpretable feature sets in underdetermined ($n \ll p$) and highly correlated regimes constitutes a fundamental challenge in data science, particularly when analyzing physical measurements. In such settings, multiple distinct sparse subsets may explain the response equally well. Identifying these alternatives is crucial for generating domain-specific insights into the underlying mechanisms, yet conventional methods typically isolate a single solution, obscuring the full spectrum of plausible explanations. We present GEMSS (Gaussian Ensemble for Multiple Sparse Solutions), a variational Bayesian framework specifically designed to simultaneously discover multiple, diverse sparse feature combinations. The method employs a structured spike-and-slab prior for sparsity, a mixture of Gaussians to approximate the intractable multimodal posterior, and a Jaccard-based penalty to further control solution diversity. Unlike sequential greedy approaches, GEMSS optimizes the entire ensemble of solutions within a single objective function via stochastic gradient descent. The method is validated on a comprehensive benchmark comprising 128 synthetic experiments across classification and regression tasks. Results demonstrate that GEMSS scales effectively to high-dimensional settings ($p=5000$) with sample size as small as $n = 50$, generalizes seamlessly to continuous targets, handles missing data natively, and exhibits remarkable robustness to class imbalance and Gaussian noise. GEMSS is available as a Python package 'gemss' at PyPI. The full GitHub repository at https://github.com/kat-er-ina/gemss/ also includes a free, easy-to-use application suitable for non-coders.

LGMar 15, 2025
Probabilistic Graph Circuits: Deep Generative Models for Tractable Probabilistic Inference over Graphs

Milan Papež, Martin Rektoris, Václav Šmídl et al.

Deep generative models (DGMs) have recently demonstrated remarkable success in capturing complex probability distributions over graphs. Although their excellent performance is attributed to powerful and scalable deep neural networks, it is, at the same time, exactly the presence of these highly non-linear transformations that makes DGMs intractable. Indeed, despite representing probability distributions, intractable DGMs deny probabilistic foundations by their inability to answer even the most basic inference queries without approximations or design choices specific to a very narrow range of queries. To address this limitation, we propose probabilistic graph circuits (PGCs), a framework of tractable DGMs that provide exact and efficient probabilistic inference over (arbitrary parts of) graphs. Nonetheless, achieving both exactness and efficiency is challenging in the permutation-invariant setting of graphs. We design PGCs that are inherently invariant and satisfy these two requirements, yet at the cost of low expressive power. Therefore, we investigate two alternative strategies to achieve the invariance: the first sacrifices the efficiency, and the second sacrifices the exactness. We demonstrate that ignoring the permutation invariance can have severe consequences in anomaly detection, and that the latter approach is competitive with, and sometimes better than, existing intractable DGMs in the context of molecular graph generation.

LGAug 11, 2025
Sparse Probabilistic Graph Circuits

Martin Rektoris, Milan Papež, Václav Šmídl et al.

Deep generative models (DGMs) for graphs achieve impressively high expressive power thanks to very efficient and scalable neural networks. However, these networks contain non-linearities that prevent analytical computation of many standard probabilistic inference queries, i.e., these DGMs are considered \emph{intractable}. While recently proposed Probabilistic Graph Circuits (PGCs) address this issue by enabling \emph{tractable} probabilistic inference, they operate on dense graph representations with $\mathcal{O}(n^2)$ complexity for graphs with $n$ nodes and \emph{$m$ edges}. To address this scalability issue, we introduce Sparse PGCs, a new class of tractable generative models that operate directly on sparse graph representation, reducing the complexity to $\mathcal{O}(n + m)$, which is particularly beneficial for $m \ll n^2$. In the context of de novo drug design, we empirically demonstrate that SPGCs retain exact inference capabilities, improve memory efficiency and inference speed, and match the performance of intractable DGMs in key metrics.

LGMay 8, 2023
Is AUC the best measure for practical comparison of anomaly detectors?

Vít Škvára, Tomáš Pevný, Václav Šmídl

The area under receiver operating characteristics (AUC) is the standard measure for comparison of anomaly detectors. Its advantage is in providing a scalar number that allows a natural ordering and is independent on a threshold, which allows to postpone the choice. In this work, we question whether AUC is a good metric for anomaly detection, or if it gives a false sense of comfort, due to relying on assumptions which are unlikely to hold in practice. Our investigation shows that variations of AUC emphasizing accuracy at low false positive rate seem to be better correlated with the needs of practitioners, but also that we can compare anomaly detectors only in the case when we have representative examples of anomalous samples. This last result is disturbing, as it suggests that in many cases, we should do active or few-show learning instead of pure anomaly detection.

LGOct 10, 2021
Fitting large mixture models using stochastic component selection

Milan Papež, Tomáš Pevný, Václav Šmídl

Traditional methods for unsupervised learning of finite mixture models require to evaluate the likelihood of all components of the mixture. This becomes computationally prohibitive when the number of components is large, as it is, for example, in the sum-product (transform) networks. Therefore, we propose to apply a combination of the expectation maximization and the Metropolis-Hastings algorithm to evaluate only a small number of, stochastically sampled, components, thus substantially reducing the computational cost. The Markov chain of component assignments is sequentially generated across the algorithm's iterations, having a non-stationary target distribution whose parameters vary via a gradient-descent scheme. We put emphasis on generality of our method, equipping it with the ability to train both shallow and deep mixture models which involve complex, and possibly nonlinear, transformations. The performance of our method is illustrated in a variety of synthetic and real-data contexts, considering deep models, such as mixtures of normalizing flows and sum-product (transform) networks.

LGDec 11, 2020
Comparison of Anomaly Detectors: Context Matters

Vít Škvára, Jan Franců, Matěj Zorek et al.

Deep generative models are challenging the classical methods in the field of anomaly detection nowadays. Every new method provides evidence of outperforming its predecessors, often with contradictory results. The objective of this comparison is twofold: to compare anomaly detection methods of various paradigms with focus on deep generative models, and identification of sources of variability that can yield different results. The methods were compared on popular tabular and image datasets. We identified the main sources of variability to be experimental conditions: i) the type data set (tabular or image) and the nature of anomalies (statistical or semantic), and ii) strategy of selection of hyperparameters, especially the number of available anomalies in the validation set. Different methods perform the best in different contexts, i.e. combination of experimental conditions together with computational time. This explains the variability of the previous results and highlights the importance of careful specification of the context in the publication of a new method. All our code and results are available for download.

LGJun 22, 2020
DeepTopPush: Simple and Scalable Method for Accuracy at the Top

Václav Mácha, Lukáš Adam, Václav Šmídl

Accuracy at the top is a special class of binary classification problems where the performance is evaluated only on a small number of relevant (top) samples. Applications include information retrieval systems or processes with manual (expensive) postprocessing. This leads to minimizing the number of irrelevant samples above a threshold. We consider classifiers in the form of an arbitrary (deep) network and propose a new method DeepTopPush for minimizing the loss function at the top. Since the threshold depends on all samples, the problem is non-decomposable. We modify the stochastic gradient descent to handle the non-decomposability in an end-to-end training manner and propose a way to estimate the threshold only from values on the current minibatch and one delayed value. We demonstrate the excellent performance of DeepTopPush on visual recognition datasets and two real-world applications. The first one selects a small number of molecules for further drug testing. The second one uses real malware data, where we detected 46\% malware at an extremely low false alarm rate of $10^{-5}$.

LGJun 2, 2020
Neural Power Units

Niklas Heim, Tomáš Pevný, Václav Šmídl

Conventional Neural Networks can approximate simple arithmetic operations, but fail to generalize beyond the range of numbers that were seen during training. Neural Arithmetic Units aim to overcome this difficulty, but current arithmetic units are either limited to operate on positive numbers or can only represent a subset of arithmetic operations. We introduce the Neural Power Unit (NPU) that operates on the full domain of real numbers and is capable of learning arbitrary power functions in a single layer. The NPU thus fixes the shortcomings of existing arithmetic units and extends their expressivity. We achieve this by using complex arithmetic without requiring a conversion of the network to complex numbers. A simplification of the unit to the RealNPU yields a highly transparent model. We show that the NPUs outperform their competitors in terms of accuracy and sparsity on artificial arithmetic datasets, and that the RealNPU can discover the governing equations of a dynamical system only from data.

LGFeb 26, 2020
Nonlinear classifiers for ranking problems based on kernelized SVM

Václav Mácha, Lukáš Adam, Václav Šmídl

Many classification problems focus on maximizing the performance only on the samples with the highest relevance instead of all samples. As an example, we can mention ranking problems, accuracy at the top or search engines where only the top few queries matter. In our previous work, we derived a general framework including several classes of these linear classification problems. In this paper, we extend the framework to nonlinear classifiers. Utilizing a similarity to SVM, we dualize the problems, add kernels and propose a componentwise dual ascent method.

LGFeb 25, 2020
General Framework for Binary Classification on Top Samples

Lukáš Adam, Václav Mácha, Václav Šmídl et al.

Many binary classification problems minimize misclassification above (or below) a threshold. We show that instances of ranking problems, accuracy at the top or hypothesis testing may be written in this form. We propose a general framework to handle these classes of problems and show which known methods (both known and newly proposed) fall into this framework. We provide a theoretical analysis of this framework and mention selected possible pitfalls the methods may encounter. We suggest several numerical improvements including the implicit derivative and stochastic gradient descent. We provide an extensive numerical study. Based both on the theoretical properties and numerical experiments, we conclude the paper by suggesting which method should be used in which situation.

MLDec 2, 2019
Rodent: Relevance determination in differential equations

Niklas Heim, Václav Šmídl, Tomáš Pevný

We aim to identify the generating, ordinary differential equation (ODE) from a set of trajectories of a partially observed system. Our approach does not need prescribed basis functions to learn the ODE model, but only a rich set of Neural Arithmetic Units. For maximal explainability of the learnt model, we minimise the state size of the ODE as well as the number of non-zero parameters that are needed to solve the problem. This sparsification is realized through a combination of the Variational Auto-Encoder (VAE) and Automatic Relevance Determination (ARD). We show that it is possible to learn not only one specific model for a single process, but a manifold of models representing harmonic signals as well as a manifold of Lotka-Volterra systems.

MLMay 28, 2019
Anomaly scores for generative models

Václav Šmídl, Jan Bím, Tomáš Pevný

Reconstruction error is a prevalent score used to identify anomalous samples when data are modeled by generative models, such as (variational) auto-encoders or generative adversarial networks. This score relies on the assumption that normal samples are located on a manifold and all anomalous samples are located outside. Since the manifold can be learned only where the training data lie, there are no guarantees how the reconstruction error behaves elsewhere and the score, therefore, seems to be ill-defined. This work defines an anomaly score that is theoretically compatible with generative models, and very natural for (variational) auto-encoders as they seem to be prevalent. The new score can be also used to select hyper-parameters and models. Finally, we explain why reconstruction error delivers good experimental results despite weak theoretical justification.

LGJul 13, 2018
Are generative deep models for novelty detection truly better?

Vít Škvára, Tomáš Pevný, Václav Šmídl

Many deep models have been recently proposed for anomaly detection. This paper presents comparison of selected generative deep models and classical anomaly detection methods on an extensive number of non--image benchmark datasets. We provide statistical comparison of the selected models, in many configurations, architectures and hyperparamaters. We arrive to conclusion that performance of the generative models is determined by the process of selection of their hyperparameters. Specifically, performance of the deep generative models deteriorates with decreasing amount of anomalous samples used in hyperparameter selection. In practical scenarios of anomaly detection, none of the deep generative models systematically outperforms the kNN.

MLMar 19, 2015
Non-parametric Bayesian Models of Response Function in Dynamic Image Sequences

Ondřej Tichý, Václav Šmídl

Estimation of response functions is an important task in dynamic medical imaging. This task arises for example in dynamic renal scintigraphy, where impulse response or retention functions are estimated, or in functional magnetic resonance imaging where hemodynamic response functions are required. These functions can not be observed directly and their estimation is complicated because the recorded images are subject to superposition of underlying signals. Therefore, the response functions are estimated via blind source separation and deconvolution. Performance of this algorithm heavily depends on the used models of the response functions. Response functions in real image sequences are rather complicated and finding a suitable parametric form is problematic. In this paper, we study estimation of the response functions using non-parametric Bayesian priors. These priors were designed to favor desirable properties of the functions, such as sparsity or smoothness. These assumptions are used within hierarchical priors of the blind source separation and deconvolution algorithm. Comparison of the resulting algorithms with these priors is performed on synthetic dataset as well as on real datasets from dynamic renal scintigraphy. It is shown that flexible non-parametric priors improve estimation of response functions in both cases. MATLAB implementation of the resulting algorithms is freely available for download.