Jacek Mańdziuk

AI
h-index: 8
37 papers
824 citations
Novelty: 38%
AI Score: 49

37 Papers

LG Jul 8, 2022
StatMix: Data augmentation method that relies on image statistics in federated learning

Dominik Lewy, Jacek Mańdziuk, Maria Ganzha et al.

The availability of large amounts of annotated data is one of the pillars of deep learning success. Although numerous big datasets have been made available for research, this is often not the case in real-life applications (e.g. companies are not able to share data due to GDPR or concerns related to intellectual property rights protection). Federated learning (FL) is a potential solution to this problem, as it enables training a global model on data scattered across multiple nodes, without sharing the local data itself. However, even FL methods pose a threat to data privacy if not handled properly. Therefore, we propose StatMix, an augmentation approach that uses image statistics to improve the results of FL scenarios. StatMix is empirically tested on CIFAR-10 and CIFAR-100, using two neural network architectures. In all FL experiments, the application of StatMix improves the average accuracy compared to the baseline training (with no use of StatMix). Some improvement can also be observed in non-FL setups.

LG Jun 16, 2022
Using adversarial images to improve outcomes of federated learning for non-IID data

Anastasiya Danilenka, Maria Ganzha, Marcin Paprzycki et al.

One of the important problems in federated learning is how to deal with unbalanced data. This contribution introduces a novel technique designed to deal with label skewed non-IID data, using adversarial inputs, created by the I-FGSM method. Adversarial inputs guide the training process and allow the Weighted Federated Averaging to give more importance to clients with 'selected' local label distributions. Experimental results, gathered from image classification tasks, for MNIST and CIFAR-10 datasets, are reported and analyzed.
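
The weighted aggregation step mentioned in the abstract can be illustrated with a minimal sketch. The function name `weighted_fedavg` and the flat parameter vectors are assumptions made for this example; the abstract does not say how the per-client weights are derived from the adversarial inputs, so they are simply passed in.

```python
import numpy as np


def weighted_fedavg(client_params, client_weights):
    """Combine per-client parameter vectors into one global vector.

    Hypothetical helper: the weights are taken as given, since the
    abstract does not specify how they are computed.
    """
    params = np.asarray(client_params, dtype=float)    # shape: (clients, n_params)
    weights = np.asarray(client_weights, dtype=float)  # shape: (clients,)
    if weights.ndim != 1 or weights.shape[0] != params.shape[0]:
        raise ValueError("need exactly one weight per client")
    if np.any(weights < 0) or weights.sum() == 0:
        raise ValueError("weights must be non-negative and not all zero")
    weights = weights / weights.sum()  # normalise so the weights sum to 1
    return weights @ params            # weighted average over clients


global_params = weighted_fedavg([[0.0, 0.0], [2.0, 4.0]], [1.0, 3.0])
```

With equal weights this reduces to a plain average of the client parameters; unequal weights give some clients more influence on the global model.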

AI Nov 2, 2024 Code
Reasoning Limitations of Multimodal Large Language Models. A Case Study of Bongard Problems

Mikołaj Małkiński, Szymon Pawlonka, Jacek Mańdziuk

Abstract visual reasoning (AVR) involves discovering shared concepts across images through analogy, akin to solving IQ test problems. Bongard Problems (BPs) remain a key challenge in AVR, requiring both visual reasoning and verbal description. We investigate whether multimodal large language models (MLLMs) can solve BPs by formulating a set of diverse MLLM-suited solution strategies and testing $4$ proprietary and $4$ open-access models on $3$ BP datasets featuring synthetic (classic BPs) and real-world (Bongard HOI and Bongard-OpenWorld) images. Despite some successes on real-world datasets, MLLMs struggle with synthetic BPs. To explore this gap, we introduce Bongard-RWR, a dataset representing synthetic BP concepts using real-world images. Our findings suggest that weak MLLM performance on classical BPs is not due to the domain specificity, but rather comes from their general AVR limitations. Code and dataset are available at: https://github.com/pavonism/bongard-rwr

CL Sep 20, 2023
AttentionMix: Data augmentation method that relies on BERT attention mechanism

Dominik Lewy, Jacek Mańdziuk

The Mixup method has proven to be a powerful data augmentation technique in Computer Vision, with many successors that perform image mixing in a guided manner. One of the interesting research directions is transferring the underlying Mixup idea to other domains, e.g. Natural Language Processing (NLP). Even though there already exist several methods that apply Mixup to textual data, there is still room for new, improved approaches. In this work, we introduce AttentionMix, a novel mixing method that relies on attention-based information. While the paper focuses on the BERT attention mechanism, the proposed approach can be applied to generally any attention-based model. AttentionMix is evaluated on 3 standard sentiment classification datasets and in all three cases outperforms two benchmark approaches that utilize Mixup mechanism, as well as the vanilla BERT method. The results confirm that the attention-based information can be effectively used for data augmentation in the NLP domain.
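
For orientation, the basic Mixup interpolation that the abstract builds on can be sketched as follows. This is the standard convex combination of two examples and their labels, not the attention-guided mixing of AttentionMix itself; the function name `mixup` and the fixed mixing coefficient `lam` are assumptions for illustration (in practice the coefficient is usually sampled from a Beta distribution).

```python
import numpy as np


def mixup(x_a, y_a, x_b, y_b, lam):
    """Blend two examples and their one-hot labels with coefficient lam in [0, 1]."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    x_a, x_b = np.asarray(x_a, dtype=float), np.asarray(x_b, dtype=float)
    y_a, y_b = np.asarray(y_a, dtype=float), np.asarray(y_b, dtype=float)
    x_mix = lam * x_a + (1.0 - lam) * x_b  # interpolate the inputs
    y_mix = lam * y_a + (1.0 - lam) * y_b  # interpolate the labels the same way
    return x_mix, y_mix


x_mix, y_mix = mixup([1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0], lam=0.25)
```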

AI Feb 22
Reasoning Capabilities of Large Language Models. Lessons Learned from General Game Playing

Maciej Świechowski, Adam Żychowski, Jacek Mańdziuk

This paper examines the reasoning capabilities of Large Language Models (LLMs) from a novel perspective, focusing on their ability to operate within formally specified, rule-governed environments. We evaluate four LLMs (Gemini 2.5 Pro and Flash variants, Llama 3.3 70B and GPT-OSS 120B) on a suite of forward-simulation tasks (including next / multistep state formulation and legal action generation) across a diverse set of reasoning problems illustrated through General Game Playing (GGP) game instances. Beyond reporting instance-level performance, we characterize games based on 40 structural features and analyze correlations between these features and LLM performance. Furthermore, we investigate the effects of various game obfuscations to assess the role of linguistic semantics in game definitions and the impact of potential prior exposure of LLMs to specific games during training. The main results indicate that three of the evaluated models generally perform well across most experimental settings, with performance degradation observed as the evaluation horizon increases (i.e., with a higher number of game steps). Detailed case-based analysis of the LLM performance provides novel insights into common reasoning errors in the considered logic-based problem formulation, including hallucinated rules, redundant state facts, or syntactic errors. Overall, the paper reports clear progress in formal reasoning capabilities of contemporary models.

AI Dec 15, 2023
One Self-Configurable Model to Solve Many Abstract Visual Reasoning Problems

Mikołaj Małkiński, Jacek Mańdziuk

Abstract Visual Reasoning (AVR) comprises a wide selection of various problems similar to those used in human IQ tests. Recent years have brought dynamic progress in solving particular AVR tasks; however, in the contemporary literature AVR problems are largely dealt with in isolation, leading to highly specialized task-specific methods. With the aim of developing universal learning systems in the AVR domain, we propose the unified model for solving Single-Choice Abstract visual Reasoning tasks (SCAR), capable of solving various single-choice AVR tasks, without making any a priori assumptions about the task structure, in particular the number and location of panels. The proposed model relies on a novel Structure-Aware dynamic Layer (SAL), which adapts its weights to the structure of the considered AVR problem. Experiments conducted on Raven's Progressive Matrices, Visual Analogy Problems, and Odd One Out problems show that SCAR (SAL-based models, in general) effectively solves diverse AVR tasks, and its performance is on par with the state-of-the-art task-specific baselines. What is more, SCAR demonstrates effective knowledge reuse in multi-task and transfer learning settings. To our knowledge, this work is the first successful attempt to construct a general single-choice AVR solver relying on a self-configurable architecture and a unified solving method. With this work we aim to stimulate and foster progress on task-independent research paths in the AVR domain, with the long-term goal of developing a general AVR solver.

LG Feb 2, 2025
Modified Adaptive Tree-Structured Parzen Estimator for Hyperparameter Optimization

Szymon Sieradzki, Jacek Mańdziuk

In this paper, we review hyperparameter optimization methods for machine learning models, with a particular focus on the Adaptive Tree-Structured Parzen Estimator (ATPE) algorithm. We propose several modifications to ATPE and assess their efficacy on a diverse set of standard benchmark functions. Experimental results demonstrate that the proposed modifications significantly improve the effectiveness of ATPE hyperparameter optimization on selected benchmarks, a finding that holds practical relevance for their application in real-world machine learning / optimization tasks.

NE Jan 26, 2025
Constrained Hybrid Metaheuristic Algorithm for Probabilistic Neural Networks Learning

Piotr A. Kowalski, Szymon Kucharczyk, Jacek Mańdziuk

This study investigates the potential of hybrid metaheuristic algorithms to enhance the training of Probabilistic Neural Networks (PNNs) by leveraging the complementary strengths of multiple optimisation strategies. Traditional learning methods, such as gradient-based approaches, often struggle to optimise high-dimensional and uncertain environments, while single-method metaheuristics may fail to exploit the solution space fully. To address these challenges, we propose the constrained Hybrid Metaheuristic (cHM) algorithm, a novel approach that combines multiple population-based optimisation techniques into a unified framework. The proposed procedure operates in two phases: an initial probing phase evaluates multiple metaheuristics to identify the best-performing one based on the error rate, followed by a fitting phase where the selected metaheuristic refines the PNN to achieve optimal smoothing parameters. This iterative process ensures efficient exploration and convergence, enhancing the network's generalisation and classification accuracy. cHM integrates several popular metaheuristics, such as BAT, Simulated Annealing, Flower Pollination Algorithm, Bacterial Foraging Optimization, and Particle Swarm Optimisation as internal optimisers. To evaluate cHM performance, experiments were conducted on 16 datasets with varying characteristics, including binary and multiclass classification tasks, balanced and imbalanced class distributions, and diverse feature dimensions. The results demonstrate that cHM effectively combines the strengths of individual metaheuristics, leading to faster convergence and more robust learning. By optimising the smoothing parameters of PNNs, the proposed method enhances classification performance across diverse datasets, proving its application flexibility and efficiency.
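
To make the role of the smoothing parameter concrete, here is a minimal sketch of PNN-style classification with a single Gaussian smoothing parameter `sigma`. It is an illustration under stated assumptions, not the cHM procedure: the function name `pnn_predict`, the use of one shared `sigma`, and the toy data are all hypothetical, and the metaheuristic search that would tune `sigma` is not shown.

```python
import numpy as np


def pnn_predict(X_train, y_train, X_query, sigma):
    """Classify queries by the average Gaussian kernel response per class.

    sigma is the smoothing parameter; a single shared value is an
    assumption of this sketch.
    """
    X_train = np.asarray(X_train, dtype=float)
    X_query = np.asarray(X_query, dtype=float)
    y_train = np.asarray(y_train)
    classes = np.unique(y_train)
    # Squared Euclidean distances, shape (n_query, n_train).
    diff = X_query[:, None, :] - X_train[None, :, :]
    sq_dist = (diff ** 2).sum(axis=2)
    kernel = np.exp(-sq_dist / (2.0 * sigma ** 2))
    # One column of class scores per class: mean kernel value over that class.
    scores = np.stack(
        [kernel[:, y_train == c].mean(axis=1) for c in classes], axis=1
    )
    return classes[scores.argmax(axis=1)]


X = [[0.0], [1.0], [10.0], [11.0]]
y = [0, 0, 1, 1]
predicted = pnn_predict(X, y, [[0.5], [10.5]], sigma=1.0)
```

A small `sigma` makes each training point influence only its close neighbourhood, while a large `sigma` smooths the class densities, which is why its value affects classification accuracy.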

LG Dec 14, 2023
Coevolutionary Algorithm for Building Robust Decision Trees under Minimax Regret

Adam Żychowski, Andrew Perrault, Jacek Mańdziuk

In recent years, there has been growing interest in developing robust machine learning (ML) models that can withstand adversarial attacks, including one of the most widely adopted, efficient, and interpretable ML algorithms: decision trees (DTs). This paper proposes a novel coevolutionary algorithm (CoEvoRDT) designed to create robust DTs capable of handling noisy high-dimensional data in adversarial contexts. Motivated by the limitations of traditional DT algorithms, we leverage adaptive coevolution to allow DTs to evolve and learn from interactions with perturbed input data. CoEvoRDT alternately evolves competing populations of DTs and perturbed features, enabling construction of DTs with desired properties. CoEvoRDT is easily adaptable to various target metrics, allowing the use of tailored robustness criteria such as minimax regret. Furthermore, CoEvoRDT has the potential to improve the results of other state-of-the-art methods by incorporating their outcomes (the DTs they produce) into the initial population and optimizing them in the process of coevolution. Inspired by game theory, CoEvoRDT utilizes mixed Nash equilibrium to enhance convergence. The method is tested on 20 popular datasets and shows superior performance compared to 4 state-of-the-art algorithms. It outperformed all competing methods on 13 datasets with adversarial accuracy metrics, and on all 20 considered datasets with minimax regret. Strong experimental results and flexibility in choosing the error measure make CoEvoRDT a promising approach for constructing robust DTs in real-world applications.
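
The minimax regret criterion named above can be sketched on a small accuracy table. This is a simplified illustration, not the CoEvoRDT algorithm: it assumes a finite pool of candidate trees and a finite set of perturbations, and measures regret against the best tree in that pool, which may differ from the exact definition used in the paper. The function name `minimax_regret_choice` is hypothetical.

```python
import numpy as np


def minimax_regret_choice(accuracy):
    """Pick the candidate whose worst-case regret is smallest.

    accuracy[i, j] is the accuracy of candidate tree i under perturbation j.
    Regret is measured against the best candidate in this finite pool
    (an assumption of the sketch).
    """
    accuracy = np.asarray(accuracy, dtype=float)
    best_per_perturbation = accuracy.max(axis=0)  # best achievable per column
    regret = best_per_perturbation - accuracy     # shortfall of each candidate
    max_regret = regret.max(axis=1)               # worst case per candidate
    best = int(max_regret.argmin())
    return best, float(max_regret[best])


acc = [[0.9, 0.5],   # tree 0: strong on perturbation 0, weak on perturbation 1
       [0.8, 0.7]]   # tree 1: slightly worse on 0, much better on 1
chosen, worst_regret = minimax_regret_choice(acc)
```

In this toy table the second tree is chosen: it never falls more than about 0.1 below the best candidate, whereas the first tree loses 0.2 under the second perturbation.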

LG May 18, 2024
ReModels: Quantile Regression Averaging models

Grzegorz Zakrzewski, Kacper Skonieczka, Mikołaj Małkiński et al.

Electricity price forecasts play a crucial role in making key business decisions within the electricity markets. A focal point in this domain is probabilistic prediction, which delineates future price values in a more comprehensive manner than simple point forecasts. The gold standard in probabilistic approaches to predicting energy prices is the Quantile Regression Averaging (QRA) method. In this paper, we present a Python package that encompasses the implementation of QRA, along with modifications of this approach that have appeared in the literature over the past few years. The proposed package also facilitates the acquisition and preparation of data related to electricity markets, as well as the evaluation of model predictions.
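
Quantile regression, on which QRA rests, fits a quantile by minimising the pinball (quantile) loss. The sketch below shows that loss only; it does not use or reproduce the ReModels package API, and the function name `pinball_loss` and the sample prices are assumptions for illustration.

```python
import numpy as np


def pinball_loss(y_true, q_pred, tau):
    """Average pinball loss of a predicted tau-quantile, with 0 < tau < 1."""
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    diff = np.asarray(y_true, dtype=float) - np.asarray(q_pred, dtype=float)
    # Under-prediction is penalised by tau, over-prediction by (1 - tau).
    return float(np.mean(np.maximum(tau * diff, (tau - 1.0) * diff)))


under = pinball_loss([10.0], [8.0], tau=0.9)   # forecast below the actual price
over = pinball_loss([10.0], [12.0], tau=0.9)   # forecast above the actual price
```

For a high quantile such as 0.9, predicting too low costs more than predicting too high by the same amount, which is what pushes the fitted value towards the upper part of the price distribution.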

AI May 19, 2025
Advancing Generalization Across a Variety of Abstract Visual Reasoning Tasks

Mikołaj Małkiński, Jacek Mańdziuk

The abstract visual reasoning (AVR) domain presents a diverse suite of analogy-based tasks devoted to studying model generalization. Recent years have brought dynamic progress in the field, particularly in i.i.d. scenarios, in which models are trained and evaluated on the same data distributions. Nevertheless, o.o.d. setups that assess model generalization to new test distributions remain challenging even for the most recent models. To advance generalization in AVR tasks, we present the Pathways of Normalized Group Convolution model (PoNG), a novel neural architecture that features group convolution, normalization, and a parallel design. We consider a wide set of AVR benchmarks, including Raven's Progressive Matrices and visual analogy problems with both synthetic and real-world images. The experiments demonstrate strong generalization capabilities of the proposed model, which in several settings outperforms the existing literature methods.

LG May 10, 2024
Interpretable Multi-task Learning with Shared Variable Embeddings

Maciej Żelaszczyk, Jacek Mańdziuk

This paper proposes a general interpretable predictive system with shared information. The system is able to perform predictions in a multi-task setting where distinct tasks are not bound to have the same input/output structure. Embeddings of input and output variables in a common space are obtained, where the input embeddings are produced through attending to a set of shared embeddings, reused across tasks. All the embeddings are treated as model parameters and learned. Specific restrictions on the space of shared embeddings and the sparsity of the attention mechanism are considered. Experiments show that the introduction of shared embeddings does not deteriorate the results obtained from a vanilla variable embeddings method. We run a number of further ablations. Inducing sparsity in the attention mechanism leads to both an increase in accuracy and a significant decrease in the number of training steps required. Shared embeddings provide a measure of interpretability in terms of both a qualitative assessment and the ability to map specific shared embeddings to pre-defined concepts that are not tailored to the considered model. There seems to be a trade-off between accuracy and interpretability. The basic shared embeddings method favors interpretability, whereas the sparse attention method promotes accuracy. The results lead to the conclusion that variable embedding methods may be extended with shared information to provide increased interpretability and accuracy.

LG Nov 17, 2025
The Impact of Bootstrap Sampling Rate on Random Forest Performance in Regression Tasks

Michał Iwaniuk, Mateusz Jarosz, Bartłomiej Borycki et al.

Random Forests (RFs) typically train each tree on a bootstrap sample of the same size as the training set, i.e., bootstrap rate (BR) equals 1.0. We systematically examine how varying BR from 0.2 to 5.0 affects RF performance across 39 heterogeneous regression datasets and 16 RF configurations, evaluating with repeated two-fold cross-validation and mean squared error. Our results demonstrate that tuning the BR can yield significant improvements over the default: the best setup relied on BR $\leq$ 1.0 for 24 datasets, BR > 1.0 for 15, and BR = 1.0 was optimal in 4 cases only. We establish a link between dataset characteristics and the preferred BR: datasets with strong global feature-target relationships favor higher BRs, while those with higher local target variance benefit from lower BRs. To further investigate this relationship, we conducted experiments on synthetic datasets with controlled noise levels. These experiments reproduce the observed bias-variance trade-off: in low-noise scenarios, higher BRs effectively reduce model bias, whereas in high-noise settings, lower BRs help reduce model variance. Overall, BR is an influential hyperparameter that should be tuned to optimize RF regression models.
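
The bootstrap rate can be read as the ratio between the per-tree sample size and the training-set size. A minimal sketch of drawing such a sample is given below; the function name `bootstrap_indices` and the rounding rule are assumptions of this example, and the snippet does not reproduce the experimental setup of the paper.

```python
import numpy as np


def bootstrap_indices(n_samples, bootstrap_rate, rng):
    """Draw row indices with replacement; sample size = bootstrap_rate * n_samples."""
    size = int(round(bootstrap_rate * n_samples))
    if size < 1:
        raise ValueError("bootstrap_rate is too small for this dataset")
    # Sampling with replacement, so rates above 1.0 are possible.
    return rng.integers(0, n_samples, size=size)


rng = np.random.default_rng(0)
small = bootstrap_indices(100, 0.2, rng)    # BR = 0.2: 20 rows per tree
default = bootstrap_indices(100, 1.0, rng)  # BR = 1.0: the usual bootstrap
large = bootstrap_indices(100, 5.0, rng)    # BR = 5.0: 500 rows per tree
```

Each tree would then be fitted on the rows selected by its own index array; because rows are drawn with replacement, a rate above 1.0 repeats rows rather than adding new information.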

AI Aug 16, 2025
Bongard-RWR+: Real-World Representations of Fine-Grained Concepts in Bongard Problems

Szymon Pawlonka, Mikołaj Małkiński, Jacek Mańdziuk

Bongard Problems (BPs) provide a challenging testbed for abstract visual reasoning (AVR), requiring models to identify visual concepts from just a few examples and describe them in natural language. Early BP benchmarks featured synthetic black-and-white drawings, which might not fully capture the complexity of real-world scenes. Subsequent BP datasets employed real-world images, although the represented concepts are identifiable from high-level image features, reducing the task complexity. In contrast, the recently released Bongard-RWR dataset aimed at representing abstract concepts formulated in the original BPs using fine-grained real-world images. Its manual construction, however, limited the dataset size to just $60$ instances, constraining evaluation robustness. In this work, we introduce Bongard-RWR+, a BP dataset composed of $5\,400$ instances that represent original BP abstract concepts using real-world-like images generated via a vision language model (VLM) pipeline. Building on Bongard-RWR, we employ Pixtral-12B to describe manually curated images and generate new descriptions aligned with the underlying concepts, use Flux.1-dev to synthesize images from these descriptions, and manually verify that the generated images faithfully reflect the intended concepts. We evaluate state-of-the-art VLMs across diverse BP formulations, including binary and multiclass classification, as well as textual answer generation. Our findings reveal that while VLMs can recognize coarse-grained visual concepts, they consistently struggle with discerning fine-grained concepts, highlighting limitations in their reasoning capabilities.

SD Aug 7, 2025
Training chord recognition models on artificially generated audio

Martyna Majchrzak, Jacek Mańdziuk

One of the challenging problems in Music Information Retrieval is the acquisition of enough non-copyrighted audio recordings for model training and evaluation. This study compares two Transformer-based neural network models for chord sequence recognition in audio recordings and examines the effectiveness of using an artificially generated dataset for this purpose. The models are trained on various combinations of Artificial Audio Multitracks (AAM), Schubert's Winterreise Dataset, and the McGill Billboard Dataset and evaluated with three metrics: Root, MajMin and Chord Content Metric (CCM). The experiments prove that even though there are certainly differences in complexity and structure between artificially generated and human-composed music, the former can be useful in certain scenarios. Specifically, AAM can enrich a smaller training dataset of music composed by a human or can even be used as a standalone training set for a model that predicts chord sequences in pop music, if no other data is available.

LG Dec 18, 2024
Cultivating Archipelago of Forests: Evolving Robust Decision Trees through Island Coevolution

Adam Żychowski, Andrew Perrault, Jacek Mańdziuk

Decision trees are widely used in machine learning due to their simplicity and interpretability, but they often lack robustness to adversarial attacks and data perturbations. The paper proposes a novel island-based coevolutionary algorithm (ICoEvoRDF) for constructing robust decision tree ensembles. The algorithm operates on multiple islands, each containing populations of decision trees and adversarial perturbations. The populations on each island evolve independently, with periodic migration of top-performing decision trees between islands. This approach fosters diversity and enhances the exploration of the solution space, leading to more robust and accurate decision tree ensembles. ICoEvoRDF utilizes the popular game-theoretic concept of mixed Nash equilibrium for ensemble weighting, which further improves the results. ICoEvoRDF is evaluated on 20 benchmark datasets, demonstrating its superior performance compared to state-of-the-art methods in optimizing both adversarial accuracy and minimax regret. The flexibility of ICoEvoRDF allows for the integration of decision trees from various existing methods, providing a unified framework for combining diverse solutions. Our approach offers a promising direction for developing robust and interpretable machine learning models.

AI Jun 16, 2024
A Unified View of Abstract Visual Reasoning Problems

Mikołaj Małkiński, Jacek Mańdziuk

The field of Abstract Visual Reasoning (AVR) encompasses a wide range of problems, many of which are inspired by human IQ tests. The variety of AVR tasks has resulted in state-of-the-art AVR methods being task-specific approaches. Furthermore, contemporary methods consider each AVR problem instance not as a whole, but in the form of a set of individual panels with particular locations and roles (context vs. answer panels) pre-assigned according to the task-specific arrangements. While these highly specialized approaches have recently led to significant progress in solving particular AVR tasks, considering each task in isolation hinders the development of universal learning systems in this domain. In this paper, we introduce a unified view of AVR tasks, where each problem instance is rendered as a single image, with no a priori assumptions about the number of panels, their location, or role. The main advantage of the proposed unified view is the ability to develop universal learning models applicable to various AVR tasks. What is more, the proposed approach inherently facilitates transfer learning in the AVR domain, as various types of problems share a common representation. The experiments conducted on four AVR datasets with Raven's Progressive Matrices and Visual Analogy Problems, and one real-world visual analogy dataset show that the proposed unified representation of AVR tasks poses a challenge to state-of-the-art Deep Learning (DL) AVR models and, more broadly, contemporary DL image recognition methods. In order to address this challenge, we introduce the Unified Model for Abstract Visual Reasoning (UMAVR) capable of dealing with various types of AVR problems in a unified manner. UMAVR outperforms existing AVR methods in selected single-task learning experiments, and demonstrates effective knowledge reuse in transfer learning and curriculum learning setups.

AI Jun 16, 2024
A-I-RAVEN and I-RAVEN-Mesh: Two New Benchmarks for Abstract Visual Reasoning

Mikołaj Małkiński, Jacek Mańdziuk

We study generalization and knowledge reuse capabilities of deep neural networks in the domain of abstract visual reasoning (AVR), employing Raven's Progressive Matrices (RPMs), a recognized benchmark task for assessing AVR abilities. Two knowledge transfer scenarios referring to the I-RAVEN dataset are investigated. Firstly, inspired by generalization assessment capabilities of the PGM dataset and popularity of I-RAVEN, we introduce Attributeless-I-RAVEN (A-I-RAVEN), a benchmark with 10 generalization regimes that allow systematic testing of the generalization of abstract rules applied to held-out attributes at various levels of complexity (primary and extended regimes). In contrast to PGM, A-I-RAVEN features compositionality, a variety of figure configurations, and does not require substantial computational resources. Secondly, we construct I-RAVEN-Mesh, a dataset that enriches RPMs with a novel component structure comprising line-based patterns, facilitating assessment of progressive knowledge acquisition in a transfer learning setting. We evaluate 13 strong models from the AVR literature on the introduced datasets, revealing their specific shortcomings in generalization and knowledge transfer.

CV Jan 21, 2024
Text-to-Image Cross-Modal Generation: A Systematic Review

Maciej Żelaszczyk, Jacek Mańdziuk

We review research on generating visual data from text from the angle of "cross-modal generation." This point of view allows us to draw parallels between various methods geared towards working on input text and producing visual output, without limiting the analysis to narrow sub-areas. It also results in the identification of common templates in the field, which are then compared and contrasted both within pools of similar methods and across lines of research. We provide a breakdown of text-to-image generation into various flavors of image-from-text methods, video-from-text methods, image editing, self-supervised and graph-based approaches. In this discussion, we focus on research papers published at 8 leading machine learning conferences in the years 2016-2022, also incorporating a number of relevant papers not matching the outlined search criteria. The conducted review suggests a significant increase in the number of papers published in the area and highlights research gaps and potential lines of investigation. To our knowledge, this is the first review to systematically look at text-to-image generation from the perspective of "cross-modal generation."

AI Feb 21, 2022
A Review of Emerging Research Directions in Abstract Visual Reasoning

Mikołaj Małkiński, Jacek Mańdziuk

Abstract Visual Reasoning (AVR) problems are commonly used to approximate human intelligence. They test the ability to apply previously gained knowledge, experience and skills in a completely new setting, which makes them particularly well-suited for this task. Recently, AVR problems have become popular as a proxy to study machine intelligence, which has led to the emergence of new distinct types of problems and multiple benchmark sets. In this work we review this emerging AVR research and propose a taxonomy to categorise the AVR tasks along 5 dimensions: input shapes, hidden rules, target task, cognitive function, and main challenge. The perspective taken in this survey makes it possible to characterise AVR problems with respect to their shared and distinct properties, provides a unified view on the existing approaches for solving AVR tasks, shows how the AVR problems relate to practical applications, and outlines promising directions for future work. One of them refers to the observation that in the machine learning literature different tasks are considered in isolation, which is in stark contrast with the way the AVR tasks are used to measure human intelligence, where multiple types of problems are combined within a single IQ test.

AI Jan 28, 2022
Deep Learning Methods for Abstract Visual Reasoning: A Survey on Raven's Progressive Matrices

Mikołaj Małkiński, Jacek Mańdziuk

The abstract visual reasoning (AVR) domain encompasses problems whose solution requires the ability to reason about relations among entities present in a given scene. While humans generally solve AVR tasks in a "natural" way, even without prior experience, this type of problem has proven difficult for current machine learning systems. The paper summarises recent progress in applying deep learning methods to solving AVR problems, as a proxy for studying machine intelligence. We focus on the most common type of AVR task, Raven's Progressive Matrices (RPMs), and provide a comprehensive review of the learning methods and deep neural models applied to solve RPMs, as well as the RPM benchmark sets. Performance analysis of the state-of-the-art approaches to solving RPMs leads to formulation of certain insights and remarks on the current and future trends in this area. We conclude the paper by demonstrating how real-world problems can benefit from the discoveries of RPM studies.

LG Oct 5, 2021
Adversarial defenses via a mixture of generators

Maciej Żelaszczyk, Jacek Mańdziuk

In spite of the enormous success of neural networks, adversarial examples remain a relatively weakly understood feature of deep learning systems. There is a considerable effort in both building more powerful adversarial attacks and designing methods to counter the effects of adversarial examples. We propose a method to transform the adversarial input data through a mixture of generators in order to recover the correct class obfuscated by the adversarial attack. A canonical set of images is used to generate adversarial examples through potentially multiple attacks. Such transformed images are processed by a set of generators, which are trained adversarially as a whole to compete in inverting the initial transformations. To our knowledge, this is the first use of a mixture-based adversarially trained system as a defense mechanism. We show that it is possible to train such a system without supervision, simultaneously on multiple adversarial attacks. Our system is able to recover class information for previously-unseen examples with neither attack nor data labels on the MNIST dataset. The results demonstrate that this multi-attack approach is competitive with adversarial defenses tested in single-attack settings.

CV Sep 28, 2021
Prediction of the Facial Growth Direction is Challenging

Stanisław Kaźmierczak, Zofia Juszka, Vaska Vandevska-Radunovic et al.

Facial dysmorphology or malocclusion is frequently associated with abnormal growth of the face. The ability to predict facial growth (FG) direction would allow clinicians to prepare individualized therapy to increase the chance for successful treatment. Prediction of FG direction is a novel problem in the machine learning (ML) domain. In this paper, we perform feature selection and point out the attribute that plays a central role in the abovementioned problem. Then we successfully apply data augmentation (DA) methods and improve the previously reported classification accuracy by 2.81%. Finally, we present the results of two experienced clinicians who were asked to solve a task similar to ours and show how difficult this problem is for human experts.

MM Sep 27, 2021
Audio-to-Image Cross-Modal Generation

Maciej Żelaszczyk, Jacek Mańdziuk

Cross-modal representation learning allows integrating information from different modalities into one representation. At the same time, research on generative models tends to focus on the visual domain with less emphasis on other domains, such as audio or text, potentially missing the benefits of shared representations. Studies successfully linking more than one modality in the generative setting are rare. In this context, we verify the possibility of training variational autoencoders (VAEs) to reconstruct image archetypes from audio data. Specifically, we consider VAEs in an adversarial training framework in order to ensure more variability in the generated data and find that there is a trade-off between the consistency and diversity of the generated images; this trade-off can be governed by scaling the reconstruction loss up or down, respectively. Our results further suggest that even in the case when the generated images are relatively inconsistent (diverse), features that are critical for proper image classification are preserved.

GT Sep 27, 2021
Learning Attacker's Bounded Rationality Model in Security Games

Adam Żychowski, Jacek Mańdziuk

The paper proposes a novel neuroevolutionary method (NESG) for calculating the leader's payoff in Stackelberg Security Games. The heart of NESG is a strategy evaluation neural network (SENN). SENN is able to effectively evaluate the leader's strategies against an opponent who may potentially not behave in a perfectly rational way due to certain cognitive biases or limitations. SENN is trained on historical data and does not require any direct prior knowledge regarding the follower's target preferences, payoff distribution or bounded rationality model. NESG was tested on a set of 90 benchmark games inspired by a real-world cybersecurity scenario known as deep packet inspection. Experimental results show an advantage of applying NESG over the existing state-of-the-art methods when playing against not perfectly rational opponents. The method provides high quality solutions with superior computation time scalability. Due to the generic and knowledge-free construction of NESG, the method may be applied to various real-life security scenarios.

CVJul 21, 2021
An overview of mixing augmentation methods and augmentation strategies

Dominik Lewy, Jacek Mańdziuk

Deep Convolutional Neural Networks have made incredible progress in many Computer Vision tasks. This progress, however, often relies on the availability of large amounts of training data, required to prevent over-fitting, which in many domains entails a significant cost of manual data labeling. An alternative approach is the application of data augmentation (DA) techniques that aim at model regularization by creating additional observations from the available ones. This survey focuses on two DA research streams: image mixing and automated selection of augmentation strategies. First, the presented methods are briefly described, and then qualitatively compared with respect to their key characteristics. Various quantitative comparisons are also included based on the results reported in recent DA literature. This review mainly covers the methods published in the materials of top-tier conferences and in leading journals in the years 2017-2021.

LGJun 19, 2021
Prediction of the facial growth direction with Machine Learning methods

Stanisław Kaźmierczak, Zofia Juszka, Piotr Fudalej et al.

The first attempts to predict the facial growth (FG) direction were made over half a century ago. Despite numerous attempts and the time elapsed, a satisfactory method has not been established yet and the problem still poses a challenge for medical experts. To our knowledge, this paper is the first Machine Learning approach to the prediction of FG direction. The conducted data analysis reveals the inherent complexity of the problem and explains the reasons for the difficulty of predicting FG direction based on 2D X-ray images. To perform growth forecasting, we employ a wide range of algorithms, from logistic regression, through tree ensembles, to neural networks, and consider three slightly different problem formulations. The resulting classification accuracy varies between 71% and 75%.

AIMar 8, 2021
Monte Carlo Tree Search: A Review of Recent Modifications and Applications

Maciej Świechowski, Konrad Godlewski, Bartosz Sawicki et al.

Monte Carlo Tree Search (MCTS) is a powerful approach to designing game-playing bots or solving sequential decision problems. The method relies on intelligent tree search that balances exploration and exploitation. MCTS performs random sampling in the form of simulations and stores statistics of actions to make more educated choices in each subsequent iteration. The method has become a state-of-the-art technique for combinatorial games. However, in more complex games (e.g. those with a high branching factor or real-time ones), as well as in various practical domains (e.g. transportation, scheduling or security), an efficient MCTS application often requires its problem-dependent modification or integration with other techniques. Such domain-specific modifications and hybrid approaches are the main focus of this survey. The last major MCTS survey was published in 2012. Contributions that appeared since its release are of particular interest for this review.
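
The exploration/exploitation balance described in the abstract above is often realized with the UCB1 rule (the UCT variant of MCTS). The following is a minimal sketch of that selection step only, not code from the survey; the function names, the `stats` layout and the default constant `c` are assumptions made for illustration.

```python
import math

def ucb1(total_reward, visits, parent_visits, c=math.sqrt(2)):
    """UCB1 score: average reward plus an exploration bonus."""
    if visits == 0:
        # Unvisited actions are tried first.
        return math.inf
    exploitation = total_reward / visits
    exploration = c * math.sqrt(math.log(parent_visits) / visits)
    return exploitation + exploration

def select_action(stats, c=math.sqrt(2)):
    """Pick the action with the highest UCB1 score.

    stats maps each action to a (total_reward, visits) pair
    (a hypothetical layout chosen for this sketch).
    """
    parent_visits = sum(visits for _, visits in stats.values())
    return max(
        stats,
        key=lambda a: ucb1(stats[a][0], stats[a][1], parent_visits, c),
    )
```

A larger `c` favors rarely tried actions, while a smaller `c` favors actions with a high average reward so far.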

AIDec 3, 2020
Multi-Label Contrastive Learning for Abstract Visual Reasoning

Mikołaj Małkiński, Jacek Mańdziuk

For a long time the ability to solve abstract reasoning tasks was considered one of the hallmarks of human intelligence. Recent advances in the application of deep learning (DL) methods led, as in many other domains, to surpassing human abstract reasoning performance, specifically in the most popular type of such problems - the Raven's Progressive Matrices (RPMs). While the efficacy of DL systems is indeed impressive, the way they approach the RPMs is very different from that of humans. State-of-the-art systems solving RPMs rely on massive pattern-based training and sometimes on exploiting biases in the dataset, whereas humans concentrate on identification of the rules / concepts underlying the RPM (or generally a visual reasoning task) to be solved. Motivated by this cognitive difference, this work aims at combining DL with the human way of solving RPMs and getting the best of both worlds. Specifically, we cast the problem of solving RPMs into a multi-label classification framework where each RPM is viewed as a multi-label data point, with labels determined by the set of abstract rules underlying the RPM. For efficient training of the system we introduce a generalisation of the Noise Contrastive Estimation algorithm to the case of multi-label samples. Furthermore, we propose a new sparse rule encoding scheme for RPMs which, besides the new training algorithm, is the key factor contributing to the state-of-the-art performance. The proposed approach is evaluated on the two most popular benchmark datasets (Balanced-RAVEN and PGM) and on both of them demonstrates an advantage over the current state-of-the-art results. Contrary to applications of contrastive learning methods reported in other domains, the state-of-the-art performance reported in the paper is achieved with no need for large batch sizes or strong data augmentation.

NEJun 26, 2020
Biologically Plausible Learning of Text Representation with Spiking Neural Networks

Marcin Białas, Marcin Michał Mirończuk, Jacek Mańdziuk

This study proposes a novel biologically plausible mechanism for generating low-dimensional spike-based text representation. First, we demonstrate how to transform documents into series of spikes (spike trains), which are subsequently used as input in the training process of a spiking neural network (SNN). The network is composed of biologically plausible elements, and trained according to the unsupervised Hebbian learning rule, Spike-Timing-Dependent Plasticity (STDP). After training, the SNN can be used to generate low-dimensional spike-based text representation suitable for text/document classification. Empirical results demonstrate that the generated text representation may be effectively used in text classification, leading to an accuracy of 80.19% on the bydate version of the 20 newsgroups data set, which is a leading result amongst approaches that rely on low-dimensional text representations.

NEJun 15, 2020
Dynamic Vehicle Routing Problem: A Monte Carlo approach

Michał Okulewicz, Jacek Mańdziuk

In this work we solve the Dynamic Vehicle Routing Problem (DVRP). DVRP is a modification of the Vehicle Routing Problem, in which the number and location of the clients' requests (cities) might not be known at the beginning of the working day. Additionally, all requests must be served during one working day by a fleet of vehicles with limited capacity. In this work we propose a Monte Carlo method (MCTree), which directly approaches the dynamic nature of arriving requests in the DVRP. The method is also hybridized (MCTree+PSO) with our previous Two-Phase Multi-swarm Particle Swarm Optimization (2MPSO) algorithm. Our method is based on two assumptions. First, that we know a bounding rectangle of the area in which the requests might appear. Second, that the initial requests' sizes and frequency of appearance are representative of the yet unknown clients' requests. In order to solve the DVRP we divide the working day into several time slices, in each of which we solve a static problem. In our Monte Carlo approach we randomly generate the unknown clients' requests with uniform spatial distribution over the bounding rectangle and requests' sizes uniformly sampled from the already known requests' sizes. The solution proposal is constructed with the application of a clustering algorithm and a route construction algorithm. The MCTree method is tested on a well established set of benchmarks proposed by Kilby et al. and is compared with the results achieved by applying our previous 2MPSO algorithm and other literature results. The proposed MCTree approach achieves a better time to quality trade-off than plain heuristic algorithms. Moreover, a hybrid MCTree+PSO approach achieves a better time to quality trade-off than 2MPSO for small optimization time limits, making the hybrid a good candidate for handling real world scale goods delivery problems.
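
The request-generation step described in the abstract above (uniform locations over the bounding rectangle, sizes drawn uniformly from the already known sizes) can be sketched as follows. This is only an illustration of that single step under stated assumptions: the function name, the tuple layout `(x, y, size)` and the `n_unknown` parameter are hypothetical, and how the authors choose the number of generated requests is not specified here.

```python
import random

def sample_unknown_requests(known_requests, bounds, n_unknown, rng=None):
    """Generate hypothetical future requests for one Monte Carlo sample.

    known_requests: list of (x, y, size) tuples seen so far.
    bounds: (x_min, y_min, x_max, y_max) of the bounding rectangle.
    n_unknown: how many future requests to generate (an assumed input).
    """
    rng = rng or random.Random()
    x_min, y_min, x_max, y_max = bounds
    known_sizes = [size for _, _, size in known_requests]
    return [
        (
            rng.uniform(x_min, x_max),  # uniform over the rectangle width
            rng.uniform(y_min, y_max),  # uniform over the rectangle height
            rng.choice(known_sizes),    # size drawn from known sizes
        )
        for _ in range(n_unknown)
    ]
```

For example, `sample_unknown_requests([(1.0, 2.0, 5), (3.0, 4.0, 8)], (0, 0, 10, 10), 3)` returns three generated requests whose sizes are each either 5 or 8.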

NEJun 15, 2020
A Particle Swarm Optimization hyper-heuristic for the Dynamic Vehicle Routing Problem

Michał Okulewicz, Jacek Mańdziuk

This paper presents a method for choosing a Particle Swarm Optimization based optimizer for the Dynamic Vehicle Routing Problem on the basis of the initially available data of a given problem instance. The optimization algorithm is chosen on the basis of a prediction made by a linear model trained on that data and the relative results obtained by the optimization algorithms. The achieved results suggest that such a model can be used in a hyper-heuristic approach, as it improved the average results obtained on the set of benchmark instances by choosing the appropriate algorithm in 82% of significant cases. Two leading multi-swarm Particle Swarm Optimization based algorithms for solving the Dynamic Vehicle Routing Problem are used as the basic optimization algorithms: the Multi-Environmental Multi-Swarm Optimizer of Khouadjia et al. and the authors' 2-Phase Multiswarm Particle Swarm Optimization.

CVApr 19, 2020
A Committee of Convolutional Neural Networks for Image Classification in the Concurrent Presence of Feature and Label Noise

Stanisław Kaźmierczak, Jacek Mańdziuk

Image classification has become a ubiquitous task. Models trained on good quality data achieve accuracy which in some application domains is already above human-level performance. Unfortunately, real-world data are quite often degraded by the noise existing in features and/or labels. There are quite a few papers that handle the problem of either feature or label noise, separately. However, to the best of our knowledge, this piece of research is the first attempt to address the problem of concurrent occurrence of both types of noise. Based on the MNIST, CIFAR-10 and CIFAR-100 datasets, we experimentally proved that the difference by which committees beat single models increases along with the noise level, regardless of whether it is an attribute or label disruption. Thus, it is legitimate to apply ensembles to noisy images with noisy labels. The aforementioned committees' advantage over single models is positively correlated with dataset difficulty level as well. We propose three committee selection algorithms that outperform a strong baseline algorithm which relies on an ensemble of individual (nonassociated) best models.

NEFeb 28, 2020
Generalized Self-Adapting Particle Swarm Optimization algorithm with archive of samples

Michał Okulewicz, Mateusz Zaborski, Jacek Mańdziuk

In this paper we enhance the Generalized Self-Adapting Particle Swarm Optimization algorithm (GAPSO), initially introduced at the Parallel Problem Solving from Nature 2018 conference, and investigate its properties. The research on GAPSO is underlined by the two following assumptions: (1) it is possible to achieve good performance of an optimization algorithm through utilization of all of the gathered samples, (2) the best performance can be accomplished by means of a combination of specialized sampling behaviors (Particle Swarm Optimization, Differential Evolution, and locally fitted square functions). From a software engineering point of view, GAPSO considers a standard Particle Swarm Optimization algorithm as an ideal starting point for creating a general-purpose global optimization framework. Within this framework hybrid optimization algorithms are developed, and various additional techniques (like algorithm restart management or adaptation schemes) are tested. The paper introduces a new version of the algorithm, abbreviated as M-GAPSO. In comparison with the original GAPSO formulation it includes the following four features: a global restart management scheme, samples gathering within an R-Tree based index (archive/memory of samples), adaptation of a sampling behavior based on a global particle performance, and a specific approach to local search. The above-mentioned enhancements resulted in improved performance of M-GAPSO over GAPSO, observed both on the COCO BBOB testbed and in the black-box optimization competition BBComp. Also, for lower dimensionality functions (up to 5D) the results of M-GAPSO are better than or comparable to the state-of-the-art version of CMA-ES (namely the KL-BIPOP-CMA-ES algorithm presented at the GECCO 2017 conference).

GTDec 7, 2019
Anchoring Theory in Sequential Stackelberg Games

Jan Karwowski, Jacek Mańdziuk, Adam Żychowski

An underlying assumption of Stackelberg Games (SGs) is perfect rationality of the players. However, in real-life situations (which are often modeled by SGs) the followers (terrorists, thieves, poachers or smugglers) -- as humans in general -- may act not in a perfectly rational way, as their decisions may be affected by biases of various kinds which bound rationality of their decisions. One of the popular models of bounded rationality (BR) is Anchoring Theory (AT) which claims that humans have a tendency to flatten probabilities of available options, i.e. they perceive a distribution of these probabilities as being closer to the uniform distribution than it really is. This paper proposes an efficient formulation of AT in sequential extensive-form SGs (named ATSG), suitable for Mixed-Integer Linear Program (MILP) solution methods. ATSG is implemented in three MILP/LP-based state-of-the-art methods for solving sequential SGs and two recently introduced non-MILP approaches: one relying on Monte Carlo sampling (O2UCT) and the other one (EASG) employing Evolutionary Algorithms. Experimental evaluation indicates that both non-MILP heuristic approaches scale better in time than MILP solutions while providing optimal or close-to-optimal solutions. Except for competitive time scalability, an additional asset of non-MILP methods is flexibility of potential BR formulations they are able to incorporate. While MILP approaches accept BR formulations with linear constraints only, no restrictions on the BR form are imposed in either of the two non-MILP methods.
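
The "flattening" of probabilities that the abstract above attributes to Anchoring Theory is often written as a linear blend with the uniform distribution, p'_i = (1 - δ) p_i + δ / n. The sketch below assumes that form; the function name and the parameter `delta` are illustrative, and the exact formulation used in ATSG may differ.

```python
def anchored_probabilities(probs, delta):
    """Blend a distribution with the uniform one (assumed linear form).

    probs: true probabilities of the n available options (summing to 1).
    delta: anchoring strength in [0, 1]; 0 leaves probs unchanged,
           1 yields the uniform distribution.
    """
    if not 0.0 <= delta <= 1.0:
        raise ValueError("delta must lie in [0, 1]")
    n = len(probs)
    return [(1.0 - delta) * p + delta / n for p in probs]
```

Because the blend is linear in the true probabilities for a fixed `delta`, a model of this kind is compatible with linear constraints, which is consistent with the abstract's remark about what MILP approaches can accept.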

GTSep 9, 2019
Double-oracle sampling method for Stackelberg Equilibrium approximation in general-sum extensive-form games

Jan Karwowski, Jacek Mańdziuk

The paper presents a new method for approximating Strong Stackelberg Equilibrium in general-sum sequential games with imperfect information and perfect recall. The proposed approach is generic as it does not rely on any specific properties of a particular game model. The method is based on iterative interleaving of the two following phases: (1) guided Monte Carlo Tree Search sampling of the Follower's strategy space and (2) building the Leader's behavior strategy tree for which the sampled Follower's strategy is an optimal response. The above solution scheme is evaluated with respect to expected Leader's utility and time requirements on three sets of interception games with variable characteristics, played on graphs. A comparison with three state-of-the-art MILP/LP-based methods shows that in the vast majority of test cases the proposed simulation-based approach leads to optimal Leader's strategies, while outperforming the competing methods in terms of time scalability and memory requirements.

NENov 17, 2017
Addressing Expensive Multi-objective Games with Postponed Preference Articulation via Memetic Co-evolution

Adam Żychowski, Abhishek Gupta, Jacek Mańdziuk et al.

This paper presents algorithmic and empirical contributions demonstrating that the convergence characteristics of a co-evolutionary approach to tackle Multi-Objective Games (MOGs) with postponed preference articulation can often be hampered due to the possible emergence of the so-called Red Queen effect. Accordingly, it is hypothesized that the convergence characteristics can be significantly improved through the incorporation of memetics (local solution refinements as a form of lifelong learning), as a promising means of mitigating (or at least suppressing) the Red Queen phenomenon by providing a guiding hand to the purely genetic mechanisms of co-evolution. Our practical motivation is to address MOGs of a time-sensitive nature that are characterized by computationally expensive evaluations, wherein there is a natural need to reduce the total number of true function evaluations consumed in achieving good quality solutions. To this end, we propose novel enhancements to co-evolutionary approaches for tackling MOGs, such that memetic local refinements can be efficiently applied on evolved candidate strategies by searching on computationally cheap surrogate payoff landscapes (that preserve postponed preference conditions). The efficacy of the proposal is demonstrated on a suite of test MOGs that have been designed.