Krzysztof Krawiec

CV
h-index: 2
13 papers
48 citations
Novelty: 53%
AI Score: 53

13 Papers

42.2 CV May 26
Structure over Pixels: Learning Variable-Length Visual Programs

Piotr Wyrwiński, Kacper Dobek, Krzysztof Krawiec

Discrete visual tokenizers translate images into ordered sequences of codes, providing a natural representation for structural description of scenes. Yet existing adaptive tokenizers either require post-hoc search or select among a discrete set of pre-trained rates, rather than learning a continuous per-image sequence length coupled to the model and scene, and they typically train against pixel reconstruction, emphasizing texture rather than structure. We propose STROP, a discrete visual tokenizer architecture that forms structural scene representations and simultaneously learns how long an image's visual program should be. Using a four-phase curriculum supervised by local rate-distortion probes against frozen DINOv3 features, STROP optimizes a dedicated length head that estimates the active prefix length in a single forward pass. By bypassing pixel-level reconstruction gradients, the codebook is shaped entirely by the quality of higher-level latent representations. Program length grows with scene complexity, and signs of compositional structure emerge both in downstream dense-prediction transfer and in direct inspection of the learned code vocabulary.

AI Aug 12, 2023
Learning Abstract Visual Reasoning via Task Decomposition: A Case Study in Raven Progressive Matrices

Jakub Kwiatkowski, Krzysztof Krawiec

Learning to perform abstract reasoning often requires decomposing the task in question into intermediate subgoals that are not specified upfront, but need to be autonomously devised by the learner. In Raven Progressive Matrices (RPM), the task is to choose one of the available answers given a context, where both the context and answers are composite images featuring multiple objects in various spatial arrangements. As this high-level goal is the only guidance available, learning to solve RPMs is challenging. In this study, we propose a deep learning architecture based on the transformer blueprint which, rather than directly making the above choice, addresses the subgoal of predicting the visual properties of individual objects and their arrangements. The multidimensional predictions obtained in this way are then directly juxtaposed to choose the answer. We consider a few ways in which the model parses the visual input into tokens and several regimes of masking parts of the input in self-supervised training. In experimental assessment, the models not only outperform state-of-the-art methods but also provide interesting insights and partial explanations about the inference. The design of the method also makes it immune to biases that are known to be present in some RPM benchmarks.

AI Nov 17, 2017 Code
Learning to Play Othello with Deep Neural Networks

Paweł Liskowski, Wojciech Jaśkowski, Krzysztof Krawiec

AlphaGo's superhuman level of play corroborated the capabilities of convolutional neural architectures (CNNs) for capturing complex spatial patterns. This result was to a great extent due to several analogies between Go board states and the 2D images that CNNs were designed for, in particular translational invariance and a relatively large board. In this paper, we verify whether CNN-based move predictors prove effective for Othello, a game with significantly different characteristics, including a much smaller board size and complete lack of translational invariance. We compare several CNN architectures and board encodings, augment them with state-of-the-art extensions, train on an extensive database of experts' moves, and examine them with respect to move prediction accuracy and playing strength. The empirical evaluation confirms high capabilities of neural move predictors and suggests a strong correlation between prediction accuracy and playing strength. The best CNNs not only surpass all other 1-ply Othello players proposed to date but also defeat (2-ply) Edax, the best open-source Othello player.
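To make the notion of a board encoding concrete, here is a minimal sketch of one possible input representation for a CNN move predictor: two binary 8x8 planes, one for the discs of the player to move and one for the opponent's. The function names, the `1`/`-1`/`0` cell convention, and the board orientation are assumptions made for illustration; the abstract does not say which encodings the paper actually compares.

```python
def initial_board():
    """Return an 8x8 Othello start position (1 = black, -1 = white, 0 = empty).

    The orientation of the four central discs is an illustrative assumption.
    """
    board = [[0] * 8 for _ in range(8)]
    board[3][3], board[4][4] = -1, -1  # white discs on one diagonal
    board[3][4], board[4][3] = 1, 1    # black discs on the other diagonal
    return board


def encode_board(board, player):
    """Encode a board as two binary planes from the viewpoint of `player`.

    Returns (own, opp): own[r][c] is 1 where `player` has a disc,
    opp[r][c] is 1 where the opponent has a disc.
    """
    own = [[1 if cell == player else 0 for cell in row] for row in board]
    opp = [[1 if cell == -player else 0 for cell in row] for row in board]
    return own, opp


own_plane, opp_plane = encode_board(initial_board(), player=1)
```

Encoding from the viewpoint of the player to move lets a single network serve both colors, which is one common design choice for board-game move predictors.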

32.3 CV May 9
Improving Generative Adversarial Networks with Self-Distillation

Antoni Nowinowski, Krzysztof Krawiec

In modern GANs, maintaining an Exponential Moving Average (EMA) of the generator's weights is a standard practice, as such an averaged model consistently outperforms the actively trained generator. However, the EMA generator is used for final deployment only and does not influence the training process. To address this missed opportunity, we introduce Self-Distilled GAN (SD-GAN), which employs the EMA generator as a teacher to guide the active generator (student) via a perceptual loss. We prove the local asymptotic stability of SD-GAN in the Dirac-GAN setting and show that it dampens the parasitic cycling behavior that plagues conventional GANs. Empirical evaluations across established architectures and datasets demonstrate that SD-GAN improves the final image quality on several metrics (FID and random-FID in particular), stabilizes the optimization trajectory, and provides additional learning guidance that is not trivially correlated with the conventional adversarial loss. It also proves effective for fine-tuning pretrained GAN models.
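The EMA of generator weights mentioned above follows the standard update `ema = decay * ema + (1 - decay) * live`. The sketch below shows only that generic update on plain lists of floats; the function name, the `decay` value, and the toy weights are assumptions for illustration, and the distillation loss of SD-GAN itself is not reproduced here.

```python
def ema_update(ema_weights, live_weights, decay=0.999):
    """Blend the live weights into the running average once.

    Both arguments are equal-length lists of floats; a new list is returned.
    """
    return [decay * e + (1.0 - decay) * w
            for e, w in zip(ema_weights, live_weights)]


# Toy usage: a "generator" with two scalar weights that stay fixed,
# so the average converges towards them step by step.
ema = [0.0, 0.0]
for _ in range(3):
    live = [1.0, 2.0]  # placeholder for weights after an optimizer step
    ema = ema_update(ema, live, decay=0.5)
```

In practice the decay is usually close to 1, so the averaged model changes slowly and smooths out the oscillations of the actively trained generator.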

CV Sep 15, 2024
Disentangling Visual Priors: Unsupervised Learning of Scene Interpretations with Compositional Autoencoder

Krzysztof Krawiec, Antoni Nowinowski

Contemporary deep learning architectures lack principled means for capturing and handling fundamental visual concepts, like objects, shapes, geometric transforms, and other higher-level structures. We propose a neurosymbolic architecture that uses a domain-specific language to capture selected priors of image formation, including object shape, appearance, categorization, and geometric transforms. We express template programs in that language and learn their parameterization with features extracted from the scene by a convolutional neural network. When executed, the parameterized program produces geometric primitives, which are rendered and assessed for correspondence with the scene content; the model is trained via auto-association using gradient-based optimization. We compare our approach with a baseline method on a synthetic benchmark and demonstrate its capacity to disentangle selected aspects of the image formation process, learn from small data, infer correctly in the presence of noise, and generalize out of sample.

LG Feb 6, 2025
Learning Semantics-aware Search Operators for Genetic Programming

Piotr Wyrwiński, Krzysztof Krawiec

Fitness landscapes in test-based program synthesis are known to be extremely rugged, with even minimal modifications of programs often leading to fundamental changes in their behavior and, consequently, fitness values. Relying on fitness as the only guidance in iterative search algorithms like genetic programming is thus unnecessarily limiting, especially when combined with purely syntactic search operators that are agnostic about their impact on program behavior. In this study, we propose a semantics-aware search operator that steers the search towards candidate programs that are valuable not only actually (high fitness) but also potentially, i.e., likely to be turned into high-quality solutions even if their current fitness is low. The key component of the method is a graph neural network that learns to model the interactions between program instructions and processed data, and produces a saliency map over graph nodes that represents possible search decisions. When applied to a suite of symbolic regression benchmarks, the proposed method outperforms conventional tree-based genetic programming and the ablated variant of the method.

CV Nov 22, 2025
Modeling Retinal Ganglion Cells with Neural Differential Equations

Kacper Dobek, Daniel Jankowski, Krzysztof Krawiec

This work explores Liquid Time-Constant Networks (LTCs) and Closed-form Continuous-time Networks (CfCs) for modeling retinal ganglion cell activity in tiger salamanders across three datasets. Compared to a convolutional baseline and an LSTM, both architectures achieved lower MAE, faster convergence, smaller model sizes, and favorable query times, though with slightly lower Pearson correlation. Their efficiency and adaptability make them well suited for scenarios with limited data and frequent retraining, such as edge deployments in vision prosthetics.
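The abstract reports lower MAE alongside slightly lower Pearson correlation, which is possible because the two metrics measure different things: MAE penalizes absolute deviations, while Pearson correlation ignores constant offsets and scale. A minimal sketch with made-up numbers (the data and function names are illustrative assumptions, not values from the paper):

```python
import math


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson_correlation(y_true, y_pred):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


# Hypothetical firing rates: the prediction tracks the shape of the
# recording perfectly but is offset by a constant of 1.
recorded = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 3.0, 4.0, 5.0]

mae = mean_absolute_error(recorded, predicted)
corr = pearson_correlation(recorded, predicted)
```

Here the correlation is perfect while the MAE is nonzero, so a model can rank differently under the two metrics.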

LG Aug 29, 2025
Physics-Informed Spectral Modeling for Hyperspectral Imaging

Zuzanna Gawrysiak, Krzysztof Krawiec

We present PhISM, a physics-informed deep learning architecture that learns without supervision to explicitly disentangle hyperspectral observations and model them with continuous basis functions. PhISM outperforms prior methods on several classification and regression benchmarks, requires limited labeled data, and provides additional insights thanks to an interpretable latent representation.

CV Jun 9, 2025
Generative Learning of Differentiable Object Models for Compositional Interpretation of Complex Scenes

Antoni Nowinowski, Krzysztof Krawiec

This study builds on the architecture of the Disentangler of Visual Priors (DVP), a type of autoencoder that learns to interpret scenes by decomposing the perceived objects into independent visual aspects of shape, size, orientation, and color appearance. These aspects are expressed as latent parameters which control a differentiable renderer that performs image reconstruction, so that the model can be trained end-to-end with gradient-based optimization of a reconstruction loss. In this study, we extend the original DVP so that it can handle multiple objects in a scene. We also exploit the interpretability of its latent representation by using the decoder to sample additional training examples and devising alternative training modes that rely on loss functions defined not only in the image space, but also in the latent space. This significantly facilitates training, which is otherwise challenging due to the presence of extensive plateaus in the image-space reconstruction loss. To examine the performance of this approach, we propose a new benchmark featuring multiple 2D objects, which subsumes the previously proposed Multi-dSprites dataset while being more parameterizable. We compare the DVP extended in these ways with two baselines (MONet and LIVE) and demonstrate its superiority in terms of reconstruction quality and capacity to decompose overlapping objects. We also analyze the gradients induced by the considered loss functions, explain how they impact the efficacy of training, and discuss the limitations of differentiable rendering in autoencoders and the ways in which they can be addressed.

CV Nov 18, 2024
Autoassociative Learning of Structural Representations for Modeling and Classification in Medical Imaging

Zuzanna Buchnajzer, Kacper Dobek, Stanisław Hapke et al.

Deep learning architectures based on convolutional neural networks tend to rely on continuous, smooth features. While this characteristic provides significant robustness and proves useful in many real-world tasks, it is strikingly incompatible with the physical characteristics of the world, which, at the scale at which humans operate, comprises crisp objects, typically representing well-defined categories. This study proposes a class of neurosymbolic systems that learn by reconstructing images in terms of visual primitives and are thus forced to form high-level, structural explanations of them. When applied to the task of diagnosing abnormalities in histological imaging, the method proved superior to a conventional deep learning architecture in terms of classification accuracy, while being more transparent.

NE Nov 3, 2024
Guiding Genetic Programming with Graph Neural Networks

Piotr Wyrwiński, Krzysztof Krawiec

In evolutionary computation, it is commonly assumed that a search algorithm acquires knowledge about a problem instance by sampling solutions from the search space and evaluating them with a fitness function. This is necessarily inefficient because fitness reveals very little about solutions, yet solutions contain more information that could potentially be exploited. To address this observation in genetic programming, we propose EvoNUDGE, which uses a graph neural network to elicit additional knowledge from symbolic regression problems. The network is queried on the problem before an evolutionary run to produce a library of subprograms, which is subsequently used to seed the initial population and bias the actions of search operators. In an extensive experiment on a large number of problem instances, EvoNUDGE is shown to significantly outperform multiple baselines, including the conventional tree-based genetic programming and the purely neural variant of the method.

LG Oct 23, 2018
Ain't Nobody Got Time For Coding: Structure-Aware Program Synthesis From Natural Language

Jakub Bednarek, Karol Piaskowski, Krzysztof Krawiec

Program synthesis from natural language (NL) is practical for humans and, once technically feasible, would significantly facilitate software development and revolutionize end-user programming. We present SAPS, an end-to-end neural network capable of mapping relatively complex, multi-sentence NL specifications to snippets of executable code. The proposed architecture relies exclusively on neural components, and is trained on abstract syntax trees, combined with a pretrained word embedding and a bi-directional multi-layer LSTM for processing of word sequences. The decoder features a doubly-recurrent LSTM, for which we propose novel signal propagation schemes and a soft attention mechanism. When applied to a large dataset of problems proposed in a previous study, SAPS performs on par with or better than the method proposed there, producing correct programs in over 92% of cases. In contrast to other methods, it does not require post-processing of the resulting programs, and uses a fixed-dimensional latent representation as the only interface between the NL analyzer and the source code generator.

AI Jun 4, 2016
Distance Metric Ensemble Learning and the Andrews-Curtis Conjecture

Krzysztof Krawiec, Jerry Swan

Motivated by the search for a counterexample to the Poincaré conjecture in three and four dimensions, the Andrews-Curtis conjecture was proposed in 1965. It is now generally suspected that the Andrews-Curtis conjecture is false, but small potential counterexamples are not so numerous, and previous work has attempted to eliminate some via combinatorial search. Progress has, however, been limited, with the most successful approach (breadth-first search using secondary storage) being neither scalable nor heuristically informed. A previous empirical analysis of problem structure examined several heuristic measures of search progress and determined that none of them provided any useful guidance for search. In this article, we induce new quality measures directly from the problem structure and combine them to produce a more effective search driver via ensemble machine learning. By this means, we eliminate 19 potential counterexamples, the status of which had been unknown for some years.