Michal Kawulok

CV
h-index: 27
19 papers
436 citations
Novelty: 40%
AI Score: 33

19 Papers

IV Oct 6, 2022
MuS2: A Real-World Benchmark for Sentinel-2 Multi-Image Super-Resolution

Pawel Kowaleczko, Tomasz Tarasiewicz, Maciej Ziaja et al.

Insufficient image spatial resolution is a serious limitation in many practical scenarios, especially when acquiring images at a finer scale is infeasible or brings higher costs. This is inherent to remote sensing, including Sentinel-2 satellite images that are available free of charge at a high revisit frequency, but whose spatial resolution is limited to 10 m ground sampling distance. The resolution can be increased with super-resolution algorithms, in particular when performed from multiple images captured at subsequent revisits of a satellite, taking advantage of information fusion that leads to enhanced reconstruction accuracy. One of the obstacles in multi-image super-resolution is the scarcity of real-world benchmarks: commonly, simulated data are exploited, which do not fully reflect the operating conditions. In this paper, we introduce a new MuS2 benchmark for super-resolving multiple Sentinel-2 images, with WorldView-2 imagery used as the high-resolution reference. Within MuS2, we publish the first end-to-end evaluation procedure for this problem, which we expect to help researchers advance the state of the art in multi-image super-resolution.

CV Jan 26, 2023
Multitemporal and multispectral data fusion for super-resolution of Sentinel-2 images

Tomasz Tarasiewicz, Jakub Nalepa, Reuben A. Farrugia et al.

Multispectral Sentinel-2 images are a valuable source of Earth observation data; however, the spatial resolution of their spectral bands, limited to 10 m, 20 m, and 60 m ground sampling distance, remains insufficient in many cases. This problem can be addressed with super-resolution, aimed at reconstructing a high-resolution image from a low-resolution observation. For Sentinel-2, spectral information fusion allows for enhancing the 20 m and 60 m bands to the 10 m resolution. There have also been attempts to combine multitemporal stacks of individual Sentinel-2 bands; however, these two approaches have not been combined so far. In this paper, we introduce DeepSent -- a new deep network for super-resolving multitemporal series of multispectral Sentinel-2 images. It is underpinned with information fusion performed simultaneously in the spectral and temporal dimensions to generate an enlarged multispectral image. In our extensive experimental study, we demonstrate that our solution outperforms other state-of-the-art techniques that realize either multitemporal or multispectral data fusion. Furthermore, we show that the advantage of DeepSent results from how these two fusion types are combined in a single architecture, which is superior to performing such fusion in a sequential manner. Importantly, we have applied our method to super-resolve real-world Sentinel-2 images, enhancing the spatial resolution of all the spectral bands to 3.3 m nominal ground sampling distance, and we compare the outcome with very high-resolution WorldView-2 images. We will publish our implementation upon paper acceptance, and we expect it will increase the possibilities of exploiting super-resolved Sentinel-2 images in real-life applications.

CV Aug 3, 2022
A Multibranch Convolutional Neural Network for Hyperspectral Unmixing

Lukasz Tulczyjew, Michal Kawulok, Nicolas Longépé et al.

Hyperspectral unmixing remains one of the most challenging tasks in hyperspectral data analysis. Deep learning has been blooming in the field: it has been shown to outperform classic unmixing techniques, and it can be effectively deployed onboard Earth observation satellites equipped with hyperspectral imagers. In this letter, we follow this research pathway and propose a multi-branch convolutional neural network that benefits from fusing spectral, spatial, and spectral-spatial features in the unmixing process. The results of our experiments, backed up with the ablation study, revealed that our techniques outperform others from the literature and lead to higher-quality fractional abundance estimation. Also, we investigated the influence of reducing the training sets on the capabilities of all algorithms and their robustness against noise, as capturing large and representative ground-truth sets is time-consuming and costly in practice, especially in emerging Earth observation scenarios.

CV Aug 3, 2022
Graph Neural Networks Extract High-Resolution Cultivated Land Maps from Sentinel-2 Image Series

Lukasz Tulczyjew, Michal Kawulok, Nicolas Longépé et al.

Maintaining farm sustainability through optimizing agricultural management practices helps build a more planet-friendly environment. The emerging satellite missions can acquire multi- and hyperspectral imagery which captures more detailed spectral information concerning the scanned area, and hence allows us to benefit from subtle spectral features during the analysis process in agricultural applications. We introduce an approach for extracting 2.5 m cultivated land maps from 10 m Sentinel-2 multispectral image series which benefits from a compact graph convolutional neural network. The experiments indicate that our models not only outperform classical and deep machine learning techniques through delivering higher-quality segmentation maps, but also dramatically reduce the memory footprint when compared to U-Nets (almost 8k trainable parameters in our models versus up to 31M parameters in U-Nets). Such memory frugality is pivotal in the missions which allow us to uplink a model to the AI-powered satellite once it is in orbit, as sending large nets is impossible due to the time constraints.

CV Jun 16, 2023
Squeezing nnU-Nets with Knowledge Distillation for On-Board Cloud Detection

Bartosz Grabowski, Maciej Ziaja, Michal Kawulok et al.

Cloud detection is a pivotal satellite image pre-processing step that can be performed both on the ground and on board a satellite to tag useful images. In the latter case, it can reduce the amount of data to downlink by pruning the cloudy areas, or make a satellite more autonomous through data-driven acquisition re-scheduling. We approach this task with nnU-Nets, a self-reconfigurable framework able to perform meta-learning of a segmentation network over various datasets. Unfortunately, such models are commonly memory-inefficient due to their (very) large architectures. To benefit from them in on-board processing, we compress nnU-Nets with knowledge distillation into much smaller, more compact U-Nets. Our experiments, performed over Sentinel-2 and Landsat-8 images, revealed that nnU-Nets deliver state-of-the-art performance without any manual design. Our approach was ranked within the top 7% of solutions (across 847 teams) in the On Cloud N: Cloud Cover Detection Challenge, where we reached the Jaccard index of 0.882 over more than 10k unseen Sentinel-2 images (the winners obtained 0.897, the baseline U-Net with the ResNet-34 backbone 0.817, and the classic Sentinel-2 image thresholding 0.652). Finally, we showed that knowledge distillation makes it possible to obtain dramatically smaller (almost 280x) U-Nets when compared to nnU-Nets while still maintaining their segmentation capabilities.
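
The abstract does not say which distillation objective was used. As a rough illustration only, the sketch below shows the classic soft-target formulation of knowledge distillation, in which a small student is trained to match the temperature-softened outputs of a large teacher. The function names, the temperature value, and the toy logits are assumptions made for this example, not details from the paper.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between softened teacher and student distributions.

    The temperature**2 factor is the usual rescaling that keeps gradient
    magnitudes comparable across temperatures (an assumed convention here).
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(pt * math.log(pt / ps) for pt, ps in zip(p_teacher, p_student))
    return temperature ** 2 * kl

# Hypothetical per-pixel logits for the classes (clear, cloud).
teacher = [0.2, 3.1]
student = [0.5, 1.9]
loss = distillation_loss(student, teacher)
```

In a segmentation setting this loss would be averaged over all pixels and typically combined with the ordinary supervised loss on the ground-truth masks.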

CV Oct 24, 2022
Self-Configuring nnU-Nets Detect Clouds in Satellite Images

Bartosz Grabowski, Maciej Ziaja, Michal Kawulok et al.

Cloud detection is a pivotal satellite image pre-processing step that can be performed both on the ground and on board a satellite to tag useful images. In the latter case, it can help to reduce the amount of data to downlink by pruning the cloudy areas, or to make a satellite more autonomous through data-driven acquisition re-scheduling of the cloudy areas. We approach this important task with nnU-Nets, a self-reconfigurable framework able to perform meta-learning of a segmentation network over various datasets. Our experiments, performed over Sentinel-2 and Landsat-8 multispectral images, revealed that nnU-Nets deliver state-of-the-art cloud segmentation performance without any manual design. Our approach was ranked within the top 7% of solutions (across 847 participating teams) in the On Cloud N: Cloud Cover Detection Challenge, where we reached the Jaccard index of 0.882 over more than 10k unseen Sentinel-2 image patches (the winners obtained 0.897, whereas the baseline U-Net with the ResNet-34 backbone used as an encoder reached 0.817, and the classic Sentinel-2 image thresholding 0.652).
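
The scores quoted above are Jaccard indices, that is, the intersection over union of the predicted and reference cloud masks. A minimal sketch of that metric follows; the nested-list mask representation, the toy masks, and the convention for two empty masks are assumptions made for this example.

```python
def jaccard_index(predicted, reference):
    """Intersection over union of two binary masks (nested lists of 0/1)."""
    intersection = 0
    union = 0
    for predicted_row, reference_row in zip(predicted, reference):
        for p, r in zip(predicted_row, reference_row):
            if p and r:
                intersection += 1
            if p or r:
                union += 1
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if union == 0 else intersection / union

# Toy 2x3 masks: 1 marks a cloudy pixel.
predicted = [[1, 1, 0],
             [0, 1, 0]]
reference = [[1, 0, 0],
             [0, 1, 1]]
score = jaccard_index(predicted, reference)  # 2 shared pixels out of 4 marked
```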

CV Jul 12, 2024
Task-driven single-image super-resolution reconstruction of document scans

Maciej Zyrek, Michal Kawulok

Super-resolution reconstruction is aimed at generating images of high spatial resolution from low-resolution observations. State-of-the-art super-resolution techniques underpinned with deep learning allow for obtaining results of outstanding visual quality, but it is seldom verified whether they constitute a valuable source for specific computer vision applications. In this paper, we investigate the possibility of employing super-resolution as a preprocessing step to improve optical character recognition from document scans. To achieve that, we propose to train deep networks for single-image super-resolution in a task-driven way to make them better adapted for the purpose of text detection. As problems limited to a specific task are heavily ill-posed, we introduce a multi-task loss function that embraces components related to text detection coupled with those guided by image similarity. The results reported in this paper are encouraging and they constitute an important step towards real-world super-resolution of document images.

CV Mar 19, 2025
Toward task-driven satellite image super-resolution

Maciej Ziaja, Pawel Kowaleczko, Daniel Kostrzewa et al.

Super-resolution is aimed at reconstructing high-resolution images from low-resolution observations. State-of-the-art approaches underpinned with deep learning allow for obtaining outstanding results, generating images of high perceptual quality. However, it often remains unclear whether the reconstructed details are close to the actual ground-truth information and whether they constitute a more valuable source for image analysis algorithms. In the reported work, we address the latter problem, and we present our efforts toward learning super-resolution algorithms in a task-driven way to make them suitable for generating high-resolution images that can be exploited for automated image analysis. In the reported initial research, we propose a methodological approach for assessing the existing models that perform computer vision tasks in terms of whether they can be used for evaluating super-resolution reconstruction algorithms, as well as training them in a task-driven way. We support our analysis with an experimental study and we expect it to establish a solid foundation for selecting appropriate computer vision tasks that will advance the capabilities of real-world super-resolution.

CV Jun 8, 2025
Task-driven real-world super-resolution of document scans

Maciej Zyrek, Tomasz Tarasiewicz, Jakub Sadel et al.

Single-image super-resolution refers to the reconstruction of a high-resolution image from a single low-resolution observation. Although recent deep learning-based methods have demonstrated notable success on simulated datasets -- with low-resolution images obtained by degrading and downsampling high-resolution ones -- they frequently fail to generalize to real-world settings, such as document scans, which are affected by complex degradations and semantic variability. In this study, we introduce a task-driven, multi-task learning framework for training a super-resolution network specifically optimized for optical character recognition tasks. We propose to incorporate auxiliary loss functions derived from high-level vision tasks, including text detection using the connectionist text proposal network, text recognition via a convolutional recurrent neural network, keypoint localization using Key.Net, and hue consistency. To balance these diverse objectives, we employ a dynamic weight averaging mechanism, which adaptively adjusts the relative importance of each loss term based on its convergence behavior. We validate our approach on the SRResNet architecture, which is a well-established technique for single-image super-resolution. Experimental evaluations on both simulated and real-world scanned document datasets demonstrate that the proposed approach improves text detection, measured with intersection over union, while preserving overall image fidelity. These findings underscore the value of multi-objective optimization in super-resolution models for bridging the gap between simulated training regimes and practical deployment in real-world scenarios.
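
The abstract does not spell out the weighting rule. As an illustration, the sketch below follows the commonly used formulation of dynamic weight averaging, in which each loss term is weighted by a softmax over its recent rate of decrease, so that terms that have stalled receive more weight. The function name, the temperature value, the task labels, and the loss values are all assumptions made for this example.

```python
import math

def dwa_weights(previous_losses, earlier_losses, temperature=2.0):
    """Per-task weights from the losses of the two preceding epochs.

    previous_losses: average loss of each task at epoch t-1
    earlier_losses:  average loss of each task at epoch t-2
    """
    n_tasks = len(previous_losses)
    # Relative descending rate: near 1 means the loss has stalled,
    # well below 1 means it is still dropping quickly.
    rates = [prev / earlier for prev, earlier in zip(previous_losses, earlier_losses)]
    exps = [math.exp(rate / temperature) for rate in rates]
    total = sum(exps)
    # Softmax over the rates, rescaled so the weights sum to the number of tasks.
    return [n_tasks * e / total for e in exps]

# Hypothetical loss terms, loosely named after those listed in the abstract.
task_names = ["text_detection", "text_recognition", "keypoints", "hue"]
losses_epoch_1 = [1.00, 1.00, 1.00, 1.00]
losses_epoch_2 = [0.90, 0.50, 0.80, 0.95]
weights = dwa_weights(losses_epoch_2, losses_epoch_1)
```

With these made-up numbers the hue term, whose loss barely moved, gets the largest weight, and the text recognition term, whose loss halved, gets the smallest. During the first two epochs no history exists, so all weights are usually set to 1.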

CV Mar 20, 2025
Coupling deep and handcrafted features to assess smile genuineness

Benedykt Pawlus, Bogdan Smolka, Jolanta Kawulok et al.

Assessing smile genuineness from video sequences is a vital topic concerned with recognizing facial expressions and linking them with the underlying emotional states. A number of techniques underpinned with handcrafted features have been proposed, as well as those that rely on deep learning to learn useful features. As both of these approaches have certain benefits and limitations, in this work we propose to combine the features learned by a long short-term memory network with the features handcrafted to capture the dynamics of facial action units. The results of our experiments indicate that the proposed solution is more effective than the baseline techniques and it allows for assessing the smile genuineness from video sequences in real time.

CV Jul 27, 2024
Ensembling convolutional neural networks for human skin segmentation

Patryk Kuban, Michal Kawulok

Detecting and segmenting human skin regions in digital images is an intensively explored topic of computer vision with a variety of approaches proposed over the years that have been found useful in numerous practical applications. The first methods were based on pixel-wise skin color modeling and they were later enhanced with context-based analysis to include the textural and geometrical features, recently extracted using deep convolutional neural networks. It has also been demonstrated that skin regions can be segmented from grayscale images without using color information at all. However, the possibility to combine these two sources of information has not been explored so far and we address this research gap with the contribution reported in this paper. We propose to train a convolutional network using the datasets focused on different features to create an ensemble whose individual outcomes are effectively combined using yet another convolutional network trained to produce the final segmentation map. The experimental results clearly indicate that the proposed approach outperforms the basic classifiers, as well as an ensemble based on the voting scheme. We expect that this study will help in developing new ensemble-based techniques that will improve the performance of semantic segmentation systems, reaching beyond the problem of detecting human skin.

CV Jul 27, 2019
Segmenting Hyperspectral Images Using Spectral-Spatial Convolutional Neural Networks With Training-Time Data Augmentation

Jakub Nalepa, Lukasz Tulczyjew, Michal Myller et al.

Hyperspectral imaging provides detailed information about the scanned objects, as it captures their spectral characteristics within a large number of wavelength bands. Classification of such data has become an active research topic due to its wide applicability in a variety of fields. Deep learning has established the state of the art in the area, and it constitutes the current research mainstream. In this letter, we introduce a new spectral-spatial convolutional neural network, benefitting from a battery of data augmentation techniques which help deal with a real-life problem of lacking ground-truth training data. Our rigorous experiments showed that the proposed method outperforms other spectral-spatial techniques from the literature, and delivers precise hyperspectral classification in real time.

IV Jul 18, 2019
Fully-automated deep learning-powered system for DCE-MRI analysis of brain tumors

Jakub Nalepa, Pablo Ribalta Lorenzo, Michal Marcinkiewicz et al.

Dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) plays an important role in diagnosis and grading of brain tumors. Although manual DCE biomarker extraction algorithms boost the diagnostic yield of DCE-MRI by providing quantitative information on tumor prognosis and prediction, they are time-consuming and prone to human error. In this paper, we propose a fully-automated, end-to-end system for DCE-MRI analysis of brain tumors. Our deep learning-powered technique does not require any user interaction, it yields reproducible results, and it is rigorously validated against benchmark (BraTS'17 for tumor segmentation, and a test dataset released by the Quantitative Imaging Biomarkers Alliance for the contrast-concentration fitting) and clinical (44 low-grade glioma patients) data. Also, we introduce a cubic model of the vascular input function used for pharmacokinetic modeling which significantly decreases the fitting error when compared with the state of the art, alongside a real-time algorithm for determination of the vascular input region. An extensive experimental study, backed up with statistical tests, showed that our system delivers state-of-the-art results (in terms of segmentation accuracy and contrast-concentration fitting) while requiring less than 3 minutes to process an entire input DCE-MRI study using a single GPU.

CV Jun 23, 2019
Transfer Learning for Segmenting Dimensionally-Reduced Hyperspectral Images

Jakub Nalepa, Michal Myller, Michal Kawulok

Deep learning has established the state of the art in multiple fields, including hyperspectral image analysis. However, training large-capacity learners to segment such imagery requires representative training sets. Acquiring such data is human-dependent and time-consuming, especially in Earth observation scenarios, where the hyperspectral data transfer is very costly and time-constrained. In this letter, we show how to effectively deal with a limited number and size of available hyperspectral ground-truth sets, and apply transfer learning for building deep feature extractors. Also, we exploit spectral dimensionality reduction to make our technique applicable over hyperspectral data acquired using different sensors, which may capture different numbers of hyperspectral bands. The experiments, performed over several benchmarks and backed up with statistical tests, indicated that our approach allows us to effectively train well-generalizing deep convolutional neural nets even using significantly reduced data.

CV Jun 16, 2019
On training deep networks for satellite image super-resolution

Michal Kawulok, Szymon Piechaczek, Krzysztof Hrynczenko et al.

The capabilities of super-resolution reconstruction (SRR), a family of techniques for enhancing image spatial resolution, have been recently improved significantly by the use of deep convolutional neural networks. Commonly, such networks are learned using huge training sets composed of original images alongside their low-resolution counterparts, obtained with bicubic downsampling. In this paper, we investigate how the SRR performance is influenced by the way such low-resolution training data are obtained, which has not been explored to date. Our extensive experimental study indicates that the training data characteristics have a large impact on the reconstruction accuracy, and the widely-adopted approach is not the most effective for dealing with satellite images. Overall, we argue that developing better training data preparation routines may be pivotal in making SRR suitable for real-world applications.

CV Mar 13, 2019
Hyperspectral Data Augmentation

Jakub Nalepa, Michal Myller, Michal Kawulok

Data augmentation is a popular technique which helps improve generalization capabilities of deep neural networks. It plays a pivotal role in remote-sensing scenarios in which the amount of high-quality ground-truth data is limited, and acquiring new examples is costly or impossible. This is a common problem in hyperspectral imaging, where manual annotation of image data is difficult, expensive, and prone to human bias. In this letter, we propose online data augmentation of hyperspectral data which is executed during the inference rather than before the training of deep networks. This is in contrast to all other state-of-the-art hyperspectral augmentation algorithms which increase the size (and representativeness) of training sets. Additionally, we introduce a new principal component analysis-based augmentation. The experiments revealed that our data augmentation algorithms improve generalization of deep networks, work in real time, and the online approach can be effectively combined with offline techniques to enhance the classification accuracy.
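
To make the idea of augmentation at inference time concrete, the sketch below classifies a pixel's spectrum together with several perturbed copies and returns the majority label. The abstract does not state how the augmented predictions are aggregated or which perturbations are used, so the voting rule, the Gaussian noise, the function names, and the toy classifier are all assumptions made for this example.

```python
import random
from collections import Counter

def predict_with_online_augmentation(classify, spectrum, n_augmented=8,
                                     noise_std=0.01, seed=0):
    """Classify a spectrum and its noisy copies, then take a majority vote."""
    rng = random.Random(seed)
    candidates = [list(spectrum)]  # always include the original sample
    for _ in range(n_augmented):
        # Assumed perturbation: small additive Gaussian noise per band.
        candidates.append([value + rng.gauss(0.0, noise_std) for value in spectrum])
    votes = Counter(classify(candidate) for candidate in candidates)
    return votes.most_common(1)[0][0]

# Toy stand-in for a trained network: thresholds the mean reflectance.
def toy_classifier(spectrum):
    return 1 if sum(spectrum) / len(spectrum) > 0.5 else 0

label = predict_with_online_augmentation(toy_classifier, [0.90, 0.80, 0.95])
```

Because the training set is left untouched, a scheme like this can be layered on top of any already trained model, which is consistent with the abstract's remark that online and offline augmentation can be combined.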

CV Mar 1, 2019
Deep Learning for Multiple-Image Super-Resolution

Michal Kawulok, Pawel Benecki, Szymon Piechaczek et al.

Super-resolution reconstruction (SRR) is a process aimed at enhancing spatial resolution of images, either from a single observation, based on the learned relation between low and high resolution, or from multiple images presenting the same scene. SRR is particularly valuable if it is infeasible to acquire images at the desired resolution, but many images of the same scene are available at lower resolution; this is inherent to a variety of remote sensing scenarios. Recently, we have witnessed substantial improvement in single-image SRR attributed to the use of deep neural networks for learning the relation between low and high resolution. Importantly, deep learning has not been exploited for multiple-image SRR, which benefits from information fusion and in general allows for achieving higher reconstruction accuracy. In this letter, we introduce a new method which combines the advantages of multiple-image fusion with learning the low-to-high resolution mapping using deep networks. The reported experimental results indicate that our algorithm outperforms the state-of-the-art SRR methods, including those that operate from a single image, as well as those that perform multiple-image fusion.

CV Nov 8, 2018
Validating Hyperspectral Image Segmentation

Jakub Nalepa, Michal Myller, Michal Kawulok

Hyperspectral satellite imaging attracts enormous research attention in the remote sensing community, hence automated approaches for precise segmentation of such imagery are being rapidly developed. In this letter, we share our observations on the strategy for validating hyperspectral image segmentation algorithms currently followed in the literature, and show that it can lead to over-optimistic experimental insights. We introduce a new routine for generating segmentation benchmarks, and use it to elaborate ready-to-use hyperspectral training-test data partitions. They can be utilized for fair validation of new and existing algorithms without any training-test data leakage.

CV Feb 11, 2014
Real-Time Hand Shape Classification

Jakub Nalepa, Michal Kawulok

The problem of hand shape classification is challenging since a hand is characterized by a large number of degrees of freedom. Numerous shape descriptors have been proposed and applied over the years to estimate and classify hand poses in reasonable time. In this paper we discuss our parallel framework for real-time hand shape classification. We show how the number of gallery images influences the classification accuracy and execution time of the parallel algorithm. We present the speedup and efficiency analyses that prove the efficacy of the parallel implementation. Notably, different methods can be used at each step of our parallel framework. Here, we combine the shape contexts with the appearance-based techniques to enhance the robustness of the algorithm and to increase the classification score. An extensive experimental study proves the superiority of the proposed approach over existing state-of-the-art methods.
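
The speedup and efficiency analyses mentioned above rely on two standard quantities: speedup is the serial run time divided by the parallel run time, and efficiency is the speedup divided by the number of workers. A minimal sketch follows; the timings and worker count are made-up values, not results from the paper.

```python
def speedup(serial_time, parallel_time):
    """How many times faster the parallel run is than the serial one."""
    return serial_time / parallel_time

def efficiency(serial_time, parallel_time, n_workers):
    """Fraction of ideal linear speedup achieved with n_workers."""
    return speedup(serial_time, parallel_time) / n_workers

# Hypothetical timings in seconds for classifying one batch of hand images.
serial_time = 8.0
parallel_time = 2.0
n_workers = 8

s = speedup(serial_time, parallel_time)                 # 4.0
e = efficiency(serial_time, parallel_time, n_workers)   # 0.5
```

An efficiency of 1.0 would mean perfect linear scaling; values below that reflect overheads such as synchronization and the serial parts of the algorithm.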