Michal Byra

CV
h-index29
17papers
72citations
Novelty41%
AI Score45

17 Papers

CVAug 8, 2023
Implicit neural representations for joint decomposition and registration of gene expression images in the marmoset brain

Michal Byra, Charissa Poon, Tomomi Shimogori et al.

We propose a novel image registration method based on implicit neural representations that addresses the challenging problem of registering a pair of brain images with similar anatomical structures, but where one image contains additional features or artifacts that are not present in the other image. To demonstrate its effectiveness, we use 2D microscopy $\textit{in situ}$ hybridization gene expression images of the marmoset brain. Accurately quantifying gene expression requires image registration to a brain template, which is difficult due to the diversity of patterns causing variations in visible anatomical brain structures. Our approach uses implicit networks in combination with an image exclusion loss to jointly perform the registration and decompose the image into a support and residual image. The support image aligns well with the template, while the residual image captures individual image characteristics that diverge from the template. In experiments, our method provided excellent results and outperformed other registration techniques.

CVAug 8, 2023
Few-shot medical image classification with simple shape and texture text descriptors using vision-language models

Michal Byra, Muhammad Febrian Rachmadi, Henrik Skibbe

In this work, we investigate the usefulness of vision-language models (VLMs) and large language models for binary few-shot classification of medical images. We utilize the GPT-4 model to generate text descriptors that encapsulate the shape and texture characteristics of objects in medical images. Subsequently, these GPT-4 generated descriptors, alongside VLMs pre-trained on natural images, are employed to classify chest X-rays and breast ultrasound images. Our results indicate that few-shot classification of medical images using VLMs and GPT-4 generated descriptors is a viable approach. However, accurate classification requires to exclude certain descriptors from the calculations of the classification scores. Moreover, we assess the ability of VLMs to evaluate shape features in breast mass ultrasound images. We further investigate the degree of variability among the sets of text descriptors produced by GPT-4. Our work provides several important insights about the application of VLMs for medical image analysis.

CVMar 13, 2023
An automated pipeline to create an atlas of in situ hybridization gene expression data in the adult marmoset brain

Charissa Poon, Muhammad Febrian Rachmadi, Michal Byra et al.

We present the first automated pipeline to create an atlas of in situ hybridization gene expression in the adult marmoset brain in the same stereotaxic space. The pipeline consists of segmentation of gene expression from microscopy images and registration of images to a standard space. Automation of this pipeline is necessary to analyze the large volume of data in the genome-wide whole-brain dataset, and to process images that have varying intensity profiles and expression patterns with minimal human bias. To reduce the number of labelled images required for training, we develop a semi-supervised segmentation model. We further develop an iterative algorithm to register images to a standard space, enabling comparative analysis between genes and concurrent visualization with other datasets, thereby facilitating a more holistic understanding of primate brain structure and function.

LGSep 21, 2024
Implicit Neural Representations for Speed-of-Sound Estimation in Ultrasound

Michal Byra, Piotr Jarosik, Piotr Karwat et al.

Accurate estimation of the speed-of-sound (SoS) is important for ultrasound (US) image reconstruction techniques and tissue characterization. Various approaches have been proposed to calculate SoS, ranging from tomography-inspired algorithms like CUTE to convolutional networks, and more recently, physics-informed optimization frameworks based on differentiable beamforming. In this work, we utilize implicit neural representations (INRs) for SoS estimation in US. INRs are a type of neural network architecture that encodes continuous functions, such as images or physical quantities, through the weights of a network. Implicit networks may overcome the current limitations of SoS estimation techniques, which mainly arise from the use of non-adaptable and oversimplified physical models of tissue. Moreover, convolutional networks for SoS estimation, usually trained using simulated data, often fail when applied to real tissues due to out-of-distribution and data-shift issues. In contrast, implicit networks do not require extensive training datasets since each implicit network is optimized for an individual data case. This adaptability makes them suitable for processing US data collected from varied tissues and across different imaging protocols. We evaluated the proposed SoS estimation method based on INRs using data collected from a tissue-mimicking phantom containing four cylindrical inclusions, with SoS values ranging from 1480 m/s to 1600 m/s. The inclusions were immersed in a material with an SoS value of 1540 m/s. In experiments, the proposed method achieved strong performance, clearly demonstrating the usefulness of implicit networks for quantitative US applications.

IVNov 3, 2022
Deep meta-learning for the selection of accurate ultrasound based breast mass classifier

Michal Byra, Piotr Karwat, Ivan Ryzhankow et al.

Standard classification methods based on handcrafted morphological and texture features have achieved good performance in breast mass differentiation in ultrasound (US). In comparison to deep neural networks, commonly perceived as "black-box" models, classical techniques are based on features that have well-understood medical and physical interpretation. However, classifiers based on morphological features commonly underperform in the presence of the shadowing artifact and ill-defined mass borders, while texture based classifiers may fail when the US image is too noisy. Therefore, in practice it would be beneficial to select the classification method based on the appearance of the particular US image. In this work, we develop a deep meta-network that can automatically process input breast mass US images and recommend whether to apply the shape or texture based classifier for the breast mass differentiation. Our preliminary results demonstrate that meta-learning techniques can be used to improve the performance of the standard classifiers based on handcrafted features. With the proposed meta-learning based approach, we achieved the area under the receiver operating characteristic curve of 0.95 and accuracy of 0.91.

MED-PHMay 19, 2022
Estimating the ultrasound attenuation coefficient using convolutional neural networks -- a feasibility study

Piotr Jarosik, Michal Byra, Marcin Lewandowski et al.

Attenuation coefficient (AC) is a fundamental measure of tissue acoustical properties, which can be used in medical diagnostics. In this work, we investigate the feasibility of using convolutional neural networks (CNNs) to directly estimate AC from radio-frequency (RF) ultrasound signals. To develop the CNNs we used RF signals collected from tissue mimicking numerical phantoms for the AC values in a range from 0.1 to 1.5 dB/(MHz*cm). The models were trained based on 1-D patches of RF data. We obtained mean absolute AC estimation errors of 0.08, 0.12, 0.20, 0.25 for the patch lengths: 10 mm, 5 mm, 2 mm and 1 mm, respectively. We explain the performance of the model by visualizing the frequency content associated with convolutional filters. Our study presents that the AC can be calculated using deep learning, and the weights of the CNNs can have physical interpretation.

LGOct 11, 2022
Parameter estimation of the homodyned K distribution based on neural networks and trainable fractional-order moments

Michal Byra, Ziemowit Klimonda, Piotr Jarosik

Homodyned K (HK) distribution has been widely used to describe the scattering phenomena arising in various research fields, such as ultrasound imaging or optics. In this work, we propose a machine learning based approach to the estimation of the HK distribution parameters. We develop neural networks that can estimate the HK distribution parameters based on the signal-to-noise ratio, skewness and kurtosis calculated using fractional-order moments. Compared to the previous approaches, we consider the orders of the moments as trainable variables that can be optimized along with the network weights using the back-propagation algorithm. Networks are trained based on samples generated from the HK distribution. Obtained results demonstrate that the proposed method can be used to accurately estimate the HK distribution parameters.

LGMay 17
t-gems: text-guided exit modules for decreasing clip image encoder

Alberto Presta, Grzegorz Stefanski, Michal Byra et al.

Multimodal deep neural networks enhance deep comprehension by integrating diverse data modalities. Data from different modalities are typically projected into a shared latent space for similarity computation, but this process is resource intensive due to large image encoders and equal processing of test data during prediction. Early exit methods reduce computational load by utilizing intermediate layers, saving time and memory. However, developing such methods is challenging for multimodal data like image-text pairs. This study investigates the semantic content distributions present in intermediate layers of encoders such as CLIP, which can be derived from textual descriptions. We introduce Text-Guided Exit Modules (T-GEMs) and a rate-based regularizer to control encoder usage costs while maintaining cross-modal understanding performance.

CVMay 11
bViT: Investigating Single-Block Recurrence in Vision Transformers for Image Recognition

Michal Byra, Pawel Olszowiec, Grzegorz Stefanski et al.

Vision Transformers (ViTs) are built by stacking independently parameterized blocks, but it remains unclear how much of this depth requires layer specific transformations and how much can be realized through recurrent computation. We study this question with bViT, a single-block recurrent ViT in which one transformer block is applied repeatedly to process an image. This architecture preserves the iterative structure of a deep ViT while removing layer specific block parameterization, providing a controlled setting for studying recurrence in vision. On ImageNet-1K, a 12-step bViT-B achieves accuracy comparable to standard ViT-B under the same training recipe and computational budget, while using an order of magnitude fewer parameters. We observe that recurrent performance improves with representation width, with wider bViTs recovering much more of the performance of standard ViTs than narrow variants. We interpret this behavior as implicit depth multiplexing, where a shared block expresses multiple step-dependent computations through the evolving hidden state. Beyond ImageNet classification, bViT transfers competitively to downstream tasks and enables parameter-efficient fine-tuning. Mechanistic analyses of activations, attention and step-specific pruning show that the shared block changes its effective behavior across recurrent steps rather than simply repeating the same computation. Our results suggest that a large fraction of ViT depth can be implemented through recurrent reuse, provided that the representation space is sufficiently wide.

AIJan 29
Routing the Lottery: Adaptive Subnetworks for Heterogeneous Data

Grzegorz Stefanski, Alberto Presta, Michal Byra

In pruning, the Lottery Ticket Hypothesis posits that large networks contain sparse subnetworks, or winning tickets, that can be trained in isolation to match the performance of their dense counterparts. However, most existing approaches assume a single universal winning ticket shared across all inputs, ignoring the inherent heterogeneity of real-world data. In this work, we propose Routing the Lottery (RTL), an adaptive pruning framework that discovers multiple specialized subnetworks, called adaptive tickets, each tailored to a class, semantic cluster, or environmental condition. Across diverse datasets and tasks, RTL consistently outperforms single- and multi-model baselines in balanced accuracy and recall, while using up to 10 times fewer parameters than independent models and exhibiting semantically aligned. Furthermore, we identify subnetwork collapse, a performance drop under aggressive pruning, and introduce a subnetwork similarity score that enables label-free diagnosis of oversparsification. Overall, our results recast pruning as a mechanism for aligning model structure with data heterogeneity, paving the way toward more modular and context-aware deep learning.

CVJan 20, 2025
Generating visual explanations from deep networks using implicit neural representations

Michal Byra, Henrik Skibbe

Explaining deep learning models in a way that humans can easily understand is essential for responsible artificial intelligence applications. Attribution methods constitute an important area of explainable deep learning. The attribution problem involves finding parts of the network's input that are the most responsible for the model's output. In this work, we demonstrate that implicit neural representations (INRs) constitute a good framework for generating visual explanations. Firstly, we utilize coordinate-based implicit networks to reformulate and extend the extremal perturbations technique and generate attribution masks. Experimental results confirm the usefulness of our method. For instance, by proper conditioning of the implicit network, we obtain attribution masks that are well-behaved with respect to the imposed area constraints. Secondly, we present an iterative INR-based method that can be used to generate multiple non-overlapping attribution masks for the same image. We depict that a deep learning model may associate the image label with both the appearance of the object of interest as well as with areas and textures usually accompanying the object. Our study demonstrates that implicit networks are well-suited for the generation of attribution masks and can provide interesting insights about the performance of deep learning models.

CVDec 12, 2023
PatchMorph: A Stochastic Deep Learning Approach for Unsupervised 3D Brain Image Registration with Small Patches

Henrik Skibbe, Michal Byra, Akiya Watakabe et al.

We introduce "PatchMorph," an new stochastic deep learning algorithm tailored for unsupervised 3D brain image registration. Unlike other methods, our method uses compact patches of a constant small size to derive solutions that can combine global transformations with local deformations. This approach minimizes the memory footprint of the GPU during training, but also enables us to operate on numerous amounts of randomly overlapping small patches during inference to mitigate image and patch boundary problems. PatchMorph adeptly handles world coordinate transformations between two input images, accommodating variances in attributes such as spacing, array sizes, and orientations. The spatial resolution of patches transitions from coarse to fine, addressing both global and local attributes essential for aligning the images. Each patch offers a unique perspective, together converging towards a comprehensive solution. Experiments on human T1 MRI brain images and marmoset brain images from serial 2-photon tomography affirm PatchMorph's superior performance.

IVSep 7, 2020
Adversarial attacks on deep learning models for fatty liver disease classification by modification of ultrasound image reconstruction method

Michal Byra, Grzegorz Styczynski, Cezary Szmigielski et al.

Convolutional neural networks (CNNs) have achieved remarkable success in medical image analysis tasks. In ultrasound (US) imaging, CNNs have been applied to object classification, image reconstruction and tissue characterization. However, CNNs can be vulnerable to adversarial attacks, even small perturbations applied to input data may significantly affect model performance and result in wrong output. In this work, we devise a novel adversarial attack, specific to ultrasound (US) imaging. US images are reconstructed based on radio-frequency signals. Since the appearance of US images depends on the applied image reconstruction method, we explore the possibility of fooling deep learning model by perturbing US B-mode image reconstruction method. We apply zeroth order optimization to find small perturbations of image reconstruction parameters, related to attenuation compensation and amplitude compression, which can result in wrong output. We illustrate our approach using a deep learning model developed for fatty liver disease diagnosis, where the proposed adversarial attack achieved success rate of 48%.

IVJan 27, 2020
Breast mass segmentation based on ultrasonic entropy maps and attention gated U-Net

Michal Byra, Piotr Jarosik, Katarzyna Dobruch-Sobczak et al.

We propose a novel deep learning based approach to breast mass segmentation in ultrasound (US) imaging. In comparison to commonly applied segmentation methods, which use US images, our approach is based on quantitative entropy parametric maps. To segment the breast masses we utilized an attention gated U-Net convolutional neural network. US images and entropy maps were generated based on raw US signals collected from 269 breast masses. The segmentation networks were developed separately using US image and entropy maps, and evaluated on a test set of 81 breast masses. The attention U-Net trained based on entropy maps achieved average Dice score of 0.60 (median 0.71), while for the model trained using US images we obtained average Dice score of 0.53 (median 0.59). Our work presents the feasibility of using quantitative US parametric maps for the breast mass segmentation. The obtained results suggest that US parametric maps, which provide the information about local tissue scattering properties, might be more suitable for the development of breast mass segmentation methods than regular US images.

IVAug 5, 2019
Knee menisci segmentation and relaxometry of 3D ultrashort echo time (UTE) cones MR imaging using attention U-Net with transfer learning

Michal Byra, Mei Wu, Xiaodong Zhang et al.

The purpose of this work is to develop a deep learning-based method for knee menisci segmentation in 3D ultrashort echo time (UTE) cones magnetic resonance (MR) imaging, and to automatically determine MR relaxation times, namely the T1, T1$ρ$, and T2* parameters, which can be used to assess knee osteoarthritis (OA). Whole knee joint imaging was performed using 3D UTE cones sequences to collect data from 61 human subjects. Regions of interest (ROIs) were outlined by two experienced radiologists based on subtracted T1$ρ$-weighted MR images. Transfer learning was applied to develop 2D attention U-Net convolutional neural networks for the menisci segmentation based on each radiologist's ROIs separately. Dice scores were calculated to assess segmentation performance. Next, the T1, T1$ρ$, T2* relaxations, and ROI areas were determined for the manual and automatic segmentations, then compared.The models developed using ROIs provided by two radiologists achieved high Dice scores of 0.860 and 0.833, while the radiologists' manual segmentations achieved a Dice score of 0.820. Linear correlation coefficients for the T1, T1$ρ$, and T2* relaxations calculated using the automatic and manual segmentations ranged between 0.90 and 0.97, and there were no associated differences between the estimated average meniscal relaxation parameters. The deep learning models achieved segmentation performance equivalent to the inter-observer variability of two radiologists. The proposed deep learning-based approach can be used to efficiently generate automatic segmentations and determine meniscal relaxations times. The method has the potential to help radiologists with the assessment of meniscal diseases, such as OA.

CVMay 27, 2019
Breast mass classification in ultrasound based on Kendall's shape manifold

Michal Byra, Michael Andre

Morphological features play an important role in breast mass classification in sonography. While benign breast masses tend to have a well-defined ellipsoidal contour, shape of malignant breast masses is commonly ill-defined and highly variable. Various handcrafted morphological features have been developed over the years to assess this phenomenon and help the radiologists differentiate benign and malignant masses. In this paper we propose an automatic approach to morphology analysis, we express shapes of breast masses as points on the Kendall's shape manifold. Next, we use the full Procrustes distance to develop support vector machine classifiers for breast mass differentiation. The usefulness of our method is demonstrated using a dataset of B-mode images collected from 163 breast masses. Our method achieved area under the receiver operating characteristic curve of 0.81. The proposed method can be used to assess shapes of breast masses in ultrasound without any feature engineering.

CVApr 6, 2018
Impact of ultrasound image reconstruction method on breast lesion classification with neural transfer learning

Michal Byra, Tomasz Sznajder, Danijel Korzinek et al.

Deep learning algorithms, especially convolutional neural networks, have become a methodology of choice in medical image analysis. However, recent studies in computer vision show that even a small modification of input image intensities may cause a deep learning model to classify the image differently. In medical imaging, the distribution of image intensities is related to applied image reconstruction algorithm. In this paper we investigate the impact of ultrasound image reconstruction method on breast lesion classification with neural transfer learning. Due to high dynamic range raw ultrasonic signals are commonly compressed in order to reconstruct B-mode images. Based on raw data acquired from breast lesions, we reconstruct B-mode images using different compression levels. Next, transfer learning is applied for classification. Differently reconstructed images are employed for training and evaluation. We show that the modification of the reconstruction algorithm leads to decrease of classification performance. As a remedy, we propose a method of data augmentation. We show that the augmentation of the training set with differently reconstructed B-mode images leads to a more robust and efficient classification. Our study suggests that it is important to take into account image reconstruction algorithms implemented in medical scanners during development of computer aided diagnosis systems.