Szymon Łukasik

CV
h-index10
14papers
70citations
Novelty31%
AI Score39

14 Papers

CVSep 16, 2024
SoccerNet 2024 Challenges Results

Anthony Cioppa, Silvio Giancola, Vladimir Somers et al.

The SoccerNet 2024 challenges represent the fourth annual video understanding challenges organized by the SoccerNet team. These challenges aim to advance research across multiple themes in football, including broadcast video understanding, field understanding, and player understanding. This year, the challenges encompass four vision-based tasks. (1) Ball Action Spotting, focusing on precisely localizing when and which soccer actions related to the ball occur, (2) Dense Video Captioning, focusing on describing the broadcast with natural language and anchored timestamps, (3) Multi-View Foul Recognition, a novel task focusing on analyzing multiple viewpoints of a potential foul incident to classify whether a foul occurred and assess its severity, (4) Game State Reconstruction, another novel task focusing on reconstructing the game state from broadcast videos onto a 2D top-view map of the field. Detailed information about the tasks, challenges, and leaderboards can be found at https://www.soccer-net.org, with baselines and development kits available at https://github.com/SoccerNet.

LGDec 20, 2022
Graph Neural Networks in Computer Vision -- Architectures, Datasets and Common Approaches

Maciej Krzywda, Szymon Łukasik, Amir H. Gandomi

Graph Neural Networks (GNNs) are a family of graph networks inspired by mechanisms existing between nodes on a graph. In recent years there has been an increased interest in GNN and their derivatives, i.e., Graph Attention Networks (GAT), Graph Convolutional Networks (GCN), and Graph Recurrent Networks (GRN). An increase in their usability in computer vision is also observed. The number of GNN applications in this field continues to expand; it includes video analysis and understanding, action and behavior recognition, computational photography, image and video synthesis from zero or few shots, and many more. This contribution aims to collect papers published about GNN-based approaches towards computer vision. They are described and summarized from three perspectives. Firstly, we investigate the architectures of Graph Neural Networks and their derivatives used in this area to provide accurate and explainable recommendations for the ensuing investigations. As for the other aspect, we also present datasets used in these works. Finally, using graph analysis, we also examine relations between GNN-based studies in computer vision and potential sources of inspiration identified outside of this field.

IRAug 10, 2022
Identifying Substitute and Complementary Products for Assortment Optimization with Cleora Embeddings

Sergiy Tkachuk, Anna Wróblewska, Jacek Dąbrowski et al.

Recent years brought an increasing interest in the application of machine learning algorithms in e-commerce, omnichannel marketing, and the sales industry. It is not only to the algorithmic advances but also to data availability, representing transactions, users, and background product information. Finding products related in different ways, i.e., substitutes and complements is essential for users' recommendations at the vendor's site and for the vendor - to perform efficient assortment optimization. The paper introduces a novel method for finding products' substitutes and complements based on the graph embedding Cleora algorithm. We also provide its experimental evaluation with regards to the state-of-the-art Shopper algorithm, studying the relevance of recommendations with surveys from industry experts. It is concluded that the new approach presented here offers suitable choices of recommended products, requiring a minimal amount of additional information. The algorithm can be used in various enterprises, effectively identifying substitute and complementary product options.

CLMay 31, 2022
Multilingual Transformers for Product Matching -- Experiments and a New Benchmark in Polish

Michał Możdżonek, Anna Wróblewska, Sergiy Tkachuk et al.

Product matching corresponds to the task of matching identical products across different data sources. It typically employs available product features which, apart from being multimodal, i.e., comprised of various data types, might be non-homogeneous and incomplete. The paper shows that pre-trained, multilingual Transformer models, after fine-tuning, are suitable for solving the product matching problem using textual features both in English and Polish languages. We tested multilingual mBERT and XLM-RoBERTa models in English on Web Data Commons - training dataset and gold standard for large-scale product matching. The obtained results show that these models perform similarly to the latest solutions tested on this set, and in some cases, the results were even better. Additionally, we prepared a new dataset entirely in Polish and based on offers in selected categories obtained from several online stores for the research purpose. It is the first open dataset for product matching tasks in Polish, which allows comparing the effectiveness of the pre-trained models. Thus, we also showed the baseline results obtained by the fine-tuned mBERT and XLM-RoBERTa models on the Polish datasets.

CVSep 21, 2023
Survey of Action Recognition, Spotting and Spatio-Temporal Localization in Soccer -- Current Trends and Research Perspectives

Karolina Seweryn, Anna Wróblewska, Szymon Łukasik

Action scene understanding in soccer is a challenging task due to the complex and dynamic nature of the game, as well as the interactions between players. This article provides a comprehensive overview of this task divided into action recognition, spotting, and spatio-temporal action localization, with a particular emphasis on the modalities used and multimodal methods. We explore the publicly available data sources and metrics used to evaluate models' performance. The article reviews recent state-of-the-art methods that leverage deep learning techniques and traditional methods. We focus on multimodal methods, which integrate information from multiple sources, such as video and audio data, and also those that represent one source in various ways. The advantages and limitations of methods are discussed, along with their potential for improving the accuracy and robustness of models. Finally, the article highlights some of the open research questions and future directions in the field of soccer action recognition, including the potential for multimodal methods to advance this field. Overall, this survey provides a valuable resource for researchers interested in the field of action scene understanding in soccer.

CVFeb 4
ImmuVis: Hyperconvolutional Foundation Model for Imaging Mass Cytometry

Marcin Możejko, Dawid Uchal, Krzysztof Gogolewski et al.

We present ImmuVis, an efficient convolutional foundation model for imaging mass cytometry (IMC), a high-throughput multiplex imaging technology that handles molecular marker measurements as image channels and enables large-scale spatial tissue profiling. Unlike natural images, multiplex imaging lacks a fixed channel space, as real-world marker sets vary across studies, violating a core assumption of standard vision backbones. To address this, ImmuVis introduces marker-adaptive hyperconvolutions that generate convolutional kernels from learned marker embeddings, enabling a single model to operate on arbitrary measured marker subsets without retraining. We pretrain ImmuVis on the largest to-date dataset, IMC17M (28 cohorts, 24,405 images, 265 markers, over 17M patches), using self-supervised masked reconstruction. ImmuVis outperforms SOTA baselines and ablations in virtual staining and downstream classification tasks at substantially lower compute cost than transformer-based alternatives, and is the sole model that provides calibrated uncertainty via a heteroscedastic likelihood objective. These results position ImmuVis as a practical, efficient foundation model for real-world IMC modeling.

NESep 30, 2024
Cartesian Genetic Programming Approach for Designing Convolutional Neural Networks

Maciej Krzywda, Szymon Łukasik, Amir Gandomi H

The present study covers an approach to neural architecture search (NAS) using Cartesian genetic programming (CGP) for the design and optimization of Convolutional Neural Networks (CNNs). In designing artificial neural networks, one crucial aspect of the innovative approach is suggesting a novel neural architecture. Currently used architectures have mostly been developed manually by human experts, which is a time-consuming and error-prone process. In this work, we use pure Genetic Programming Approach to design CNNs, which employs only one genetic operation, i.e., mutation. In the course of preliminary experiments, our methodology yields promising results.

LGAug 7, 2024
Consumer Transactions Simulation through Generative Adversarial Networks

Sergiy Tkachuk, Szymon Łukasik, Anna Wróblewska

In the rapidly evolving domain of large-scale retail data systems, envisioning and simulating future consumer transactions has become a crucial area of interest. It offers significant potential to fortify demand forecasting and fine-tune inventory management. This paper presents an innovative application of Generative Adversarial Networks (GANs) to generate synthetic retail transaction data, specifically focusing on a novel system architecture that combines consumer behavior modeling with stock-keeping unit (SKU) availability constraints to address real-world assortment optimization challenges. We diverge from conventional methodologies by integrating SKU data into our GAN architecture and using more sophisticated embedding methods (e.g., hyper-graphs). This design choice enables our system to generate not only simulated consumer purchase behaviors but also reflects the dynamic interplay between consumer behavior and SKU availability -- an aspect often overlooked, among others, because of data scarcity in legacy retail simulation models. Our GAN model generates transactions under stock constraints, pioneering a resourceful experimental system with practical implications for real-world retail operation and strategy. Preliminary results demonstrate enhanced realism in simulated transactions measured by comparing generated items with real ones using methods employed earlier in related studies. This underscores the potential for more accurate predictive modeling.

CVJan 31, 2024
Improving Object Detection Quality in Football Through Super-Resolution Techniques

Karolina Seweryn, Gabriel Chęć, Szymon Łukasik et al.

This study explores the potential of super-resolution techniques in enhancing object detection accuracy in football. Given the sport's fast-paced nature and the critical importance of precise object (e.g. ball, player) tracking for both analysis and broadcasting, super-resolution could offer significant improvements. We investigate how advanced image processing through super-resolution impacts the accuracy and reliability of object detection algorithms in processing football match footage. Our methodology involved applying state-of-the-art super-resolution techniques to a diverse set of football match videos from SoccerNet, followed by object detection using Faster R-CNN. The performance of these algorithms, both with and without super-resolution enhancement, was rigorously evaluated in terms of detection accuracy. The results indicate a marked improvement in object detection accuracy when super-resolution preprocessing is applied. The improvement of object detection through the integration of super-resolution techniques yields significant benefits, especially for low-resolution scenarios, with a notable 12\% increase in mean Average Precision (mAP) at an IoU (Intersection over Union) range of 0.50:0.95 for 320x240 size images when increasing the resolution fourfold using RLFN. As the dimensions increase, the magnitude of improvement becomes more subdued; however, a discernible improvement in the quality of detection is consistently evident. Additionally, we discuss the implications of these findings for real-time sports analytics, player tracking, and the overall viewing experience. The study contributes to the growing field of sports technology by demonstrating the practical benefits and limitations of integrating super-resolution techniques in football analytics and broadcasting.

CYDec 13, 2023
Culturally Responsive Artificial Intelligence -- Problems, Challenges and Solutions

Natalia Ożegalska-Łukasik, Szymon Łukasik

In the contemporary interconnected world, the concept of cultural responsibility occupies paramount importance. As the lines between nations become less distinct, it is incumbent upon individuals, communities, and institutions to assume the responsibility of safeguarding and valuing the landscape of diverse cultures that constitute our global society. This paper explores the socio-cultural and ethical challenges stemming from the implementation of AI algorithms and highlights the necessity for their culturally responsive development. It also offers recommendations on essential elements required to enhance AI systems' adaptability to meet the demands of contemporary multicultural societies. The paper highlights the need for further multidisciplinary research to create AI models that effectively address these challenges. It also advocates the significance of AI enculturation and underlines the importance of regulatory measures to promote cultural responsibility in AI systems.

LGJan 10, 2025
Kolmogorov-Arnold networks for metal surface defect classification

Maciej Krzywda, Mariusz Wermiński, Szymon Łukasik et al.

This paper presents the application of Kolmogorov-Arnold Networks (KAN) in classifying metal surface defects. Specifically, steel surfaces are analyzed to detect defects such as cracks, inclusions, patches, pitted surfaces, and scratches. Drawing on the Kolmogorov-Arnold theorem, KAN provides a novel approach compared to conventional multilayer perceptrons (MLPs), facilitating more efficient function approximation by utilizing spline functions. The results show that KAN networks can achieve better accuracy than convolutional neural networks (CNNs) with fewer parameters, resulting in faster convergence and improved performance in image classification.

DBFeb 1, 2024
Text-Based Product Matching -- Semi-Supervised Clustering Approach

Alicja Martinek, Szymon Łukasik, Amir H. Gandomi

Matching identical products present in multiple product feeds constitutes a crucial element of many tasks of e-commerce, such as comparing product offerings, dynamic price optimization, and selecting the assortment personalized for the client. It corresponds to the well-known machine learning task of entity matching, with its own specificity, like omnipresent unstructured data or inaccurate and inconsistent product descriptions. This paper aims to present a new philosophy to product matching utilizing a semi-supervised clustering approach. We study the properties of this method by experimenting with the IDEC algorithm on the real-world dataset using predominantly textual features and fuzzy string matching, with more standard approaches as a point of reference. Encouraging results show that unsupervised matching, enriched with a small annotated sample of product links, could be a possible alternative to the dominant supervised strategy, requiring extensive manual data labeling.

CYSep 29, 2025
Artificial Authority: From Machine Minds to Political Alignments. An Experimental Analysis of Democratic and Autocratic Biases in Large-Language Models

Natalia Ożegalska-Łukasik, Szymon Łukasik

Political beliefs vary significantly across different countries, reflecting distinct historical, cultural, and institutional contexts. These ideologies, ranging from liberal democracies to rigid autocracies, influence human societies, as well as the digital systems that are constructed within those societies. The advent of generative artificial intelligence, particularly Large Language Models (LLMs), introduces new agents in the political space-agents trained on massive corpora that replicate and proliferate socio-political assumptions. This paper analyses whether LLMs display propensities consistent with democratic or autocratic world-views. We validate this insight through experimental tests in which we experiment with the leading LLMs developed across disparate political contexts, using several existing psychometric and political orientation measures. The analysis is based on both numerical scoring and qualitative analysis of the models' responses. Findings indicate high model-to-model variability and a strong association with the political culture of the country in which the model was developed. These findings highlight the need for more detailed examination of the socio-political dimensions embedded within AI systems.

CLJun 19, 2025
PL-Guard: Benchmarking Language Model Safety for Polish

Aleksandra Krasnodębska, Karolina Seweryn, Szymon Łukasik et al.

Despite increasing efforts to ensure the safety of large language models (LLMs), most existing safety assessments and moderation tools remain heavily biased toward English and other high-resource languages, leaving majority of global languages underexamined. To address this gap, we introduce a manually annotated benchmark dataset for language model safety classification in Polish. We also create adversarially perturbed variants of these samples designed to challenge model robustness. We conduct a series of experiments to evaluate LLM-based and classifier-based models of varying sizes and architectures. Specifically, we fine-tune three models: Llama-Guard-3-8B, a HerBERT-based classifier (a Polish BERT derivative), and PLLuM, a Polish-adapted Llama-8B model. We train these models using different combinations of annotated data and evaluate their performance, comparing it against publicly available guard models. Results demonstrate that the HerBERT-based classifier achieves the highest overall performance, particularly under adversarial conditions.