Mykhailo Koshil

LG
h-index3
4papers
3citations
Novelty43%
AI Score41

4 Papers

LGMay 7Code
Is One Layer Enough? Understanding Inference Dynamics in Tabular Foundation Models

Amir Rezaei Balef, Mykhailo Koshil, Katharina Eggensperger

Transformer-based tabular foundation models (TFMs) dominate small to medium tabular predictive benchmark tasks, yet their inference mechanisms remain largely unexplored. We present the first large-scale mechanistic study of layerwise dynamics in 6 state-of-the-art tabular in-context learning models. We explore how predictions emerge across depth, identify distinct stages of inference and reveal latent-space dynamics that differ from those of language models. Our findings indicate substantial depthwise redundancy across multiple models, suggesting iterative refinement with overlapping computations during inference stages. Guided by these insights, we design a proof-of-concept, looped single-layer model that uses only 20% of the original model's parameters while achieving comparable performance. The code is available at https://github.com/amirbalef/is_one_layer_enough.

CVAug 27, 2024
AnomalousPatchCore: Exploring the Use of Anomalous Samples in Industrial Anomaly Detection

Mykhailo Koshil, Tilman Wegener, Detlef Mentrup et al.

Visual inspection, or industrial anomaly detection, is one of the most common quality control types in manufacturing. The task is to identify the presence of an anomaly given an image, e.g., a missing component on an image of a circuit board, for subsequent manual inspection. While industrial anomaly detection has seen a surge in recent years, most anomaly detection methods still utilize knowledge only from normal samples, failing to leverage the information from the frequently available anomalous samples. Additionally, they heavily rely on very general feature extractors pre-trained on common image classification datasets. In this paper, we address these shortcomings and propose the new anomaly detection system AnomalousPatchCore~(APC) based on a feature extractor fine-tuned with normal and anomalous in-domain samples and a subsequent memory bank for identifying unusual features. To fine-tune the feature extractor in APC, we propose three auxiliary tasks that address the different aspects of anomaly detection~(classification vs. localization) and mitigate the effect of the imbalance between normal and anomalous samples. Our extensive evaluation on the MVTec dataset shows that APC outperforms state-of-the-art systems in detecting anomalies, which is especially important in industrial anomaly detection given the subsequent manual inspection. In detailed ablation studies, we further investigate the properties of our APC.

ROOct 18, 2022
Trixi the Librarian

Fabian Wieczorek, Shang-Ching Liu, Björn Sygo et al.

In this work, we present a three-part system that automatically sorts books on a shelf using the PR- 2 platform. The paper describes a methodology to sufficiently detect and recognize books using a multistep vision pipeline based on deep learning models as well as conventional computer vision. Furthermore, the difficulties of relocating books using a bi-manual robot along with solutions based on MoveIt and BioIK are being addressed. Experiments show that the performance is overall good enough to repeatedly sort three books on a shelf. Nevertheless, further improvements are being discussed, potentially leading to a more robust book recognition and more versatile manipulation techniques.

LGNov 19, 2025
Towards Understanding Layer Contributions in Tabular In-Context Learning Models

Amir Rezaei Balef, Mykhailo Koshil, Katharina Eggensperger

Despite the architectural similarities between tabular in-context learning (ICL) models and large language models (LLMs), little is known about how individual layers contribute to tabular prediction. In this paper, we investigate how the latent spaces evolve across layers in tabular ICL models, identify potential redundant layers, and compare these dynamics with those observed in LLMs. We analyze TabPFN and TabICL through the "layers as painters" perspective, finding that only subsets of layers share a common representational language, suggesting structural redundancy and offering opportunities for model compression and improved interpretability.