LGJul 16, 2022
On the Importance of Hyperparameters and Data Augmentation for Self-Supervised LearningDiane Wagner, Fabio Ferreira, Danny Stoll et al.
Self-Supervised Learning (SSL) has become a very active area of Deep Learning research where it is heavily used as a pre-training method for classification and other tasks. However, the rapid pace of advancements in this area comes at a price: training pipelines vary significantly across papers, which presents a potentially crucial confounding factor. Here, we show that, indeed, the choice of hyperparameters and data augmentation strategies can have a dramatic impact on performance. To shed light on these neglected factors and help maximize the power of SSL, we hyperparameterize these components and optimize them with Bayesian optimization, showing improvements across multiple datasets for the SimSiam SSL approach. Realizing the importance of data augmentations for SSL, we also introduce a new automated data augmentation algorithm, GroupAugment, which considers groups of augmentations and optimizes the sampling across groups. In contrast to algorithms designed for supervised learning, GroupAugment achieved consistently high linear evaluation accuracy across all datasets we considered. Overall, our results indicate the importance and likely underestimated role of data augmentation for SSL.
LGDec 20, 2022
Deep Riemannian Networks for End-to-End EEG DecodingDaniel Wilson, Robin Tibor Schirrmeister, Lukas Alexander Wilhelm Gemein et al.
State-of-the-art performance in electroencephalography (EEG) decoding tasks is currently often achieved with either Deep-Learning (DL) or Riemannian-Geometry-based decoders (RBDs). Recently, there is growing interest in Deep Riemannian Networks (DRNs) possibly combining the advantages of both previous classes of methods. However, there are still a range of topics where additional insight is needed to pave the way for a more widespread application of DRNs in EEG. These include architecture design questions such as network size and end-to-end ability. How these factors affect model performance has not been explored. Additionally, it is not clear how the data within these networks is transformed, and whether this would correlate with traditional EEG decoding. Our study aims to lay the groundwork in the area of these topics through the analysis of DRNs for EEG with a wide range of hyperparameters. Networks were tested on five public EEG datasets and compared with state-of-the-art ConvNets. Here we propose EE(G)-SPDNet, and we show that this wide, end-to-end DRN can outperform the ConvNets, and in doing so use physiologically plausible frequency regions. We also show that the end-to-end approach learns more complex filters than traditional band-pass filters targeting the classical alpha, beta, and gamma frequency bands of the EEG, and that performance can benefit from channel specific filtering approaches. Additionally, architectural analysis revealed areas for further improvement due to the possible under utilisation of Riemannian specific information throughout the network. Our study thus shows how to design and train DRNs to infer task-related information from the raw EEG without the need of handcrafted filterbanks and highlights the potential of end-to-end DRNs such as EE(G)-SPDNet for high-performance EEG decoding.
LGFeb 17, 2024Code
TuneTables: Context Optimization for Scalable Prior-Data Fitted NetworksBenjamin Feuer, Robin Tibor Schirrmeister, Valeriia Cherepanova et al.
While tabular classification has traditionally relied on from-scratch training, a recent breakthrough called prior-data fitted networks (PFNs) challenges this approach. Similar to large language models, PFNs make use of pretraining and in-context learning to achieve strong performance on new tasks in a single forward pass. However, current PFNs have limitations that prohibit their widespread adoption. Notably, TabPFN achieves very strong performance on small tabular datasets but is not designed to make predictions for datasets of size larger than 1000. In this work, we overcome these limitations and substantially improve the performance of PFNs via context optimization. We introduce TuneTables, a parameter-efficient fine-tuning strategy for PFNs that compresses large datasets into a smaller learned context. We conduct extensive experiments on 19 algorithms over 98 datasets and find that TuneTables achieves the best performance on average, outperforming boosted trees such as CatBoost, while optimizing fewer than 5% of TabPFN's parameters. Furthermore, we show that TuneTables can be used as an interpretability tool and can even be used to mitigate biases by optimizing a fairness objective. We open-source our code and raw results at https://github.com/penfever/TuneTables.
CLMar 18, 2025Code
EEG-CLIP : Learning EEG representations from natural language descriptionsTidiane Camaret Ndir, Robin Tibor Schirrmeister, Tonio Ball
Deep networks for electroencephalogram (EEG) decoding are often only trained to solve one specific task, such as pathology or age decoding. A more general task-agnostic approach is to train deep networks to match a (clinical) EEG recording to its corresponding textual medical report and vice versa. This approach was pioneered in the computer vision domain matching images and their text captions and subsequently allowed to do successful zero-shot decoding using textual class prompts. In this work, we follow this approach and develop a contrastive learning framework, EEG-CLIP, that aligns the EEG time series and the descriptions of the corresponding clinical text in a shared embedding space. We investigated its potential for versatile EEG decoding, evaluating performance in a range of few-shot and zero-shot settings. Overall, we show that EEG-CLIP manages to non-trivially align text and EEG representations. Our work presents a promising approach to learn general EEG representations, which could enable easier analyses of diverse decoding questions through zero-shot decoding or training task-specific models from fewer training examples. The code for reproducing our results is available at https://github.com/tidiane-camaret/EEGClip
LGJun 18, 2020Code
Understanding Anomaly Detection with Deep Invertible Networks through Hierarchies of Distributions and FeaturesRobin Tibor Schirrmeister, Yuxuan Zhou, Tonio Ball et al.
Deep generative networks trained via maximum likelihood on a natural image dataset like CIFAR10 often assign high likelihoods to images from datasets with different objects (e.g., SVHN). We refine previous investigations of this failure at anomaly detection for invertible generative networks and provide a clear explanation of it as a combination of model bias and domain prior: Convolutional networks learn similar low-level feature distributions when trained on any natural image dataset and these low-level features dominate the likelihood. Hence, when the discriminative features between inliers and outliers are on a high-level, e.g., object shapes, anomaly detection becomes particularly challenging. To remove the negative impact of model bias and domain prior on detecting high-level differences, we propose two methods, first, using the log likelihood ratios of two identical models, one trained on the in-distribution data (e.g., CIFAR10) and the other one on a more general distribution of images (e.g., 80 Million Tiny Images). We also derive a novel outlier loss for the in-distribution network on samples from the more general distribution to further improve the performance. Secondly, using a multi-scale model like Glow, we show that low-level features are mainly captured at early scales. Therefore, using only the likelihood contribution of the final scale performs remarkably well for detecting high-level feature differences of the out-of-distribution and the in-distribution. This method is especially useful if one does not have access to a suitable general distribution. Overall, our methods achieve strong anomaly detection performance in the unsupervised setting, and only slightly underperform state-of-the-art classifier-based methods in the supervised setting. Code can be found at https://github.com/boschresearch/hierarchical_anomaly_detection.
IVFeb 11, 2020Code
Machine-Learning-Based Diagnostics of EEG PathologyLukas Alexander Wilhelm Gemein, Robin Tibor Schirrmeister, Patryk Chrabąszcz et al.
Machine learning (ML) methods have the potential to automate clinical EEG analysis. They can be categorized into feature-based (with handcrafted features), and end-to-end approaches (with learned features). Previous studies on EEG pathology decoding have typically analyzed a limited number of features, decoders, or both. For a I) more elaborate feature-based EEG analysis, and II) in-depth comparisons of both approaches, here we first develop a comprehensive feature-based framework, and then compare this framework to state-of-the-art end-to-end methods. To this aim, we apply the proposed feature-based framework and deep neural networks including an EEG-optimized temporal convolutional network (TCN) to the task of pathological versus non-pathological EEG classification. For a robust comparison, we chose the Temple University Hospital (TUH) Abnormal EEG Corpus (v2.0.0), which contains approximately 3000 EEG recordings. The results demonstrate that the proposed feature-based decoding framework can achieve accuracies on the same level as state-of-the-art deep neural networks. We find accuracies across both approaches in an astonishingly narrow range from 81--86\%. Moreover, visualizations and analyses indicated that both approaches used similar aspects of the data, e.g., delta and theta band power at temporal electrode locations. We argue that the accuracies of current binary EEG pathology decoders could saturate near 90\% due to the imperfect inter-rater agreement of the clinical labels, and that such decoders are already clinically useful, such as in areas where clinical EEG experts are rare. We make the proposed feature-based framework available open source and thus offer a new tool for EEG machine learning research.
LGJun 5, 2018Code
Training Generative Reversible NetworksRobin Tibor Schirrmeister, Patryk Chrabąszcz, Frank Hutter et al.
Generative models with an encoding component such as autoencoders currently receive great interest. However, training of autoencoders is typically complicated by the need to train a separate encoder and decoder model that have to be enforced to be reciprocal to each other. To overcome this problem, by-design reversible neural networks (RevNets) had been previously used as generative models either directly optimizing the likelihood of the data under the model or using an adversarial approach on the generated data. Here, we instead investigate their performance using an adversary on the latent space in the adversarial autoencoder framework. We investigate the generative performance of RevNets on the CelebA dataset, showing that generative RevNets can generate coherent faces with similar quality as Variational Autoencoders. This first attempt to use RevNets inside the adversarial autoencoder framework slightly underperformed relative to recent advanced generative models using an autoencoder component on CelebA, but this gap may diminish with further optimization of the training setup of generative RevNets. In addition to the experiments on CelebA, we show a proof-of-principle experiment on the MNIST dataset suggesting that adversary-free trained RevNets can discover meaningful latent dimensions without pre-specifying the number of dimensions of the latent sampling distribution. In summary, this study shows that RevNets can be employed in different generative training settings. Source code for this study is at https://github.com/robintibor/generative-reversible
LGMar 15, 2017Code
Deep learning with convolutional neural networks for EEG decoding and visualizationRobin Tibor Schirrmeister, Jost Tobias Springenberg, Lukas Dominique Josef Fiederer et al.
PLEASE READ AND CITE THE REVISED VERSION at Human Brain Mapping: http://onlinelibrary.wiley.com/doi/10.1002/hbm.23730/full Code available here: https://github.com/robintibor/braindecode
CVOct 3, 2025
Dynamic Prompt Generation for Interactive 3D Medical Image Segmentation TrainingTidiane Camaret Ndir, Alexander Pfefferle, Robin Tibor Schirrmeister
Interactive 3D biomedical image segmentation requires efficient models that can iteratively refine predictions based on user prompts. Current foundation models either lack volumetric awareness or suffer from limited interactive capabilities. We propose a training strategy that combines dynamic volumetric prompt generation with content-aware adaptive cropping to optimize the use of the image encoder. Our method simulates realistic user interaction patterns during training while addressing the computational challenges of learning from sequential refinement feedback on a single GPU. For efficient training, we initialize our network using the publicly available weights from the nnInteractive segmentation model. Evaluation on the \textbf{Foundation Models for Interactive 3D Biomedical Image Segmentation} competition demonstrates strong performance with an average final Dice score of 0.6385, normalized surface distance of 0.6614, and area-under-the-curve metrics of 2.4799 (Dice) and 2.5671 (NSD).
CLJan 9, 2025
Unlocking In-Context Learning for Natural Datasets Beyond Language ModellingJelena Bratulić, Sudhanshu Mittal, David T. Hoffmann et al.
Large Language Models (LLMs) exhibit In-Context Learning (ICL), which enables the model to perform new tasks conditioning only on the examples provided in the context without updating the model's weights. While ICL offers fast adaptation across natural language tasks and domains, its emergence is less straightforward for modalities beyond text. In this work, we systematically uncover properties present in LLMs that support the emergence of ICL for autoregressive models and various modalities by promoting the learning of the needed mechanisms for ICL. We identify exact token repetitions in the training data sequences as an important factor for ICL. Such repetitions further improve stability and reduce transiency in ICL performance. Moreover, we emphasise the significance of training task difficulty for the emergence of ICL. Finally, by applying our novel insights on ICL emergence, we unlock ICL capabilities for various visual datasets and a more challenging EEG classification task.
LGJan 14, 2022
When less is more: Simplifying inputs aids neural network understandingRobin Tibor Schirrmeister, Rosanne Liu, Sara Hooker et al.
How do neural network image classifiers respond to simpler and simpler inputs? And what do such responses reveal about the learning process? To answer these questions, we need a clear measure of input simplicity (or inversely, complexity), an optimization objective that correlates with simplification, and a framework to incorporate such objective into training and inference. Lastly we need a variety of testbeds to experiment and evaluate the impact of such simplification on learning. In this work, we measure simplicity with the encoding bit size given by a pretrained generative model, and minimize the bit size to simplify inputs in training and inference. We investigate the effect of such simplification in several scenarios: conventional training, dataset condensation and post-hoc explanations. In all settings, inputs are simplified along with the original classification task, and we investigate the trade-off between input simplicity and task performance. For images with injected distractors, such simplification naturally removes superfluous information. For dataset condensation, we find that inputs can be simplified with almost no accuracy degradation. When used in post-hoc explanation, our learning-based simplification approach offers a valuable new tool to explore the basis of network decisions.
LGJul 17, 2019
Deep Invertible Networks for EEG-based brain-signal decodingRobin Tibor Schirrmeister, Tonio Ball
In this manuscript, we investigate deep invertible networks for EEG-based brain signal decoding and find them to generate realistic EEG signals as well as classify novel signals above chance. Further ideas for their regularization towards better decoding accuracies are discussed.
HCJul 4, 2018
The role of robot design in decoding error-related information from EEG signals of a human observerJoos Behncke, Robin Tibor Schirrmeister, Wolfram Burgard et al.
For utilization of robotic assistive devices in everyday life, means for detection and processing of erroneous robot actions are a focal aspect in the development of collaborative systems, especially when controlled via brain signals. Though, the variety of possible scenarios and the diversity of used robotic systems pose a challenge for error decoding from recordings of brain signals such as via EEG. For example, it is unclear whether humanoid appearances of robotic assistants have an influence on the performance. In this paper, we designed a study in which two different robots executed the same task both in an erroneous and a correct manner. We find error-related EEG signals of human observers indicating that the performance of the error decoding was independent of robot design. However, we can show that it was possible to identify which robot performed the instructed task by means of the EEG signals. In this case, deep convolutional neural networks (deep ConvNets) could reach significantly higher accuracies than both regularized Linear Discriminanat Analysis (rLDA) and filter bank common spatial patterns (FB-CSP) combined with rLDA. Our findings indicate that decoding information about robot action success from the EEG, particularly when using deep neural networks, may be an applicable approach for a broad range of robot designs.
NCJun 20, 2018
Cross-paradigm pretraining of convolutional networks improves intracranial EEG decodingJoos Behncke, Robin Tibor Schirrmeister, Martin Völker et al.
When it comes to the classification of brain signals in real-life applications, the training and the prediction data are often described by different distributions. Furthermore, diverse data sets, e.g., recorded from various subjects or tasks, can even exhibit distinct feature spaces. The fact that data that have to be classified are often only available in small amounts reinforces the need for techniques to generalize learned information, as performances of brain-computer interfaces (BCIs) are enhanced by increasing quantity of available data. In this paper, we apply transfer learning to a framework based on deep convolutional neural networks (deep ConvNets) to prove the transferability of learned patterns in error-related brain signals across different tasks. The experiments described in this paper demonstrate the usefulness of transfer learning, especially improving performances when only little data can be used to distinguish between erroneous and correct realization of a task. This effect could be delimited from a transfer of merely general brain signal characteristics, underlining the transfer of error-specific information. Furthermore, we could extract similar patterns in time-frequency analyses in identical channels, leading to selective high signal correlations between the two different paradigms. Classification on the intracranial data yields in median accuracies up to $(81.50 \pm 9.49)\,\%$. Decoding on only $10\%$ of the data without pre-training reaches performances of $(54.76 \pm 3.56)\,\%$, compared to $(64.95 \pm 0.79)\,\%$ with pre-training.
SPJun 5, 2018
EEG-GAN: Generative adversarial networks for electroencephalograhic (EEG) brain signalsKay Gregor Hartmann, Robin Tibor Schirrmeister, Tonio Ball
Generative adversarial networks (GANs) are recently highly successful in generative applications involving images and start being applied to time series data. Here we describe EEG-GAN as a framework to generate electroencephalographic (EEG) brain signals. We introduce a modification to the improved training of Wasserstein GANs to stabilize training and investigate a range of architectural choices critical for time series generation (most notably up- and down-sampling). For evaluation we consider and compare different metrics such as Inception score, Frechet inception distance and sliced Wasserstein distance, together showing that our EEG-GAN framework generated naturalistic EEG examples. It thus opens up a range of new generative application scenarios in the neuroscientific and neurological context, such as data augmentation in brain-computer interfacing tasks, EEG super-sampling, or restoration of corrupted data segments. The possibility to generate signals of a certain class and/or with specific properties may also open a new avenue for research into the underlying structure of brain signals.
LGNov 21, 2017
Hierarchical internal representation of spectral features in deep convolutional networks trained for EEG decodingKay Gregor Hartmann, Robin Tibor Schirrmeister, Tonio Ball
Recently, there is increasing interest and research on the interpretability of machine learning models, for example how they transform and internally represent EEG signals in Brain-Computer Interface (BCI) applications. This can help to understand the limits of the model and how it may be improved, in addition to possibly provide insight about the data itself. Schirrmeister et al. (2017) have recently reported promising results for EEG decoding with deep convolutional neural networks (ConvNets) trained in an end-to-end manner and, with a causal visualization approach, showed that they learn to use spectral amplitude changes in the input. In this study, we investigate how ConvNets represent spectral features through the sequence of intermediate stages of the network. We show higher sensitivity to EEG phase features at earlier stages and higher sensitivity to EEG amplitude features at later stages. Intriguingly, we observed a specialization of individual stages of the network to the classical EEG frequency bands alpha, beta, and high gamma. Furthermore, we find first evidence that particularly in the last convolutional layer, the network learns to detect more complex oscillatory patterns beyond spectral phase and amplitude, reminiscent of the representation of complex visual features in later layers of ConvNets in computer vision tasks. Our findings thus provide insights into how ConvNets hierarchically represent spectral EEG features in their intermediate layers and suggest that ConvNets can exploit and might help to better understand the compositional structure of EEG time series.
HCNov 16, 2017
The signature of robot action success in EEG signals of a human observer: Decoding and visualization using deep convolutional neural networksJoos Behncke, Robin Tibor Schirrmeister, Wolfram Burgard et al.
The importance of robotic assistive devices grows in our work and everyday life. Cooperative scenarios involving both robots and humans require safe human-robot interaction. One important aspect here is the management of robot errors, including fast and accurate online robot-error detection and correction. Analysis of brain signals from a human interacting with a robot may help identifying robot errors, but accuracies of such analyses have still substantial space for improvement. In this paper we evaluate whether a novel framework based on deep convolutional neural networks (deep ConvNets) could improve the accuracy of decoding robot errors from the EEG of a human observer, both during an object grasping and a pouring task. We show that deep ConvNets reached significantly higher accuracies than both regularized Linear Discriminant Analysis (rLDA) and filter bank common spatial patterns (FB-CSP) combined with rLDA, both widely used EEG classifiers. Deep ConvNets reached mean accuracies of 75% +/- 9 %, rLDA 65% +/- 10% and FB-CSP + rLDA 63% +/- 6% for decoding of erroneous vs. correct trials. Visualization of the time-domain EEG features learned by the ConvNets to decode errors revealed spatiotemporal patterns that reflected differences between the two experimental paradigms. Across subjects, ConvNet decoding accuracies were significantly correlated with those obtained with rLDA, but not CSP, indicating that in the present context ConvNets behaved more 'rLDA-like' (but consistently better), while in a previous decoding study with another task but the same ConvNet architecture, it was found to behave more 'CSP-like'. Our findings thus provide further support for the assumption that deep ConvNets are a versatile addition to the existing toolbox of EEG decoding techniques, and we discuss steps how ConvNet EEG decoding performance could be further optimized.
LGAug 26, 2017
Deep learning with convolutional neural networks for decoding and visualization of EEG pathologyRobin Tibor Schirrmeister, Lukas Gemein, Katharina Eggensperger et al.
We apply convolutional neural networks (ConvNets) to the task of distinguishing pathological from normal EEG recordings in the Temple University Hospital EEG Abnormal Corpus. We use two basic, shallow and deep ConvNet architectures recently shown to decode task-related information from EEG at least as well as established algorithms designed for this purpose. In decoding EEG pathology, both ConvNets reached substantially better accuracies (about 6% better, ~85% vs. ~79%) than the only published result for this dataset, and were still better when using only 1 minute of each recording for training and only six seconds of each recording for testing. We used automated methods to optimize architectural hyperparameters and found intriguingly different ConvNet architectures, e.g., with max pooling as the only nonlinearity. Visualizations of the ConvNet decoding behavior showed that they used spectral power changes in the delta (0-4 Hz) and theta (4-8 Hz) frequency range, possibly alongside other features, consistent with expectations derived from spectral analysis of the EEG data and from the textual medical reports. Analysis of the textual medical reports also highlighted the potential for accuracy increases by integrating contextual information, such as the age of subjects. In summary, the ConvNets and visualization techniques used in this study constitute a next step towards clinically useful automated EEG diagnosis and establish a new baseline for future work on this topic.
HCAug 4, 2017
Brain Responses During Robot-Error ObservationDominik Welke, Joos Behncke, Marina Hader et al.
Brain-controlled robots are a promising new type of assistive device for severely impaired persons. Little is however known about how to optimize the interaction of humans and brain-controlled robots. Information about the human's perceived correctness of robot performance might provide a useful teaching signal for adaptive control algorithms and thus help enhancing robot control. Here, we studied whether watching robots perform erroneous vs. correct action elicits differential brain responses that can be decoded from single trials of electroencephalographic (EEG) recordings, and whether brain activity during human-robot interaction is modulated by the robot's visual similarity to a human. To address these topics, we designed two experiments. In experiment I, participants watched a robot arm pour liquid into a cup. The robot performed the action either erroneously or correctly, i.e. it either spilled some liquid or not. In experiment II, participants observed two different types of robots, humanoid and non-humanoid, grabbing a ball. The robots either managed to grab the ball or not. We recorded high-resolution EEG during the observation tasks in both experiments to train a Filter Bank Common Spatial Pattern (FBCSP) pipeline on the multivariate EEG signal and decode for the correctness of the observed action, and for the type of the observed robot. Our findings show that it was possible to decode both correctness and robot type for the majority of participants significantly, although often just slightly, above chance level. Our findings suggest that non-invasive recordings of brain responses elicited when observing robots indeed contain decodable information about the correctness of the robot's action and the type of observed robot.
AIJul 20, 2017
Acting Thoughts: Towards a Mobile Robotic Service Assistant for Users with Limited Communication SkillsFelix Burget, Lukas Dominique Josef Fiederer, Daniel Kuhner et al.
As autonomous service robots become more affordable and thus available also for the general public, there is a growing need for user friendly interfaces to control the robotic system. Currently available control modalities typically expect users to be able to express their desire through either touch, speech or gesture commands. While this requirement is fulfilled for the majority of users, paralyzed users may not be able to use such systems. In this paper, we present a novel framework, that allows these users to interact with a robotic service assistant in a closed-loop fashion, using only thoughts. The brain-computer interface (BCI) system is composed of several interacting components, i.e., non-invasive neuronal signal recording and decoding, high-level task planning, motion and manipulation planning as well as environment perception. In various experiments, we demonstrate its applicability and robustness in real world scenarios, considering fetch-and-carry tasks and tasks involving human-robot interaction. As our results demonstrate, our system is capable of adapting to frequent changes in the environment and reliably completing given tasks within a reasonable amount of time. Combined with high-level planning and autonomous robotic systems, interesting new perspectives open up for non-invasive BCI-based human-robot interactions.