Thomas Probst

h-index: 40

Papers: 10

Citations: 121

Novelty: 36%

AI Score: 21

Ranked #181,374 of 194,257 authors (top 93%); #56,921 in CV (top 96%)

10 Papers

3.7 | CV | Jun 3, 2022

Gradient Obfuscation Checklist Test Gives a False Sense of Security

Nikola Popovic, Danda Pani Paudel, Thomas Probst et al.

One popular group of defense techniques against adversarial attacks is based on injecting stochastic noise into the network. The main source of robustness of such stochastic defenses however is often due to the obfuscation of the gradients, offering a false sense of security. Since most of the popular adversarial attacks are optimization-based, obfuscated gradients reduce their attacking ability, while the model is still susceptible to stronger or specifically tailored adversarial attacks. Recently, five characteristics have been identified, which are commonly observed when the improvement in robustness is mainly caused by gradient obfuscation. It has since become a trend to use these five characteristics as a sufficient test, to determine whether or not gradient obfuscation is the main source of robustness. However, these characteristics do not perfectly characterize all existing cases of gradient obfuscation, and therefore can not serve as a basis for a conclusive test. In this work, we present a counterexample, showing this test is not sufficient for concluding that gradient obfuscation is not the main cause of improvements in robustness.

2.6 | CV | Mar 25, 2022

Spatially Multi-conditional Image Generation

Ritika Chakraborty, Nikola Popovic, Danda Pani Paudel et al.

In most scenarios, conditional image generation can be thought of as an inversion of the image understanding process. Since generic image understanding involves solving multiple tasks, it is natural to aim at generating images via multi-conditioning. However, multi-conditional image generation is a very challenging problem due to the heterogeneity and the sparsity of the (in practice) available conditioning labels. In this work, we propose a novel neural architecture to address the problem of heterogeneity and sparsity of the spatially multi-conditional labels. Our choice of spatial conditioning, such as by semantics and depth, is driven by the promise it holds for better control of the image generation process. The proposed method uses a transformer-like architecture operating pixel-wise, which receives the available labels as input tokens to merge them in a learned homogeneous space of labels. The merged labels are then used for image generation via conditional generative adversarial training. In this process, the sparsity of the labels is handled by simply dropping the input tokens corresponding to the missing labels at the desired locations, thanks to the proposed pixel-wise operating architecture. Our experiments on three benchmark datasets demonstrate the clear superiority of our method over the state-of-the-art and compared baselines. The source code will be made publicly available.

1.4 | CV | Dec 30, 2021

Improving the Behaviour of Vision Transformers with Token-consistent Stochastic Layers

Nikola Popovic, Danda Pani Paudel, Thomas Probst et al.

We introduce token-consistent stochastic layers in vision transformers, without causing any severe drop in performance. The added stochasticity improves network calibration, robustness and strengthens privacy. We use linear layers with token-consistent stochastic parameters inside the multilayer perceptron blocks, without altering the architecture of the transformer. The stochastic parameters are sampled from the uniform distribution, both during training and inference. The applied linear operations preserve the topological structure, formed by the set of tokens passing through the shared multilayer perceptron. This operation encourages the learning of the recognition task to rely on the topological structures of the tokens, instead of their values, which in turn offers the desired robustness and privacy of the visual features. The effectiveness of the token-consistent stochasticity is demonstrated on three different applications, namely, network calibration, adversarial robustness, and feature privacy, by boosting the performance of the respective established baselines.
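The idea of token consistency can be illustrated with a minimal sketch. It assumes the simplest possible form of a stochastic linear operation: one random scale per feature dimension, drawn from a uniform distribution and shared by every token in the same forward pass. The function name `token_consistent_scale`, the diagonal form of the operation, and the range `[0.5, 1.5]` are all hypothetical choices for illustration; the abstract does not specify the parameterization used in the paper.

```python
import random


def token_consistent_scale(tokens, low=0.5, high=1.5, rng=None):
    """Apply one randomly sampled per-feature scale to every token.

    Hypothetical illustration: the scale is sampled once per call from
    U(low, high) and reused for all tokens, so the relationships between
    tokens are kept while their absolute values change.
    """
    rng = rng or random.Random()
    dim = len(tokens[0])
    # A single draw per feature dimension, shared across all tokens.
    scale = [rng.uniform(low, high) for _ in range(dim)]
    return [[s * v for s, v in zip(scale, token)] for token in tokens]


# Three toy tokens with two features each; the first two are identical.
tokens = [[1.0, 2.0], [1.0, 2.0], [3.0, -1.0]]
out = token_consistent_scale(tokens, rng=random.Random(0))
```

Because the same scale is applied to every token, identical input tokens stay identical and per-feature ratios between tokens are unchanged, even though the values differ from call to call. Sampling a fresh scale per token instead would break that consistency.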

3.7 | HC | Nov 4, 2021

Defining Gaze Patterns for Process Model Literacy -- Exploring Visual Routines in Process Models with Diverse Mappings

Michael Winter, Heiko Neumann, Rüdiger Pryss et al.

Process models depict crucial artifacts for organizations regarding documentation, communication, and collaboration. The proper comprehension of such models is essential for an effective application. An important aspect of process model literacy is the question of how the information presented in process models is extracted and processed by the human visual system. For such visuospatial tasks, the visual system deploys a set of elemental operations, from whose compositions different visual routines are produced. This paper provides insights from an exploratory eye tracking study, in which visual routines during process model comprehension were contemplated. More specifically, n = 29 participants were asked to comprehend n = 18 process models expressed in the Business Process Model and Notation 2.0 reflecting diverse mappings (i.e., straight, upward, downward) and complexity levels. The performance measures indicated that even less complex process models pose a challenge regarding their comprehension. The upward mapping confronted participants' attention with more challenges, whereas the downward mapping was comprehended more effectively. Based on recorded eye movements, three gaze patterns applied during model comprehension were derived. Thereupon, we defined a general model which identifies visual routines and corresponding elemental operations during process model comprehension. Finally, implications for practice as well as research and directions for future work are discussed in this paper.

3.7 | HC | Jul 2, 2021

Are Non-Experts Able to Comprehend Business Process Models -- Study Insights Involving Novices and Experts

Michael Winter, Rüdiger Pryss, Thomas Probst et al.

The comprehension of business process models is crucial for enterprises. Prior research has shown that children as well as adolescents perceive and interpret graphical representations in a different manner compared to grown-ups. To evaluate this, observations in the context of business process models are presented in this paper obtained from a study on visual literacy in cultural education. We demonstrate that adolescents without expertise in process model comprehension are able to correctly interpret business process models expressed in terms of BPMN 2.0. In a comprehensive study, n = 205 learners (i.e., pupils at the age of 15) needed to answer questions related to process models they were confronted with, reflecting different levels of complexity. In addition, process models were created with varying styles of element labels. Study results indicate that an abstract description (i.e., using only alphabetic letters) of process models is understood more easily compared to concrete or pseudo descriptions. As a benchmark, results are compared with the ones of modeling experts (n = 40). Amongst others, study findings suggest using abstract descriptions in order to introduce novices to process modeling notations. With the obtained insights, we highlight that process models can be properly comprehended by novices.

6.4 | SE | Feb 20, 2021

Open-Ended Automatic Programming Through Combinatorial Evolution

Sebastian Fix, Thomas Probst, Oliver Ruggli et al.

Combinatorial evolution - the creation of new things through the combination of existing things - can be a powerful way to evolve rather than design technical objects such as electronic circuits. Intriguingly, this seems to be an ongoing and thus open-ended process creating novelty with increasing complexity. Here, we employ combinatorial evolution in software development. While current approaches such as genetic programming are efficient in solving particular problems, they all converge towards a solution and do not create anything new anymore afterwards. Combinatorial evolution of complex systems such as languages and technology are considered open-ended. Therefore, open-ended automatic programming might be possible through combinatorial evolution. We implemented a computer program simulating combinatorial evolution of code blocks stored in a database to make them available for combining. Automatic programming in the sense of algorithm-based code generation is achieved by evaluating regular expressions. We found that reserved keywords of a programming language are suitable for defining the basic code blocks at the beginning of the simulation. We also found that placeholders can be used to combine code blocks and that code complexity can be described in terms of the importance to the programming language. As in a previous combinatorial evolution simulation of electronic circuits, complexity increased from simple keywords and special characters to more complex variable declarations, class definitions, methods, and classes containing methods and variable declarations. Combinatorial evolution, therefore, seems to be a promising approach for open-ended automatic programming.

3.3 | CV | Dec 31, 2020

Unsupervised Monocular Depth Reconstruction of Non-Rigid Scenes

Ayça Takmaz, Danda Pani Paudel, Thomas Probst et al.

Monocular depth reconstruction of complex and dynamic scenes is a highly challenging problem. While for rigid scenes learning-based methods have been offering promising results even in unsupervised cases, there exists little to no literature addressing the same for dynamic and deformable scenes. In this work, we present an unsupervised monocular framework for dense depth estimation of dynamic scenes, which jointly reconstructs rigid and non-rigid parts without explicitly modelling the camera motion. Using dense correspondences, we derive a training objective that aims to opportunistically preserve pairwise distances between reconstructed 3D points. In this process, the dense depth map is learned implicitly using the as-rigid-as-possible hypothesis. Our method provides promising results, demonstrating its capability of reconstructing 3D from challenging videos of non-rigid scenes. Furthermore, the proposed method also provides unsupervised motion segmentation results as an auxiliary output.
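The notion of preserving pairwise distances between reconstructed 3D points can be made concrete with a small sketch. It assumes two lists of corresponding 3D points from two frames and penalizes every pair uniformly. The function name `pairwise_distance_loss` and the toy data are hypothetical, and the "opportunistic" selection of pairs mentioned in the abstract is not modelled here, so this is not the authors' training objective.

```python
import math
from itertools import combinations


def pairwise_distance_loss(points_a, points_b):
    """Mean absolute change in pairwise distance between two point sets.

    points_a[i] and points_b[i] are assumed to be the same physical point
    reconstructed in two different frames.
    """
    if len(points_a) != len(points_b):
        raise ValueError("point sets must have the same length")
    pairs = list(combinations(range(len(points_a)), 2))
    if not pairs:
        return 0.0
    total = 0.0
    for i, j in pairs:
        d_a = math.dist(points_a[i], points_a[j])
        d_b = math.dist(points_b[i], points_b[j])
        total += abs(d_a - d_b)
    return total / len(pairs)


# Toy example: a translated copy (rigid) versus a stretched copy (non-rigid).
frame_a = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0)]
rigid_b = [(x + 0.5, y - 0.2, z + 0.1) for x, y, z in frame_a]
stretched_b = [(2.0 * x, y, z) for x, y, z in frame_a]

rigid_loss = pairwise_distance_loss(frame_a, rigid_b)
stretched_loss = pairwise_distance_loss(frame_a, stretched_b)
```

A rigid motion leaves all pairwise distances unchanged, so `rigid_loss` is zero up to floating-point error, while the stretched copy yields a clearly positive value. This is the sense in which such an objective favors as-rigid-as-possible reconstructions without estimating camera motion explicitly.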

6.6 | CY | Jul 4, 2018

Context Data Categories and Privacy Model for Mobile Data Collection Apps

Felix Beierle, Vinh Thuy Tran, Mathias Allemand et al.

Context-aware applications stemming from diverse fields like mobile health, recommender systems, and mobile commerce potentially benefit from knowing aspects of the user's personality. As filling out personality questionnaires is tedious, we propose the prediction of the user's personality from smartphone sensor and usage data. In order to collect data for researching the relationship between smartphone data and personality, we developed the Android app TYDR (Track Your Daily Routine) which tracks smartphone data and utilizes psychometric personality questionnaires. With TYDR, we track a larger variety of smartphone data than similar existing apps, including metadata on notifications, photos taken, and music played back by the user. For the development of TYDR, we introduce a general context data model consisting of four categories that focus on the user's different types of interactions with the smartphone: physical conditions and activity, device status and usage, core functions usage, and app usage. On top of this, we develop the privacy model PM-MoDaC specifically for apps related to the collection of mobile data, consisting of nine proposed privacy measures. We present the implementation of all of those measures in TYDR. Although the utilization of the user's personality based on the usage of his or her smartphone is a challenging endeavor, it seems to be a promising approach for various types of context-aware mobile applications.

5.1 | CY | Mar 18, 2018

TYDR - Track Your Daily Routine. Android App for Tracking Smartphone Sensor and Usage Data

Felix Beierle, Vinh Thuy Tran, Mathias Allemand et al.

We present the Android app TYDR (Track Your Daily Routine) which tracks smartphone sensor and usage data and utilizes standardized psychometric personality questionnaires. With the app, we aim at collecting data for researching correlations between the tracked smartphone data and the user's personality in order to predict personality from smartphone data. In this paper, we highlight our approaches in addressing the challenges in developing such an app. We optimize the tracking of sensor data by assessing the trade-off of size of data and battery consumption and granularity of the stored information. Our user interface is designed to incentivize users to install the app and fill out questionnaires. TYDR processes and visualizes the tracked sensor and usage data as well as the results of the personality questionnaires. When developing an app that will be used in psychological studies, requirements posed by ethics commissions / institutional review boards and data protection officials have to be met. We detail our approaches concerning those requirements regarding the anonymized storing of user data, informing the users about the data collection, and enabling an opt-out option. We present our process for anonymized data storing while still being able to identify individual users who successfully completed a psychological study with the app.

9.7 | CV | Sep 17, 2017

Automatic Tool Landmark Detection for Stereo Vision in Robot-Assisted Retinal Surgery

Thomas Probst, Kevis-Kokitsi Maninis, Ajad Chhatkuli et al.

Computer vision and robotics are being increasingly applied in medical interventions. Especially in interventions where extreme precision is required they could make a difference. One such application is robot-assisted retinal microsurgery. In recent works, such interventions are conducted under a stereo-microscope, and with a robot-controlled surgical tool. The complementarity of computer vision and robotics has however not yet been fully exploited. In order to improve the robot control we are interested in 3D reconstruction of the anatomy and in automatic tool localization using a stereo microscope. In this paper, we solve this problem for the first time using a single pipeline, starting from uncalibrated cameras to reach metric 3D reconstruction and registration, in retinal microsurgery. The key ingredients of our method are: (a) surgical tool landmark detection, and (b) 3D reconstruction with the stereo microscope, using the detected landmarks. To address the former, we propose a novel deep learning method that detects and recognizes keypoints in high definition images at higher than real-time speed. We use the detected 2D keypoints along with their corresponding 3D coordinates obtained from the robot sensors to calibrate the stereo microscope using an affine projection model. We design an online 3D reconstruction pipeline that makes use of smoothness constraints and performs robot-to-camera registration. The entire pipeline is extensively validated on open-sky porcine eye sequences. Quantitative and qualitative results are presented for all steps.