Thomas Groß

h-index: 31

Papers: 15

Citations: 63

Novelty: 36%

AI Score: 27

Ranked #157,360 of 194,257 authors (top 81%); #4,320 in CR (top 64%)

15 Papers

3.0 | SE | Mar 30, 2020 | Code

Is it feasible to detect FLOSS version release events from textual messages? A case study on Stack Overflow

A. Sokolovsky, T. Gross, J. Bacardit

Topic Detection and Tracking (TDT) is a very active research question within the area of text mining, generally applied to news feeds and Twitter datasets, where topics and events are detected. The notion of "event" is broad, but typically it applies to occurrences that can be detected from a single post or a message. Little attention has been drawn to what we call "micro-events", which, due to their nature, cannot be detected from a single piece of textual information. The study investigates the feasibility of micro-event detection on textual data using a sample of messages from the Stack Overflow Q&A platform and Free/Libre Open Source Software (FLOSS) version releases from the Libraries.io dataset. We build pipelines for detection of micro-events using three different estimators whose parameters are optimized using a grid search approach. We consider two feature spaces: LDA topic modeling with sentiment analysis, and hSBM topics with sentiment analysis. The feature spaces are optimized using the recursive feature elimination with cross validation (RFECV) strategy. In our experiments we investigate whether there is a characteristic change in the topic distribution or sentiment features before or after micro-events take place and we thoroughly evaluate the capacity of each variant of our analysis pipeline to detect micro-events. Additionally, we perform a detailed statistical analysis of the models, including influential cases, variance inflation factors, validation of the linearity assumption, pseudo R squared measures and no-information rate. Finally, in order to study the limits of micro-event detection, we design a method for generating micro-event synthetic datasets with similar properties to the real-world data, and use them to identify the micro-event detectability threshold for each of the evaluated classifiers.
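
Two of the model diagnostics named in this abstract, the no-information rate and a pseudo R squared measure, have compact standard definitions. The sketch below is illustrative only: it assumes binary 0/1 labels and uses McFadden's variant of pseudo R squared (the abstract does not say which variants were used). All function names are hypothetical and are not taken from the paper's code.

```python
import math
from collections import Counter


def no_information_rate(labels):
    """Accuracy of always predicting the most frequent class."""
    counts = Counter(labels)
    return max(counts.values()) / len(labels)


def log_likelihood(labels, probs):
    """Bernoulli log-likelihood of 0/1 labels given P(label == 1).

    Assumes every probability lies strictly between 0 and 1.
    """
    return sum(
        math.log(p) if y == 1 else math.log(1.0 - p)
        for y, p in zip(labels, probs)
    )


def mcfadden_pseudo_r2(labels, probs):
    """McFadden's pseudo R squared: 1 - LL(model) / LL(intercept-only model).

    The intercept-only model predicts the base rate for every case.
    """
    base_rate = sum(labels) / len(labels)
    ll_null = log_likelihood(labels, [base_rate] * len(labels))
    ll_model = log_likelihood(labels, probs)
    return 1.0 - ll_model / ll_null
```

A classifier is only informative if its accuracy exceeds the no-information rate, which is why that rate is a natural baseline on imbalanced event data.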

3.3 | ET | Feb 5, 2025

Implementing Large Quantum Boltzmann Machines as Generative AI Models for Dataset Balancing

Salvatore Sinno, Markus Bertl, Arati Sahoo et al.

This study explores the implementation of large Quantum Restricted Boltzmann Machines (QRBMs), a key advancement in Quantum Machine Learning (QML), as generative models on D-Wave's Pegasus quantum hardware to address dataset imbalance in Intrusion Detection Systems (IDS). By leveraging Pegasus's enhanced connectivity and computational capabilities, a QRBM with 120 visible and 120 hidden units was successfully embedded, surpassing the limitations of default embedding tools. The QRBM synthesized over 1.6 million attack samples, achieving a balanced dataset of over 4.2 million records. Comparative evaluations with traditional balancing methods, such as SMOTE and RandomOversampler, revealed that QRBMs produced higher-quality synthetic samples, significantly improving detection rates, precision, recall, and F1 score across diverse classifiers. The study underscores the scalability and efficiency of QRBMs, completing balancing tasks in milliseconds. These findings highlight the transformative potential of QML and QRBMs as next-generation tools in data preprocessing, offering robust solutions for complex computational challenges in modern information systems.

3.8 | CR | Sep 22, 2021

Why Most Results of Socio-Technical Security User Studies Are False

Thomas Gross

Background. In recent years, cyber security user studies have been scrutinized for their reporting completeness, statistical reporting fidelity, statistical reliability and biases. It remains an open question what strength of evidence positive reports of such studies actually yield. We focus on the extent to which positive reports indicate a relation that is true in reality, that is, a probabilistic assessment. Aim. This study aims at establishing the overall strength of evidence in cyber security user studies, with the dimensions (i) Positive Predictive Value (PPV) and its complement False Positive Risk (FPR), (ii) Likelihood Ratio (LR), and (iii) Reverse-Bayesian Prior (RBP) for a fixed tolerated False Positive Risk. Method. Based on $431$ coded statistical inferences in $146$ cyber security user studies from a published SLR covering the years 2006-2016, we first compute a simulation of the a posteriori false positive risk based on assumed prior and bias thresholds. Second, we establish the observed likelihood ratios for positive reports. Third, we compute the reverse Bayesian argument on the observed positive reports by computing the prior required for a fixed a posteriori false positive rate. Results. We obtain a comprehensive analysis of the strength of evidence including an account of appropriate multiple comparison corrections. The simulations show that even in the face of well-controlled conditions and high prior likelihoods, only a few studies achieve good a posteriori probabilities. Conclusions. Our work shows that the strength of evidence of the field is weak and that most positive reports are likely false. From this, we learn what to watch out for in studies to advance the knowledge of the field.
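
The three quantities named in this abstract are linked by Bayes' rule, which a short sketch can make concrete. The code below is a textbook formulation without a bias term, not the paper's simulation: it assumes a single significance test with a given prior probability that the effect is real, and it takes the likelihood ratio of a positive report as an input. Function and parameter names are hypothetical.

```python
def positive_predictive_value(prior, power, alpha):
    """P(effect is real | significant result), via Bayes' rule.

    prior: assumed probability that the tested relation is true
    power: P(significant | relation true)
    alpha: P(significant | relation false)
    """
    true_positives = power * prior
    false_positives = alpha * (1.0 - prior)
    return true_positives / (true_positives + false_positives)


def false_positive_risk(prior, power, alpha):
    """Complement of the PPV: P(relation false | significant result)."""
    return 1.0 - positive_predictive_value(prior, power, alpha)


def reverse_bayesian_prior(likelihood_ratio, tolerated_fpr):
    """Prior probability needed so that a positive report with the given
    likelihood ratio yields at most the tolerated false positive risk.

    Uses posterior odds = prior odds * likelihood ratio.
    """
    required_posterior_odds = (1.0 - tolerated_fpr) / tolerated_fpr
    prior_odds = required_posterior_odds / likelihood_ratio
    return prior_odds / (1.0 + prior_odds)
```

Under these assumptions, a test with power 0.8 and alpha 0.05 has a likelihood ratio of 16, and `reverse_bayesian_prior(16, 0.05)` gives the prior at which `false_positive_risk` equals exactly 0.05.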

1.6 | LG | Mar 23, 2021

Volume-Centred Range Bars: Novel Interpretable Representation of Financial Markets Designed for Machine Learning Applications

Artur Sokolovsky, Luca Arnaboldi, Jaume Bacardit et al.

Financial markets are a source of non-stationary multidimensional time series which has been drawing attention for decades. Each financial instrument has its specific changing-over-time properties, making its analysis a complex task. Hence, improvement of understanding and development of more informative, generalisable market representations are essential for successful operation in financial markets, including risk assessment, diversification, trading, and order execution. In this study, we propose a volume-price-based market representation for making financial time series more suitable for machine learning pipelines. We use a statistical approach for evaluating the representation. Through the research questions, we investigate i) whether the proposed representation allows more efficient design of machine learning models; ii) whether the proposed representation leads to increased performance over the price levels market pattern; iii) whether the proposed representation performs better on liquid markets; and iv) whether SHAP feature interactions are reliable to be used in the considered setting. Our analysis shows that the proposed volume-based method allows successful classification of the financial time series patterns, and also leads to better classification performance than the price levels-based method, excelling specifically on more liquid financial instruments. Finally, we propose an approach for obtaining feature interactions directly from tree-based models and compare the outcomes to those of the SHAP method. The outcomes of the two methods are significantly similar, hence we claim that SHAP feature interactions are reliable to be used in the setting of financial markets.

5.8 | HC | Nov 23, 2020

Validity and Reliability of the Scale Internet Users' Information Privacy Concern (IUIPC) [Extended Version]

Thomas Groß

Internet Users' Information Privacy Concerns (IUIPC-10) is one of the most endorsed privacy concern scales. It is widely used in the evaluation of human factors of PETs and the investigation of the privacy paradox. Even though its predecessor Concern For Information Privacy (CFIP) has been evaluated independently and the instrument itself has seen some scrutiny, we are still missing a dedicated confirmation of IUIPC-10 itself. We aim at closing this gap by systematically analyzing IUIPC's construct validity and reliability. We obtained three mutually independent samples with a total of $N = 1031$ participants. We conducted a confirmatory factor analysis (CFA) on our main sample. Having found weaknesses, we established further factor analyses to assert the dimensionality of IUIPC-10. We proposed a respecified instrument IUIPC-8 with improved psychometric properties. Finally, we validated our findings on a validation sample. While we could confirm the overall three-dimensionality of IUIPC-10, we found that IUIPC-10 consistently failed construct validity and reliability evaluations, calling into question the unidimensionality of its sub-scales Awareness and Control. Our respecified scale IUIPC-8 offers a statistically significantly better model and outperforms IUIPC-10's construct validity and reliability. The disconfirming evidence on the construct validity raises doubts about how well IUIPC-10 measures the latent variable information privacy concern. The sub-par reliability could yield spurious and erratic results as well as attenuate relations with other latent variables, such as behavior. Thereby, the instrument could confound studies of human factors of PETs or the privacy paradox in general.

5.8 | HC | Oct 5, 2020

Statistical Reliability of 10 Years of Cyber Security User Studies (Extended Version)

Thomas Groß

Background. In recent years, cyber security user studies have been appraised in meta-research, mostly focusing on the completeness of their statistical inferences and the fidelity of their statistical reporting. However, estimates of the field's distribution of statistical power and its publication bias have not received much attention. Aim. In this study, we aim to estimate the effect sizes and their standard errors present, as well as the implications for statistical power and publication bias. Method. We built upon a published systematic literature review of $146$ user studies in cyber security (2006--2016). We took into account $431$ statistical inferences including $t$-, $χ^2$-, $r$-, one-way $F$-tests, and $Z$-tests. In addition, we coded the corresponding total sample sizes, group sizes and test families. Given these data, we established the observed effect sizes and evaluated the overall publication bias. We further computed the statistical power vis-à-vis parametrized population thresholds to gain unbiased estimates of the power distribution. Results. We obtained a distribution of effect sizes and their conversion into comparable log odds ratios together with their standard errors. We further gained funnel-plot estimates of the publication bias present in the sample as well as insights into the power distribution and its consequences. Conclusions. Through the lenses of power and publication bias, we shed light on the statistical reliability of the studies in the field. The upshot of this introspection is practical recommendations on conducting and evaluating studies to advance the field.
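
Converting heterogeneous test statistics into comparable log odds ratios is usually done through a standardized mean difference. The sketch below uses common meta-analysis textbook conversions (t to Cohen's d for two independent groups, r to d, and d to log odds ratio via the logistic approximation). Whether the paper used exactly these formulas is an assumption, and the function names are hypothetical.

```python
import math

# Scaling factor between a standardized mean difference and a log odds
# ratio under the logistic-distribution approximation.
D_TO_LOG_OR = math.pi / math.sqrt(3.0)


def d_from_t(t, n1, n2):
    """Cohen's d from an independent-samples t statistic."""
    return t * math.sqrt(1.0 / n1 + 1.0 / n2)


def d_from_r(r):
    """Cohen's d from a correlation coefficient."""
    return 2.0 * r / math.sqrt(1.0 - r * r)


def variance_of_d(d, n1, n2):
    """Approximate sampling variance of d for two independent groups."""
    return (n1 + n2) / (n1 * n2) + d * d / (2.0 * (n1 + n2))


def log_odds_ratio_from_d(d, var_d):
    """Return (log odds ratio, standard error) for a given d and its variance."""
    log_or = d * D_TO_LOG_OR
    se = math.sqrt(var_d) * D_TO_LOG_OR
    return log_or, se
```

Pairs of log odds ratio and standard error on a common scale are what a funnel plot needs: one axis for the effect, one for its precision.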

2.9 | CR | Sep 25, 2020

Investigation of 3-D Secure's Model for Fraud Detection

Mohammed Aamir Ali, Thomas Groß, Aad van Moorsel

Background. 3-D Secure 2.0 (3DS 2.0) is an identity federation protocol authenticating the payment initiator for credit card transactions on the Web. Aim. We aim to quantify the impact of factors used by 3DS 2.0 in its fraud-detection decision making process. Method. We ran credit card transactions with two Web sites systematically manipulating the nominal IVs \textsf{machine\_data}, \textsf{value}, \textsf{region}, and \textsf{website}. We measured whether the user was \textsf{challenged} with an authentication, whether the transaction was \textsf{declined}, and whether the card was \textsf{blocked} as nominal DVs. Results. While \textsf{website} and \textsf{card} largely did not show a significant impact on any outcome, \textsf{machine\_data}, \textsf{value} and \textsf{region} did. A change in \textsf{machine\_data}, \textsf{region} or \textsf{value} made it 5-7 times as likely to be challenged with password authentication. However, even in a foreign region with another factor being changed, the overall likelihood of being challenged only reached $60\%$. When in the card's home region, a transaction will be rarely declined ($< 5\%$ in control, $40\%$ with one factor changed). However, in a region foreign to the card the system will more likely decline transactions anyway (about $60\%$) and any change in \textsf{machine\_data} or \textsf{value} will lead to a near-certain declined transaction. The \textsf{region} was the only significant predictor for a card being blocked ($\mathsf{OR}=3$). Conclusions. We found that the decisions to challenge the user with a password authentication, to decline a transaction and to block a card are governed by different weightings. 3DS 2.0 is most likely to decline transactions, especially in a foreign region. It is less likely to challenge users with password authentication, even if \textsf{machine\_data} or \textsf{value} are changed.
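
Effects on nominal outcomes such as \textsf{challenged} or \textsf{blocked} are commonly summarized as odds ratios. The sketch below shows how an odds ratio and a Wald confidence interval follow from a 2x2 table. The counts in the usage note are invented purely for illustration and are not data from the study; the function names are hypothetical, and the abstract does not state how its odds ratios were estimated.

```python
import math


def odds_ratio(exposed_yes, exposed_no, control_yes, control_no):
    """Odds ratio of the outcome for the exposed vs. the control condition."""
    return (exposed_yes * control_no) / (exposed_no * control_yes)


def wald_confidence_interval(exposed_yes, exposed_no, control_yes, control_no,
                             z=1.96):
    """Approximate 95% Wald interval for the odds ratio (all cells must be > 0)."""
    log_or = math.log(
        odds_ratio(exposed_yes, exposed_no, control_yes, control_no)
    )
    se = math.sqrt(
        1.0 / exposed_yes + 1.0 / exposed_no
        + 1.0 / control_yes + 1.0 / control_no
    )
    return math.exp(log_or - z * se), math.exp(log_or + z * se)
```

With the made-up counts of 30 challenged and 20 unchallenged transactions under a changed factor, against 10 challenged and 40 unchallenged in control, `odds_ratio(30, 20, 10, 40)` evaluates to 6.0.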

5.8 | HC | Sep 25, 2020

Investigation of the Effect of Fear and Stress on Password Choice (Extended Version)

Tom Fordyce, Sam Green, Thomas Groß

Background. The current cognitive state, such as cognitive effort and depletion, incidental affect or stress may impact the strength of a chosen password unconsciously. Aim. We investigate the effect of incidental fear and stress on the measured strength of a chosen password. Method. We conducted two experiments with within-subject designs measuring the Zxcvbn \textsf{log10} number of guesses as strength of chosen passwords as dependent variable. In both experiments, participants were signed up to a site holding their personal data and, for the second run a day later, asked under a security incident pretext to change their password. (a) Fear. $N_\mathsf{F} = 34$ participants were exposed to standardized fear and happiness stimulus videos in random order. (b) Stress. $N_\mathsf{S} = 50$ participants were either exposed to a battery of standard stress tasks or left in a control condition in random order. The Zxcvbn password strength was compared across conditions. Results. We did not observe a statistically significant difference in mean Zxcvbn password strengths on fear (Hedges' $g_{\mathsf{av}} = -0.11$, 95\% CI $[-0.45, 0.23]$) or stress (and control group, Hedges' $g_{\mathsf{av}} = 0.01$, 95\% CI $[-0.31, 0.33]$). However, we found a statistically significant cross-over interaction of stress and TLX mental demand. Conclusions. While having observed negligible main effect size estimates for incidental fear and stress, we offer evidence towards the interaction between stress and cognitive effort that warrants further investigation.

3.3 | HC | Jul 16, 2020

Investigation of the Effect of Incidental Fear Privacy Behavioral Intention (Technical Report)

Uchechi Phyllis Nwadike, Thomas Groß

Background. Incidental emotions users feel during their online activities may alter their privacy behavioral intentions. Aim. We investigate the effect of incidental affect (fear and happiness) on privacy behavioral intention. Method. We recruited $330$ participants for a within-subjects experiment in three random-controlled user studies. The participants were exposed to three conditions \textsf{neutral}, \textsf{fear}, \textsf{happiness} with standardised stimuli videos for incidental affect induction. Fear and happiness were assigned in random order. The participants' privacy behavioural intentions (PBI) were measured followed by a Positive and Negative Affect Schedule (PANAS-X) manipulation check on self-reported affect. The PBI and PANAS-X were compared across treatment conditions. Results. We observed a statistically significant difference in PBI and Protection Intention in neutral-fear and neutral-happy comparisons. However, across fear and happy conditions, we did not observe any statistically significant change in PBI scores. Conclusions. We offer the first systematic analysis of the impact of incidental affects on Privacy Behavioral Intention (PBI) and its sub-constructs. We are the first to offer a fine-grained analysis of neutral-affect comparisons and interactions offering insights into hitherto unexplained phenomena reported in the field.

7.2 | CR | May 26, 2020

A Survey on Hardware Approaches for Remote Attestation in Network Infrastructures

Ioannis Sfyrakis, Thomas Gross

Remote attestation schemes have been utilized for assuring the integrity of a network node to a remote verifier. In recent years, a number of remote attestation schemes have been proposed for various contexts such as cloud computing, Internet of Things (IoT) and critical network infrastructures. These attestation schemes provide a different perspective in terms of security objectives, scalability and efficiency. In this report, we focus on remote attestation schemes that use a hardware device and cryptographic primitives to assist with the attestation of nodes in a network infrastructure. We also point towards the open research challenges that await the research community and propose possible avenues of addressing these challenges.

5.2 | CR | May 26, 2020

GSL: A Cryptographic Library for the strong RSA Graph Signature Scheme

Ioannis Sfyrakis, Thomas Gross

Current cloud and network infrastructures do not employ privacy-preserving methods to protect their assets. Anonymous credential schemes are a cryptographic building block that enables the certification of data structures and the proving of properties over their representations in zero-knowledge, without disclosing the innards of those data structures. The GRaph Signature (GRS) scheme provides certification and proof methods to sign infrastructure topologies represented as graph data structures and to prove properties over their certificates in zero-knowledge. As such, it represents a powerful privacy-preserving method that proves properties over a signed topology graph to another party without disclosing the blueprint of its topology. In this paper, we report our efforts in designing, implementing and benchmarking a Graph Signature Library (GSL). GSL is a cryptographic library realized in Java that implements the graph signature scheme.

7.2 | CR | Apr 14, 2020

Fidelity of Statistical Reporting in 10 Years of Cyber Security User Studies

Thomas Groß

Studies in socio-technical aspects of security often rely on user studies and statistical inferences on investigated relations to make their case. They, thereby, enable practitioners and scientists alike to judge the validity and reliability of the research undertaken. To ascertain this capacity, we investigated the reporting fidelity of security user studies. Based on a systematic literature review of $114$ user studies in cyber security from selected venues in the 10 years 2006--2016, we evaluated fidelity of the reporting of $1775$ statistical inferences using the \textsf{R} package \textsf{statcheck}. We conducted a systematic classification of incomplete reporting, reporting inconsistencies and decision errors, leading to multinomial logistic regression (MLR) on the impact of publication venue/year as well as a comparison to a compatible field of psychology. We found that half the cyber security user studies considered reported incomplete results, in stark contrast to comparable results in a field of psychology. Our MLR on analysis outcomes yielded a slight increase of likelihood of incomplete tests over time, while SOUPS yielded a few percent greater likelihood to report statistics correctly than other venues. In this study, we offer the first fully quantitative analysis of the state-of-play of socio-technical studies in security. While we highlight the impact and prevalence of incomplete reporting, we also offer fine-grained diagnostics and recommendations on how to respond to the situation.
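
The kind of check this abstract describes, recomputing a p-value from the reported test statistic and comparing it with the reported p-value, can be illustrated for the simplest case. The sketch below is a simplified Python stand-in restricted to two-sided Z-tests. It is not the \textsf{statcheck} algorithm or API, its rounding rule is a simplifying assumption, and all names are hypothetical.

```python
import math


def two_sided_p_from_z(z):
    """Two-sided p-value of a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def check_reported_z_test(z, reported_p, decimals=2, alpha=0.05):
    """Compare a reported p-value with the one recomputed from z.

    consistent: the recomputed p, rounded to the reported precision,
        matches the reported p.
    decision_error: the report is inconsistent AND the reported and
        recomputed p fall on different sides of alpha.
    """
    recomputed = two_sided_p_from_z(z)
    tolerance = 10.0 ** -(decimals + 1)
    consistent = abs(round(recomputed, decimals) - reported_p) < tolerance
    decision_error = (not consistent) and (
        (reported_p < alpha) != (recomputed < alpha)
    )
    return {
        "recomputed_p": recomputed,
        "consistent": consistent,
        "decision_error": decision_error,
    }
```

A report that omits the test statistic or the p-value cannot be checked this way at all, which is what makes incomplete reporting a separate category from inconsistencies and decision errors.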

6.8 | CR | Jan 17, 2019 | Code

Easy to Fool? Testing the Anti-evasion Capabilities of PDF Malware Scanners

Saeed Ehteshamifar, Antonio Barresi, Thomas R. Gross et al.

Malware scanners try to protect users from opening malicious documents by statically or dynamically analyzing documents. However, malware developers may apply evasions that conceal the maliciousness of a document. Given the variety of existing evasions, systematically assessing the impact of evasions on malware scanners remains an open challenge. This paper presents a novel methodology for testing the capability of malware scanners to cope with evasions. We apply the methodology to malicious Portable Document Format (PDF) documents and present an in-depth study of how current PDF evasions affect 41 state-of-the-art malware scanners. The study is based on a framework for creating malicious PDF documents that use one or more evasions. Based on such documents, we measure how effective different evasions are at concealing the maliciousness of a document. We find that many static and dynamic scanners can be easily fooled by relatively simple evasions and that the effectiveness of different evasions varies drastically. Our work not only is a call to arms for improving current malware scanners, but by providing a large-scale corpus of malicious PDF documents with evasions, we directly support the development of improved tools to detect document-based malware. Moreover, our methodology paves the way for a quantitative evaluation of evasions in other kinds of malware.

3.2 | CR | Jun 19, 2015

Towards a New Paradigm for Privacy and Security in Cloud Services

Thomas Loruenser, Charles Bastos Rodriguez, Denise Demirel et al.

The market for cloud computing can be considered as the major growth area in ICT. However, big companies and public authorities are reluctant to entrust their most sensitive data to external parties for storage and processing. The reason for their hesitation is clear: There exist no satisfactory approaches to adequately protect the data during its lifetime in the cloud. The EU Project Prismacloud (Horizon 2020 programme; duration 2/2015-7/2018) addresses these challenges and yields a portfolio of novel technologies to build security enabled cloud services, guaranteeing the required security with the strongest notion possible, namely by means of cryptography. We present a new approach towards a next generation of security and privacy enabled services to be deployed in only partially trusted cloud infrastructures.

3.7 | CR | Jul 2, 2014

Lockdown: Dynamic Control-Flow Integrity

Mathias Payer, Antonio Barresi, Thomas R. Gross

Applications written in low-level languages without type or memory safety are especially prone to memory corruption. Attackers gain code execution capabilities through such applications despite all currently deployed defenses by exploiting memory corruption vulnerabilities. Control-Flow Integrity (CFI) is a promising defense mechanism that restricts open control-flow transfers to a static set of well-known locations. We present Lockdown, an approach to dynamic CFI that protects legacy, binary-only executables and libraries. Lockdown adaptively learns the control-flow graph of a running process using information from a trusted dynamic loader. The sandbox component of Lockdown restricts interactions between different shared objects to imported and exported functions by enforcing fine-grained CFI checks. Our prototype implementation shows that dynamic CFI results in low performance overhead.