Julia Rubin

4papers

100citations

Novelty20%

AI Score18

Ranked #199,442 of 205,806 authors (top 97%)#2,956 in SE (top 86%)

4 Papers

LGMar 17, 2023

It Is All About Data: A Survey on the Effects of Data on Adversarial Robustness

Peiyu Xiong, Michael Tegegn, Jaskeerat Singh Sarin et al.

Adversarial examples are inputs to machine learning models that an attacker has intentionally designed to confuse the model into making a mistake. Such examples pose a serious threat to the applicability of machine-learning-based systems, especially in life- and safety-critical domains. To address this problem, the area of adversarial robustness investigates mechanisms behind adversarial attacks and defenses against these attacks. This survey reviews a particular subset of this literature that focuses on investigating properties of training data in the context of model robustness under evasion attacks. It first summarizes the main properties of data leading to adversarial vulnerability. It then discusses guidelines and techniques for improving adversarial robustness by enhancing the data representation and learning procedures, as well as techniques for estimating robustness guarantees given particular data. Finally, it discusses gaps of knowledge and promising future research directions in this area.

SEJun 15, 2021

AndroR2: A Dataset of Manually Reproduced Bug Reports for Android Applications

Tyler Wendland, Jingyang Sun, Junayed Mahmud et al.

Software maintenance constitutes a large portion of the software development lifecycle. To carry out maintenance tasks, developers often need to understand and reproduce bug reports. As such, there has been increasing research activity coalescing around the notion of automating various activities related to bug reporting. A sizable portion of this research interest has focused on the domain of mobile apps. However, as research around mobile app bug reporting progresses, there is a clear need for a manually vetted and reproducible set of real-world bug reports that can serve as a benchmark for future work. This paper presents ANDROR2: a dataset of 90 manually reproduced bug reports for Android apps listed on Google Play and hosted on GitHub, systematically collected via an in-depth analysis of 459 reports extracted from the GitHub issue tracker. For each reproduced report, ANDROR2 includes the original bug report, an apk file for the buggy version of the app, an executable reproduction script, and metadata regarding the quality of the reproduction steps associated with the original report. We believe that the ANDROR2 dataset can be used to facilitate research in automatically analyzing, understanding, reproducing, localizing, and fixing bugs for mobile applications as well as other software maintenance activities more broadly.

SEJul 14, 2019

Predicting Merge Conflicts in Collaborative Software Development

Moein Owhadi-Kareshk, Sarah Nadi, Julia Rubin

Background. During collaborative software development, developers often use branches to add features or fix bugs. When merging changes from two branches, conflicts may occur if the changes are inconsistent. Developers need to resolve these conflicts before completing the merge, which is an error-prone and time-consuming process. Early detection of merge conflicts, which warns developers about resolving conflicts before they become large and complicated, is among the ways of dealing with this problem. Existing techniques do this by continuously pulling and merging all combinations of branches in the background to notify developers as soon as a conflict occurs, which is a computationally expensive process. One potential way for reducing this cost is to use a machine-learning based conflict predictor that filters out the merge scenarios that are not likely to have conflicts, ie safe merge scenarios. Aims. In this paper, we assess if conflict prediction is feasible. Method. We design a classifier for predicting merge conflicts, based on 9 light-weight Git feature sets. To evaluate our predictor, we perform a large-scale study on 267, 657 merge scenarios from 744 GitHub repositories in seven programming languages. Results. Our results show that we achieve high f1-scores, varying from 0.95 to 0.97 for different programming languages, when predicting safe merge scenarios. The f1-score is between 0.57 and 0.68 for the conflicting merge scenarios. Conclusions. Predicting merge conflicts is feasible in practice, especially in the context of predicting safe merge scenarios as a pre-filtering step for speculative merging.

SEMar 28, 2016

Proceedings 7th International Workshop on Formal Methods and Analysis in Software Product Line Engineering

Julia Rubin, Thomas Thüm

In Software Product Line Engineering (SPLE), a portfolio of similar systems is developed from a shared set of software assets. Claimed benefits of SPLE include reductions in the portfolio size, cost of software development and time to production, as well as improvements in the quality of the delivered systems. Yet, despite these benefits, SPLE is still in the early adoption stage. We believe that automated approaches, tools and techniques that provide better support for SPLE activities can further facilitate its adoption in practice and increase its benefits. To promote work in this area, the FMSPLE'16 workshop focuses on automated analysis and formal methods, which can (1) lead to a further increase in development productivity and reduction in maintenance costs associated with management of the SPLE artifacts, and (2) provide proven guarantees for the correctness and quality of the delivered systems.