SE Aug 27, 2021
An Experimental Analysis of Graph-Distance Algorithms for Comparing API Usages
Sebastian Nielebock, Paul Blockhaus, Jacob Krüger et al.
Modern software development heavily relies on the reuse of functionalities through Application Programming Interfaces (APIs). However, client developers can have issues identifying the correct usage of a certain API, causing misuses accompanied by software crashes or usability bugs. Therefore, researchers have aimed at identifying API misuses automatically by comparing client code usages to correct API usages. Some techniques rely on certain API-specific graph-based data structures to improve the abstract representation of API usages. Such techniques need to compare graphs, for instance, by computing distance metrics based on the minimal graph edit distance or the largest common subgraphs, whose computations are known to be NP-hard problems. Fortunately, there exist many abstractions for simplifying graph distance computation. However, their applicability for comparing graph representations of API usages has not been analyzed. In this paper, we provide a comparison of different distance algorithms for API-usage graphs regarding correctness and runtime. Particularly, correctness relates to the algorithms' ability to identify similar correct API usages, but also to discriminate between similar correct and incorrect usages as well as non-similar usages. For this purpose, we systematically identified a set of eight graph-based distance algorithms and applied them to two datasets of real-world API usages and misuses. Interestingly, our results suggest that existing distance algorithms are not reliable for comparing API usage graphs. To improve on this situation, we identify and discuss the algorithms' issues, based on which we formulate hypotheses to initiate research on overcoming them.
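To make the idea of a simplified graph distance concrete, the following is a minimal sketch of one cheap abstraction: a weighted Jaccard distance over the node labels and labeled edges of two API-usage graphs. It is not one of the eight algorithms evaluated in the paper. The graph encoding, the function names, the `node_weight` parameter, and the `Iterator` example are all assumptions made for illustration.

```python
from typing import FrozenSet, Set, Tuple

# Assumed encoding: an edge is (source label, edge kind, target label).
Edge = Tuple[str, str, str]


def jaccard_distance(a: Set, b: Set) -> float:
    """Return 1 - |a ∩ b| / |a ∪ b|; two empty sets have distance 0."""
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


def usage_graph_distance(
    nodes_a: Set[str],
    edges_a: Set[Edge],
    nodes_b: Set[str],
    edges_b: Set[Edge],
    node_weight: float = 0.5,
) -> float:
    """Weighted mix of node-label and edge Jaccard distances, in [0, 1].

    This avoids the NP-hard exact computation, but it ignores repeated
    labels and any structure beyond single edges.
    """
    node_part = jaccard_distance(nodes_a, nodes_b)
    edge_part = jaccard_distance(edges_a, edges_b)
    return node_weight * node_part + (1.0 - node_weight) * edge_part


# Hypothetical correct usage: check hasNext() before calling next().
nodes_correct = {"Iterator.hasNext", "Iterator.next"}
edges_correct = {("Iterator.hasNext", "control", "Iterator.next")}

# Hypothetical misuse: next() is called without the guard.
nodes_misuse = {"Iterator.next"}
edges_misuse: Set[Edge] = set()

d_same = usage_graph_distance(nodes_correct, edges_correct,
                              nodes_correct, edges_correct)
d_misuse = usage_graph_distance(nodes_correct, edges_correct,
                                nodes_misuse, edges_misuse)
print(d_same, d_misuse)
```

A set-based distance of this kind runs in linear time, yet it cannot tell apart graphs that share labels but differ in larger structure, which is one way such abstractions may lose the precision needed to separate correct usages from misuses.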
SE Mar 17, 2021
AndroidCompass: A Dataset of Android Compatibility Checks in Code Repositories
Sebastian Nielebock, Paul Blockhaus, Jacob Krüger et al.
Many developers and organizations implement apps for Android, the most widely used operating system for mobile devices. Common problems developers face are the various hardware devices, customized Android variants, and frequent updates, forcing them to implement workarounds for the different versions and variants of Android APIs used in practice. In this paper, we contribute the Android Compatibility checkS dataSet (AndroidCompass) that comprises changes to compatibility checks developers use to enforce workarounds for specific Android versions in their apps. We extracted 80,324 changes to compatibility checks from 1,394 apps by analyzing the version histories of 2,399 projects from the F-Droid catalog. With AndroidCompass, we aim to provide data on when and how developers introduced or evolved workarounds to handle Android incompatibilities. We hope that AndroidCompass fosters research to deal with version incompatibilities, address potential design flaws, identify security concerns, and help derive solutions for other developers, among others. This should help researchers develop and evaluate novel techniques, and support Android app as well as operating-system developers in engineering their software.
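As an illustration of what a compatibility check looks like, the sketch below scans Java source text for comparisons against `Build.VERSION.SDK_INT`, the Android field that holds the platform API level. This is not the extraction procedure used to build AndroidCompass. The regular expression, the function name `find_compatibility_checks`, and the sample snippet (including its `legacyFallback` helper) are assumptions for illustration only.

```python
import re
from typing import List, Tuple

# Matches "Build.VERSION.SDK_INT <op> <operand>", where the operand is
# either a dotted identifier (e.g. Build.VERSION_CODES.M) or an integer.
SDK_CHECK = re.compile(
    r"Build\.VERSION\.SDK_INT\s*(<=|>=|==|!=|<|>)\s*([A-Za-z_][\w.]*|\d+)"
)


def find_compatibility_checks(java_source: str) -> List[Tuple[str, str]]:
    """Return (operator, operand) pairs for each SDK_INT comparison found."""
    return [(m.group(1), m.group(2)) for m in SDK_CHECK.finditer(java_source)]


# Hypothetical app code with two version-specific branches.
sample = """
if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.M) {
    requestPermissions(perms, 1);
} else if (Build.VERSION.SDK_INT < 21) {
    legacyFallback();
}
"""

checks = find_compatibility_checks(sample)
print(checks)
```

A text-based scan like this misses checks written with the operands reversed or hidden behind helper methods, so it serves only to show the kind of code the dataset is about.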
SE Aug 1, 2020
Guided Pattern Mining for API Misuse Detection by Change-Based Code Analysis
Sebastian Nielebock, Robert Heumüller, Kevin Michael Schott et al.
Lack of experience, inadequate documentation, and sub-optimal API design frequently cause developers to make mistakes when re-using third-party implementations. Such API misuses can result in unintended behavior, performance losses, or software crashes. Therefore, current research aims to automatically detect such misuses by comparing the way a developer used an API to previously inferred patterns of the correct API usage. While research has made significant progress, these techniques have not yet been adopted in practice. In part, this is due to the lack of a process capable of seamlessly integrating with software development processes. Particularly, existing approaches do not consider how to collect relevant source code samples from which to infer patterns. In fact, an inadequate collection can cause API usage pattern miners to infer irrelevant patterns, which leads to false alarms instead of finding true API misuses. In this paper, we target this problem (a) by providing a method that increases the likelihood of finding relevant and true-positive patterns concerning a given set of code changes and that is agnostic to the concrete static, intra-procedural mining technique and (b) by introducing a concept for just-in-time API misuse detection which analyzes changes at the time of commit. Particularly, we introduce different, lightweight code search and filtering strategies and evaluate them on two real-world API misuse datasets to determine their usefulness in finding relevant intra-procedural API usage patterns. Our main results are that (1) commit-based search with subsequent filtering effectively decreases the amount of code to be analyzed, (2) in particular method-level filtering is superior to file-level filtering, (3) project-internal and project-external code search find solutions for different types of misuses and thus are complementary, (4) incorporating prior knowledge of the misused [...]