SE, CR · Mar 12, 2019

BenchPress: Analyzing Android App Vulnerability Benchmark Suites

arXiv:1903.05170v4
Originality: Synthesis-oriented
AI Analysis

This work addresses the need for informed benchmark selection in Android security analysis. Its contribution is incremental: it analyzes existing suites rather than proposing new methods.

The paper addresses how to select Android vulnerability benchmark suites for evaluating security tools. It empirically analyzes four suites (DroidBench, Ghera, IccBench, UBCBench) in terms of the APIs they use, measures how often those APIs appear in 227K real-world apps, and reports coverage and gaps to inform tool developers and benchmark creators.

In recent years, various benchmark suites have been developed to evaluate the efficacy of Android security analysis tools. The choice of benchmark suites for tool evaluations is often based on the availability and popularity of the suites rather than on their characteristics and relevance. One reason for such choices is the lack of information about the characteristics and relevance of benchmark suites. In this context, we empirically evaluated four Android-specific benchmark suites: DroidBench, Ghera, IccBench, and UBCBench. For each benchmark suite, we identified the APIs used by the suite that were discussed on Stack Overflow in the context of Android app development and measured the usage of these APIs in a sample of 227K real-world apps (coverage). We also compared each pair of benchmark suites to identify the differences between them in terms of API usage. Finally, we identified security-related APIs used in real-world apps but not in any of the above benchmark suites to assess the opportunities to extend benchmark suites (gaps). The findings in this paper can help 1) Android security analysis tool developers choose benchmark suites that are best suited to evaluate their tools (informed by coverage and pairwise comparison) and 2) Android app vulnerability benchmark creators develop and extend benchmark suites (informed by gaps).
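
The three measurements named in the abstract (coverage, pairwise comparison, gaps) can be pictured as set operations over API names. The following is a minimal sketch under stated assumptions, not the authors' tooling: the API names, the toy app sample, the `security_apis` set, and all function names are hypothetical, coverage is assumed here to mean the fraction of sampled apps that use each suite API, and the Stack Overflow filtering step is omitted.

```python
from itertools import combinations

# Hypothetical toy data: API names used by each suite (not taken from the paper).
suite_apis = {
    "DroidBench": {"sendBroadcast", "getDeviceId", "startActivity"},
    "Ghera": {"sendBroadcast", "openFileOutput", "loadUrl"},
}

# Hypothetical sample: the set of APIs used by each real-world app.
app_apis = [
    {"sendBroadcast", "startActivity", "checkPermission"},
    {"loadUrl", "startActivity"},
    {"getDeviceId", "checkPermission"},
    {"startActivity"},
]

# Hypothetical list of APIs considered security-related.
security_apis = {"checkPermission", "getDeviceId", "loadUrl"}


def coverage(suite, apps):
    """Map each suite API to the fraction of sampled apps that use it."""
    return {api: sum(api in app for app in apps) / len(apps) for api in sorted(suite)}


def pairwise_differences(suites):
    """For each pair of suites, list the APIs unique to each side."""
    return {
        (a, b): (suites[a] - suites[b], suites[b] - suites[a])
        for a, b in combinations(sorted(suites), 2)
    }


def gaps(suites, apps, security_related):
    """Security-related APIs seen in apps but used by no suite."""
    used_in_apps = set().union(*apps)
    used_in_suites = set().union(*suites.values())
    return (security_related & used_in_apps) - used_in_suites


droidbench_coverage = coverage(suite_apis["DroidBench"], app_apis)
differences = pairwise_differences(suite_apis)
missing = gaps(suite_apis, app_apis, security_apis)
```

On this toy input, `missing` contains only `checkPermission`, the one security-related API that apps use and neither suite exercises. A real analysis would extract the API sets from the benchmark and app binaries instead of hard-coding them.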
