Michael Schlichtig

2.7 SE Apr 24

Challenges in Android Data Disclosure: An Empirical Study

Mugdha Khedkar, Michael Schlichtig, Mohamed Soliman et al.

Current legal frameworks require Android developers to accurately report the data their apps collect. However, large codebases can make this reporting challenging. This paper employs an empirical approach to understand developers' experience with Google Play Store's Data Safety Section (DSS) form. We first survey 41 Android developers to understand how they categorize privacy-related data into DSS categories and how confident they feel when completing the DSS form. To gain a broader and more detailed view of the challenges developers encounter during the process, we complement the survey with an analysis of 172 online developer discussions, capturing the perspectives of 642 additional developers. Together, these two data sources represent insights from 683 developers. Our findings reveal that developers often manually classify the privacy-related data their apps collect into the data categories defined by Google (or, in some cases, omit classification entirely) and rely heavily on existing online resources when completing the form. Moreover, developers are generally confident in recognizing the data their apps collect, yet they lack confidence in translating this knowledge into DSS-compliant disclosures. Key challenges include difficulty identifying privacy-relevant data to complete the form, limited understanding of the form, and concerns about app rejection due to discrepancies with Google's privacy requirements. These results underscore the need for clearer guidance and more accessible tooling to support developers in meeting privacy-aware reporting obligations.

0.6 SE Mar 11

FP-Predictor - False Positive Prediction for Static Analysis Reports

Tom Ohlmer, Michael Schlichtig, Eric Bodden

Static Application Security Testing (SAST) tools play a vital role in modern software development by automatically detecting potential vulnerabilities in source code. However, their effectiveness is often limited by a high rate of false positives, which wastes developers' effort and undermines trust in automated analysis. This work presents a Graph Convolutional Network (GCN) model designed to classify SAST reports as true or false positives. The model leverages Code Property Graphs (CPGs) constructed from static analysis results to capture both structural and semantic relationships within code. Trained on the CamBenchCAP dataset, the model achieved an accuracy of 100% on the test set using an 80/20 train-test split. Evaluation on the CryptoAPI-Bench benchmark further demonstrated the model's practical applicability, reaching an overall accuracy of up to 96.6%. A detailed qualitative inspection revealed that many cases marked as misclassifications corresponded to genuine security weaknesses, indicating that the model effectively reflects conservative, security-aware reasoning. Identified limitations include incomplete control-flow representation due to missing interprocedural connections. Future work will focus on integrating call graphs, applying graph explainability techniques, and extending training data across multiple SAST tools to improve generalization and interpretability.
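
For readers unfamiliar with graph convolution, the following is a minimal sketch of a single generic GCN propagation step, ReLU(D^-1/2 (A + I) D^-1/2 H W), on a toy three-node graph. It is not the authors' model: the graph, the one-dimensional node features, the weight matrix, and the names `matmul` and `gcn_layer` are all assumptions made for illustration, and a real Code Property Graph would carry far richer node and edge information.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def gcn_layer(adj, feats, weight):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    n = len(adj)
    # Add self-loops so each node keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    # Symmetric normalisation by node degree (self-loop included, so degree >= 1).
    deg = [sum(row) for row in a_hat]
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    # Aggregate neighbour features, apply the linear map, then ReLU.
    out = matmul(matmul(norm, feats), weight)
    return [[max(0.0, v) for v in row] for row in out]


# Hypothetical toy graph: a path 0 - 1 - 2 (e.g. three connected code nodes).
adj = [[0.0, 1.0, 0.0],
       [1.0, 0.0, 1.0],
       [0.0, 1.0, 0.0]]
feats = [[1.0], [0.0], [0.0]]   # only node 0 carries a signal
weight = [[1.0]]                # identity weight, chosen for readability

out = gcn_layer(adj, feats, weight)
```

After one step, the signal on node 0 has spread to its direct neighbour (node 1) but not to node 2, which is two hops away. Stacking layers widens this neighbourhood, which is one reason missing interprocedural edges would limit what such a model can see.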
