Yida Tao

2 Papers

67.2 · SE · Apr 16 · Code
Towards Understanding Android APIs: Official Lists, Vendor Customizations, and Real-World Usage

Sinan Wang, Qi Zhang, Jiacheng Li et al.

Android apps are built on APIs that abstract core Android system functionalities. These APIs are officially documented in multiple files distributed with the Android source code or SDK, which we collectively refer to as Android API Lists (AALs). Prior Android research has relied on specific AALs, often treating them as interchangeable ground truth. However, recent studies suggest that different AALs can lead to substantially different research outcomes, raising concerns about the validity and reproducibility of Android API-based analyses. To address this issue, we present the first in-depth empirical study of four official AALs that are widely used in prior work. We systematically characterize their contents and analyze their evolution across Android releases. We then perform a fine-grained comparison of the APIs recorded in each AAL to uncover their underlying API inclusion policies and inconsistencies. To assess the practical impact of these differences, we further examine API availability on nine Android devices, including both stock Android and vendor-customized systems. Finally, we analyze API usage in 17,759 real-world Android apps (including open-source apps, commercial apps, and malware) to quantify how the choice of AAL affects empirical Android research. Our results reveal that official AALs are neither stable nor mutually consistent, and that discrepancies among them can substantially influence research conclusions. We also observe that vendor-customized APIs are actively used by normal apps, yet remain largely overlooked by existing studies. Based on these findings, we discuss their implications for Android API-based research and provide actionable suggestions to help researchers select and interpret AALs more reliably.

CV · Jul 29, 2025
Bridging Synthetic and Real-World Domains: A Human-in-the-Loop Weakly-Supervised Framework for Industrial Toxic Emission Segmentation

Yida Tao, Yen-Chia Hsu

Industrial smoke segmentation is critical for air-quality monitoring and environmental protection but is often hampered by the high cost and scarcity of pixel-level annotations in real-world settings. We introduce CEDANet, a human-in-the-loop, class-aware domain adaptation framework that integrates weak, citizen-provided video-level labels with adversarial feature alignment. Specifically, we refine pseudo-labels generated by a source-trained segmentation model using citizen votes, and employ class-specific domain discriminators to transfer rich source-domain representations to the industrial domain. Comprehensive experiments on SMOKE5K and custom IJmond datasets demonstrate that CEDANet achieves an F1-score of 0.414 and a smoke-class IoU of 0.261 with citizen feedback, far outperforming the baseline model, which scored 0.083 and 0.043, respectively. This represents a roughly five-fold increase in F1-score and a roughly six-fold increase in smoke-class IoU. Notably, CEDANet with citizen-constrained pseudo-labels achieves performance comparable to the same architecture trained on a limited set of 100 fully annotated images (F1-score of 0.418 and IoU of 0.264), demonstrating that it can match the accuracy of small-sample full supervision without pixel-level target-domain annotations. Our research validates the scalability and cost-efficiency of combining citizen science with weakly supervised domain adaptation, offering a practical solution for complex, data-scarce environmental monitoring applications.