CR · Jan 5, 2018

Understanding Android Obfuscation Techniques: A Large-Scale Investigation in the Wild

Shuaike Dong, Menghao Li, Wenrui Diao, Xiangyu Liu, Jian Liu, Zhou Li, Fenghao Xu, Kai Chen, Xiaofeng Wang, Kehuan Zhang

arXiv:1801.01633v1 · 18.117 citations

Originality: Synthesis-oriented

AI Analysis

This study helps developers choose obfuscation methods and researchers improve code analysis systems, but it is incremental as it applies existing detection models to new data.

The paper presents a large-scale investigation of Android obfuscation in the wild, applying detection models for four popular techniques to massive APK datasets. It finds that malware authors use string encryption more frequently and that more apps on third-party markets than on Google Play are packed.

In this paper, we seek to better understand Android obfuscation and depict a holistic view of the usage of obfuscation through a large-scale investigation in the wild. In particular, we focus on four popular obfuscation approaches: identifier renaming, string encryption, Java reflection, and packing. To obtain meaningful statistical results, we designed efficient and lightweight detection models for each obfuscation technique and applied them to our massive APK datasets (collected from Google Play, multiple third-party markets, and malware databases). We have learned several interesting facts from the results. For example, malware authors use string encryption more frequently, and more apps on third-party markets than on Google Play are packed. We are also interested in the explanation behind each finding. Therefore, we carry out in-depth code analysis on a sample of Android apps. We believe our study will help developers select the most suitable obfuscation approach and, at the same time, help researchers improve code analysis systems in the right direction.
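To make the idea of a lightweight detector concrete, here is a minimal Python sketch of one possible heuristic for identifier renaming. It is not the detection model from the paper, which the abstract does not describe. The function names, the regular expression, and the 0.5 threshold are all assumptions chosen for illustration. The sketch relies only on the observation that renaming tools such as ProGuard tend to emit very short names like `a`, `b`, or `ab`.

```python
import re

# Hypothetical heuristic, NOT the paper's detection model: treat an identifier
# as "renamed" if it consists of one or two lowercase letters, the style that
# renaming tools commonly emit (a, b, aa, ab, ...).
SHORT_NAME = re.compile(r"^[a-z]{1,2}$")


def renamed_ratio(identifiers):
    """Return the fraction of identifiers that look machine-generated."""
    names = list(identifiers)
    if not names:
        return 0.0
    hits = sum(1 for name in names if SHORT_NAME.match(name))
    return hits / len(names)


def looks_renamed(identifiers, threshold=0.5):
    """Flag an app when at least `threshold` of its identifiers are short names.

    The default threshold is an arbitrary assumption for this sketch.
    """
    return renamed_ratio(identifiers) >= threshold
```

A rule this simple would misclassify legitimately short names such as `id` or `to`, so a practical detector would likely need additional signals; the sketch only shows the general shape of a per-app check that is cheap enough to run over a large dataset.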
