Scalably Detecting Third-Party Android Libraries With Two-Stage Bloom Filtering

Jianjun Huang, Bo Xue, Jiasheng Jiang, Wei You, Bin Liang, Jingzheng Wu, Yanjun Wu

IEEE Transactions on Software Engineering (2023)

Abstract

Third-party library (TPL) detection is important for Android app security analysis. Unfortunately, existing techniques often scale poorly, and in some situations their detection time is unacceptable. Although a few existing methods run relatively fast, they are not effective enough, especially for apps obfuscated in non-structure-preserving ways, e.g., repackaged and flattened apps. In this paper, we treat TPL detection as a set inclusion problem to analyze obfuscated apps effectively and efficiently, and develop a scalable two-stage detection approach, Libloom. Specifically, the package and class signatures are encoded into two levels of Bloom filters, respectively. In the first stage, the package filters are used to identify a limited number of candidate TPLs via set overlap measurement, avoiding unnecessary class-level set analysis. Subsequently, with the class filters, a similarity score is computed between the query app and each candidate to detect the integrated TPLs, and a novel entropy-based metric is presented to specifically handle repackaged and flattened apps. We have evaluated Libloom on large-scale benchmarks involving tens of thousands of TPL instances. The experimental results demonstrate that Libloom outperforms state-of-the-art tools in both effectiveness and efficiency. In particular, the proposed two-stage method runs about ten times faster than straightforward class-level analysis on flattened apps, without loss of accuracy.
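The two-stage idea described in the abstract can be illustrated with a minimal Python sketch. This is not Libloom's implementation: the filter size, the hash construction, both thresholds, the bit-overlap score, and the placeholder signature strings are all assumptions made for illustration, and the entropy-based metric for repackaged and flattened apps is omitted because the abstract does not define it.

```python
import hashlib

M_BITS = 1 << 16   # filter size in bits (assumed value)
K_HASHES = 3       # hash positions per signature (assumed value)


def bloom(signatures):
    """Encode signature strings into a Bloom filter stored as an int bitset."""
    bits = 0
    for sig in signatures:
        for seed in range(K_HASHES):
            digest = hashlib.sha256(f"{seed}:{sig}".encode()).digest()
            bits |= 1 << (int.from_bytes(digest[:8], "big") % M_BITS)
    return bits


def inclusion(lib_bits, app_bits):
    """Fraction of the library's set bits that are also set in the app filter."""
    lib_count = bin(lib_bits).count("1")
    if lib_count == 0:
        return 0.0
    return bin(lib_bits & app_bits).count("1") / lib_count


def detect(app, libs, pkg_threshold=0.5, cls_threshold=0.8):
    """Return {library name: class-level score} for libraries passing both stages."""
    app_pkg = bloom(app["packages"])
    app_cls = bloom(app["classes"])
    # Stage 1: cheap package-level overlap keeps only a few candidates.
    candidates = [
        name for name, lib in libs.items()
        if inclusion(bloom(lib["packages"]), app_pkg) >= pkg_threshold
    ]
    # Stage 2: class-level similarity is computed for the candidates only.
    results = {}
    for name in candidates:
        score = inclusion(bloom(libs[name]["classes"]), app_cls)
        if score >= cls_threshold:
            results[name] = score
    return results


# Hypothetical signatures; a real tool would derive them from bytecode features.
app = {
    "packages": ["pkgsig-a1", "pkgsig-a2", "pkgsig-x"],
    "classes": ["clssig-a1", "clssig-a2", "clssig-a3", "clssig-x1"],
}
libs = {
    "libA": {"packages": ["pkgsig-a1", "pkgsig-a2"],
             "classes": ["clssig-a1", "clssig-a2", "clssig-a3"]},
    "libB": {"packages": ["pkgsig-b1"],
             "classes": ["clssig-b1", "clssig-b2"]},
}
detected = detect(app, libs)
```

In this sketch a library whose signatures are all present in the app scores exactly 1.0, because every bit set by the library is also set by the app. An unrelated library can only score above zero through chance bit collisions, which the assumed filter size keeps rare. In practice the library filters would be built once and reused across query apps rather than rebuilt on each call.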

Keywords

Codes, Libraries, Scalability, Task analysis, Security, Benchmark testing, Measurement, Third-party library, non-structure-preserving obfuscation, set inclusion, Bloom filter, entropy
