DAISY: Dynamic-Analysis-Induced Source Discovery for Sensitive Data

Xueling Zhang, John Heaps, Rocky Slavin, Jianwei Niu, Travis D. Breaux, Xiaoyin Wang

ACM Transactions on Software Engineering and Methodology (2022)

Abstract

Mobile apps are widely used and often process users’ sensitive data. Many taint analysis tools have been applied to analyze sensitive information flows and report data leaks in apps. These tools require a list of sources (where sensitive data is accessed) as input, and researchers have constructed such lists within the Android platform by identifying Android API methods that allow access to sensitive data. However, app developers may also define their own methods, or use methods from third-party libraries, to access data. It is difficult to collect such source methods because they are unique to the apps, and because the large number of third-party libraries available on the market evolve over time. To address this problem, we propose DAISY, a Dynamic-Analysis-Induced Source discoverY approach for identifying methods that return sensitive information from apps and third-party libraries. Trained on an automatically labeled data set of methods and their calling context, DAISY identifies sensitive methods in unseen apps. We evaluated DAISY on real-world apps and the results show that DAISY can achieve an overall precision of 77.9% when reporting the most confident results. Most of the identified sources and leaks cannot be detected by existing technologies.
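
The source-list gap described in the abstract can be illustrated with a minimal sketch. This is not DAISY's implementation: the `Method` type, the `is_known_source` helper, and the `com.example.app.UserProfile.getEmail` method are hypothetical names introduced only for illustration. The sketch assumes a taint tool that treats a method as a source only if it appears in a fixed platform-level list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Method:
    """Hypothetical record identifying a method by declaring class and name."""
    owner: str
    name: str


# A platform-level source list of the kind a taint tool takes as input.
# Both entries are real Android API methods; the list itself is illustrative.
PLATFORM_SOURCES = frozenset({
    Method("android.telephony.TelephonyManager", "getDeviceId"),
    Method("android.location.Location", "getLatitude"),
})


def is_known_source(method, sources=PLATFORM_SOURCES):
    """Return True if the method appears in the given source list."""
    return method in sources


# Hypothetical app-defined method that returns sensitive data.
app_method = Method("com.example.app.UserProfile", "getEmail")

# The platform list alone does not cover the app-defined method.
missed = not is_known_source(app_method)

# Adding newly discovered methods to the list closes the gap.
extended_sources = PLATFORM_SOURCES | {app_method}
covered = is_known_source(app_method, extended_sources)
```

Under these assumptions, `missed` and `covered` are both true: the app-defined method is invisible to a tool that relies only on the platform list, and becomes a tracked source once a discovery step adds it.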

Keywords

Privacy leak, mobile application, natural language processing
