Analyzing temporal graphs of malware distribution networks

(2022)

Abstract
This research provides temporal insight into the network topological structures, transitional properties, and malware attribution of malware distribution networks. It does so using a temporal data set created through a novel data fusion of publicly available data sources. We developed and used a crawler, along with public APIs, to collect publicly available data on malicious top-level domains and the malware they hosted from Google Safe Browsing (GSB) and VirusTotal (VT) over an eight-month period between 19 January and 25 September 2017. Combining these data sources revealed new insight into the topological network structure properties and temporal transitions of fully qualified domain names that is not apparent from the individual data sources. We provide the technical details of our novel data fusion approach to GSB and VT static data. The fusion produced new observable knowledge, primarily temporal structural changes and malware attribution within the distribution network, that is not available by analyzing the static GSB and VT data in isolation. Data revealing the details of malware hosting on a domain brought to light topological structures of fully qualified domain names involved in distributing malicious files. Our insights include: 1) malware distribution networks form clusters that follow a power law; 2) network structure components such as bridges and hubs (both concepts are presented and defined in this paper), together with URL shortening providers, play significant roles in malware distribution dynamics; 3) the persistence of fully qualified domain names in malware distribution is random, and a domain name is often used only once to host malware; 4) a large number of unique downloaded malicious files hosted on various nodes in a malware distribution network belong to a much smaller set of malware families. These insights revealed the continued persistence of surrounding topological structures.
These topological structures were streaming malicious data flows to fully qualified domain names identified as actively hosting malware, both before and after the date of identification. The insights further suggest that large topological structures whose data flows distribute malware persist over time with only small sub-structural changes. Based on our temporal observations of bridge and hub structures, we provide suggestions for preventing sustained malicious data flows. Individual persistent fully qualified domain names within these large structures repeatedly served as either a source or an intermediate node of malicious data flows. This implies that continued monitoring of data flows can serve to raise alerts on early-stage malware distribution.
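
The kind of temporal analysis summarized above can be illustrated with a minimal sketch. It is not the authors' pipeline: the observation records, the `.example` domain names, and the helper names (`snapshots`, `cluster_sizes`, `persistence`, `fan_out_nodes`) are all hypothetical, and the fan-out threshold is only a rough stand-in for the hub concept, which the paper defines in its own terms. The sketch assumes each observation is a dated, directed edge between two fully qualified domain names.

```python
from collections import defaultdict

# Hypothetical observations: (ISO date, source FQDN, destination FQDN).
# Invented for illustration; not taken from the paper's data set.
OBSERVATIONS = [
    ("2017-01-19", "short.example", "a.example"),
    ("2017-01-19", "short.example", "b.example"),
    ("2017-01-19", "a.example", "payload.example"),
    ("2017-01-20", "short.example", "c.example"),
    ("2017-01-20", "c.example", "payload.example"),
    ("2017-01-20", "x.example", "y.example"),
]


def snapshots(observations):
    """Group directed edges by observation date (one edge set per day)."""
    by_day = defaultdict(set)
    for day, src, dst in observations:
        by_day[day].add((src, dst))
    return dict(by_day)


def cluster_sizes(edges):
    """Sizes of weakly connected components, largest first."""
    neighbors = defaultdict(set)
    for src, dst in edges:
        neighbors[src].add(dst)
        neighbors[dst].add(src)
    seen, sizes = set(), []
    for start in neighbors:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:  # iterative depth-first traversal of one component
            node = stack.pop()
            size += 1
            for nxt in neighbors[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        sizes.append(size)
    return sorted(sizes, reverse=True)


def persistence(observations):
    """Number of distinct days on which each FQDN appears in any edge."""
    days = defaultdict(set)
    for day, src, dst in observations:
        days[src].add(day)
        days[dst].add(day)
    return {fqdn: len(seen_days) for fqdn, seen_days in days.items()}


def fan_out_nodes(edges, min_out_degree=2):
    """FQDNs pointing to at least min_out_degree distinct destinations.

    Assumption: used here only as a rough proxy for a hub.
    """
    out = defaultdict(set)
    for src, dst in edges:
        out[src].add(dst)
    return {fqdn for fqdn, dsts in out.items() if len(dsts) >= min_out_degree}


if __name__ == "__main__":
    for day, edges in sorted(snapshots(OBSERVATIONS).items()):
        print(day, cluster_sizes(edges), sorted(fan_out_nodes(edges)))
    print(persistence(OBSERVATIONS))
```

Comparing the per-day cluster sizes and the per-domain day counts is one simple way to see which structures persist and which domain names appear only once; checking a power-law fit of cluster sizes would need a real data set and a proper statistical test.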
Keywords
Malware, Malware distribution networks, Malware attribution, Malware detection, Invasive software