Large Scale Enrichment and Statistical Cyber Characterization of Network Traffic

2022 IEEE High Performance Extreme Computing Conference (HPEC), 2022

Abstract
Modern network sensors continuously produce enormous quantities of raw data that are beyond the capacity of human analysts. Cross-correlation of network sensors increases this challenge by enriching every network event with additional metadata. These large volumes of enriched network data present opportunities to statistically characterize network traffic and quickly answer a key question: “What are the primary cyber characteristics of my network data?” The Python GraphBLAS and PyD4M analysis frameworks enable anonymized statistical analysis to be performed quickly and efficiently on very large network data sets. This approach is tested using billions of anonymized network data samples from the largest Internet observatory (CAIDA Telescope) and tens of millions of anonymized records from the largest commercially available background enrichment capability (GreyNoise). The analysis confirms that most of the enriched variables follow expected heavy-tail distributions and that a large fraction of the network traffic is due to a small number of cyber activities. This information can simplify the cyber analysts' task by enabling prioritization of cyber activities based on statistical prevalence.
Keywords
Cybersecurity, High Performing Computing, Big Data, Networks Scanning, Dimensional Analysis, Internet Modeling, Packet Capture, Streaming Graphs