TrafficGPT: Breaking the Token Barrier for Efficient Long Traffic Analysis and Generation
arXiv (2024)
Abstract
Over the years, network traffic analysis and generation have advanced
significantly. From traditional statistical methods, the field has progressed
to sophisticated deep learning techniques. This progress has improved the
ability to detect complex patterns and security threats, as well as to test and
optimize network performance. However, obstacles persist, such as the
dependence on labeled data for analysis and the difficulty of generating
traffic samples that follow realistic patterns. Pre-trained deep neural
networks have emerged as powerful tools to resolve these issues, offering
improved performance by learning robust data representations from large
unlabeled datasets. Despite their benefits, existing pre-trained models face
challenges like token length limitation, which restricts their usefulness in
comprehensive traffic analysis and realistic traffic generation. To address
these challenges, we introduce TrafficGPT, a deep learning model that can
tackle complex challenges related to long flow classification and generation
tasks. The model uses generative pre-training with a linear attention
mechanism, which raises the supported sequence length from the previous limit
of 512 tokens to 12,032 tokens. TrafficGPT demonstrates
superior performance in classification tasks, reaching state-of-the-art levels.
In generation tasks, the traffic it produces closely resembles real flows,
with low JS divergence and an F1 score close to 0.5 (equivalent to a random
guess) when a discriminator attempts to distinguish generated from real data.
These advances hold promise for future applications in both traffic flow
classification and generation.
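The abstract attributes the extended context length to a linear attention mechanism. As an illustration only (the paper's exact formulation is not given in this abstract), a minimal kernel-based linear attention in the style of Katharopoulos et al., with the common elu(x)+1 feature map, can be sketched as follows; all function names here are hypothetical:

```python
import numpy as np

def feature_map(x):
    # elu(x) + 1: a simple positive feature map often used in linear attention.
    return np.where(x > 0, x + 1.0, np.exp(x))

def linear_attention(Q, K, V, eps=1e-6):
    """Kernelized attention in O(n * d^2) instead of softmax's O(n^2 * d).

    Replacing softmax(Q K^T) with phi(Q) phi(K)^T lets the products be
    regrouped: K^T V and the key sums are computed once, so cost grows
    linearly with sequence length n -- the property that makes contexts
    of thousands of tokens affordable.
    """
    Qf, Kf = feature_map(Q), feature_map(K)   # (n, d) each
    KV = Kf.T @ V                             # (d, d_v), independent of n^2
    Z = Qf @ Kf.sum(axis=0)                   # (n,) per-query normalizer
    return (Qf @ KV) / (Z[:, None] + eps)

rng = np.random.default_rng(0)
n, d = 16, 8                                  # toy sizes for illustration
Q, K, V = rng.normal(size=(3, n, d))
out = linear_attention(Q, K, V)
print(out.shape)                              # (16, 8)
```

Because the factorized form is algebraically equal to normalizing the explicit n-by-n similarity matrix phi(Q) phi(K)^T, the output matches the quadratic-cost computation while never materializing that matrix.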