A Fast Multi-Patterns Parallel Matching Algorithm For Massive Http Data Processing

PROCEEDINGS OF 2017 3RD IEEE INTERNATIONAL CONFERENCE ON COMPUTER AND COMMUNICATIONS (ICCC)(2017)

Abstract
The development of data services in wireless mobile networks has led to tremendous growth in the number of network users and, with it, a rapid increase in the volume of user behavior data. This gives researchers a great opportunity to analyze user behavior through large-scale network traffic, which not only helps Internet Service Providers (ISPs) optimize resource allocation but can also give users more customized services. The analysis of user behavior is based on the extraction of user characteristics, for which multi-pattern URL matching is the foundation. However, extracting user behavior efficiently from massive network traffic data remains a major challenge. This paper focuses on the efficiency of extracting user characteristics and proposes a novel algorithm, Multi-Patterns Parallel Matching on HTTP Traffic (MPPM), which takes advantage of hash map lookups in data searching. It can extract user behavior from massive HTTP traffic more efficiently and faster than conventional methods while keeping the same accuracy. Experiments are conducted using real-world HTTP traffic data collected from ISP networks. The results show that the proposed algorithm outperforms known methods and is capable of handling massive HTTP traffic data. The implementation of the MPPM algorithm provides a solid base for building a high-performance user behavior analysis engine for massive HTTP data processing.
Keywords
HTTP traffic, URL matching, multi-patterns matching, user behavior, Spark
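
The abstract does not describe MPPM's internals, so the following is only a minimal sketch of the general idea it names: using a hash map to match URLs against many patterns at once. It assumes each pattern is a hypothetical (host, path prefix, label) triple; the names `build_index`, `match_url`, and the example hosts are illustrative and are not taken from the paper.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def build_index(patterns):
    """Group (host, path_prefix, label) patterns by host.

    Keying on the host means a URL is compared only against the
    patterns registered for its own host, not against every pattern.
    """
    index = defaultdict(list)
    for host, path_prefix, label in patterns:
        index[host.lower()].append((path_prefix, label))
    return dict(index)


def match_url(index, url):
    """Return labels of all patterns matching the URL's host and path prefix."""
    # Assumption: URLs in traffic logs may lack a scheme, so add one for parsing.
    parts = urlsplit(url if "://" in url else "http://" + url)
    host = (parts.hostname or "").lower()
    path = parts.path or "/"
    # Hash map lookup by host, then a short scan over that host's prefixes.
    return [label for prefix, label in index.get(host, []) if path.startswith(prefix)]


# Hypothetical pattern set for illustration only.
PATTERNS = [
    ("video.example.com", "/watch", "video"),
    ("shop.example.com", "/cart", "shopping"),
    ("shop.example.com", "/", "shop-browse"),
]
INDEX = build_index(PATTERNS)

if __name__ == "__main__":
    for u in ("http://shop.example.com/cart/42", "video.example.com/watch?v=1"):
        print(u, match_url(INDEX, u))
```

Because each URL is matched independently against a read-only index, a function like `match_url` could in principle be applied to records in parallel, for instance inside a map step of a framework such as Spark (listed in the keywords); how MPPM actually parallelizes the work is not stated in the abstract.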