Reinforcement learning based web crawler detection for diversity and dynamics

Neurocomputing(2023)

引用 3|浏览66
暂无评分
摘要
Crawler detection is always an important research topic in network security. With the development of web technology, crawlers are constantly updating and changing, and their types are becoming diverse. The diversity and dynamics of crawlers pose significant challenges for feature applicability and model robustness. Existing crawler detection methods can only detect a limited number of crawlers by predefined rules and can not cover all types of crawlers; worse, they can be completely invalidated by the emergence of new types of crawlers. In this paper, we propose a reinforcement learning based web crawler detection method for diversity and dynamics (WC3D), which is composed of a feature selector and a session classifier. The feature selector selects the appropriate feature set for different types of crawlers with deep deterministic policy gradient. The session classifier makes crawler detection and provides rewards to the feature selector. The two modules are trained jointly to optimize the feature selection and session classification processes. Extensive experiments demonstrate the existence of crawler diversity and that the proposed method is still highly robust against the new type of crawlers and achieves state-of-the-art performance even without considering the dynamics of the crawlers.
更多
查看译文
关键词
Web crawler detection,Reinforcement learning,Feature selection,Crawler diversity,Crawler dynamics
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要