Supervised ranking approach to identify infLuential websites in the darknet

APPLIED INTELLIGENCE(2023)

引用 0|浏览10
暂无评分
摘要
The anonymity and high security of the Tor network allow it to host a significant amount of criminal activities. Some Tor domains attract more traffic than others, as they offer better products or services to their customers. Detecting the most influential domains in Tor can help detect serious criminal activities. Therefore, in this paper, we present a novel supervised ranking framework for detecting the most influential domains. Our approach represents each domain with 40 features extracted from five sources: text, named entities, HTML markup, network topology, and visual content to train the learning-to-rank (LtR) scheme to sort the domains based on user-defined criteria. We experimented on a subset of 290 manually ranked drug-related websites from Tor and obtained the following results. First, among the explored LtR schemes, the listwise approach outperforms the benchmarked methods with an NDCG of 0.93 for the top-10 ranked domains. Second, we quantitatively proved that our framework surpasses the link-based ranking techniques. Third, we observed that using the user-visible text feature can obtain comparable performance to all the features with a decrease of 0.02 at NDCG@5. The proposed framework might support law enforcement agencies in detecting the most influential domains related to possible suspicious activities.
更多
查看译文
关键词
Supervised learning,Learning-to-rank,Influence detection,Feature extraction,Darknet,Tor Hidden services
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要