Suspicious URL Filtering Based on Logistic Regression with Multi-view Analysis

Ke-Wei Su, Kuo-Ping Wu, Hahn-Ming Lee, Te-En Wei

Information Security (2013)


Abstract

Current malicious URL detection techniques that rely on whole-URL information have difficulty detecting obfuscated malicious URLs. The most precise way to identify a malicious URL is to verify the contents of the corresponding web page. However, this is costly in time, traffic, and computing resources. In practice, therefore, a filtering process is required that selects the more suspicious URLs for further verification. In this work, we propose a suspicious URL filtering approach based on multi-view analysis in order to reduce the impact of URL obfuscation techniques. A URL is composed of several portions, each with a specific use. The proposed method learns the characteristics of multiple portions (multi-view) of URLs in order to assign a suspicion level to each portion. By adjusting the suspicion threshold of each portion, the proposed system selects the most suspicious URLs. This work uses a real dataset from T. Co. to evaluate the proposed system. The requirements from T. Co. are that (1) the detection rate should be less than 25%, (2) the missing rate should be lower than 25%, and (3) the processing of one hour of data should finish within an hour. The experimental results show that our approach is effective, retains more of the malicious URLs among the selected suspicious ones, and satisfies the requirements of a practical environment such as the daily operations of T. Co.
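The per-portion scoring idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the choice of views (host, path, query), the toy features, the hand-set weights in `WEIGHTS`, the values in `THRESHOLDS`, and the rule that a URL is selected when any single view exceeds its threshold are all assumptions made for illustration. In a real system the weights would be learned by logistic regression on labeled URLs.

```python
import math
from urllib.parse import urlsplit

# Hypothetical per-view weights: [bias, w_length, w_digit_ratio, w_special].
# These are NOT from the paper; a real system would learn them from data.
WEIGHTS = {
    "host": [-3.0, 0.08, 1.5, 0.6],
    "path": [-3.0, 0.05, 1.0, 0.4],
    "query": [-3.0, 0.04, 1.2, 0.5],
}

# Hypothetical per-view suspicion thresholds (to be tuned per deployment).
THRESHOLDS = {"host": 0.5, "path": 0.5, "query": 0.5}


def split_views(url):
    """Split a URL into the portions (views) that are scored separately."""
    parts = urlsplit(url)
    return {
        "host": parts.hostname or "",
        "path": parts.path,
        "query": parts.query,
    }


def features(text):
    """Toy features: length, digit ratio, count of '%', '-' and '@'."""
    length = len(text)
    digit_ratio = sum(c.isdigit() for c in text) / length if length else 0.0
    special = sum(text.count(c) for c in "%-@")
    return [float(length), digit_ratio, float(special)]


def suspicion(view, text):
    """Logistic model: sigmoid of a weighted sum of the view's features."""
    w = WEIGHTS[view]
    z = w[0] + sum(wi * xi for wi, xi in zip(w[1:], features(text)))
    return 1.0 / (1.0 + math.exp(-z))


def is_suspicious(url):
    """Return (selected, per-view scores); selected if any view is over its threshold."""
    scores = {view: suspicion(view, text) for view, text in split_views(url).items()}
    selected = any(scores[view] >= THRESHOLDS[view] for view in scores)
    return selected, scores
```

Raising a view's threshold selects fewer URLs for further verification (a lower detection rate in the abstract's sense) at the risk of missing more malicious ones, which is the trade-off the stated requirements constrain.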


Keywords

web page contents, logistic regression, suspicious urls, malicious urls, regression analysis, malicious url detection, url prioritization, suspicious url filtering, current malicious urls, internet, proposed system, url obfuscation technique, obfuscated malicious urls, whole url information, suspicious url filtering approach, suspicious url, multiview analysis, t. co., multi-view analysis, malicious url, url obfuscation techniques, security of data, feature extraction, privacy, web pages, filtering, logistics, vectors
