WAIN: Automatic Web Application Identification and Naming Method

Asia-Pacific Symposium on Internetware (Internetware), 2022

As defense shifts from vulnerability-centric to threat-centric, an efficient security architecture can be constructed only with an adequate understanding of the threats to critical assets. Recognizing and naming Web applications is a fundamental step in classifying and identifying those assets. At present, traditional Web application identification methods rely mainly on rule matching, with the rules extracted from Web pages by manual analysis. This approach has low coverage and is labor-intensive, which makes it unsuitable for the current explosive growth in Web applications and inevitably leaves some uncommon applications unrecognized and at risk. In this paper, we propose WAIN, an automatic method for Web application identification and naming. It first clusters the different types of applications in a large set of samples using the K-Means algorithm, and then leverages a novel TF-IDF calculation method to extract keywords. After that, LDA is applied to explain why some parts of the data are similar and to extract possible fingerprints. Finally, WAIN uses filters and a statistical method to generate possible names for the clusters. In the evaluation, data from 30,000 instances of eight kinds of Web applications are processed, and the generated fingerprints and names can distinguish each type of application in the dataset. We manually checked all the results and found that, for each kind of application, fingerprints were successfully generated, along with at least one name that summarizes at least one of the product name, manufacturer, or function.
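The keyword-extraction step can be illustrated with a minimal sketch. It assumes the clustering step has already grouped pages, and it uses the standard TF-IDF formula with each cluster treated as one document, not the novel TF-IDF variant the paper proposes (the abstract does not specify it). The function name `cluster_keywords`, the tokenizer, and the toy page texts are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumption: lowercase alphanumeric tokens, keeping '-' and '_' inside words.
    return re.findall(r"[a-z0-9_-]+", text.lower())


def cluster_keywords(clusters, top_n=2):
    """Rank terms per cluster with standard TF-IDF (each cluster = one document).

    clusters: dict mapping a cluster id to a list of page texts.
    Returns a dict mapping each cluster id to its top_n terms.
    """
    # Term counts aggregated over all pages of each cluster.
    counts = {
        cid: Counter(tok for page in pages for tok in tokenize(page))
        for cid, pages in clusters.items()
    }
    n_clusters = len(counts)

    # Document frequency: in how many clusters each term appears.
    df = Counter()
    for term_counts in counts.values():
        df.update(term_counts.keys())

    keywords = {}
    for cid, term_counts in counts.items():
        total = sum(term_counts.values())
        scores = {
            term: (freq / total) * math.log(n_clusters / df[term])
            for term, freq in term_counts.items()
        }
        # Sort by descending score, breaking ties alphabetically for determinism.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        keywords[cid] = ranked[:top_n]
    return keywords


# Hypothetical toy data standing in for clustered Web page texts.
toy_clusters = {
    0: ["Powered by WordPress wp-content themes", "WordPress login wp-content"],
    1: ["phpMyAdmin login database server", "Welcome to phpMyAdmin database"],
}
toy_keywords = cluster_keywords(toy_clusters)
```

In this toy input, a term such as "login" appears in both clusters, so its inverse document frequency is zero and it is ranked below cluster-specific terms. That is the property that makes TF-IDF-style scores usable as candidate fingerprints or name components.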