APTTOOLNER: A Chinese Dataset of Cyber Security Tool for NER Task

2023 3rd Asia-Pacific Conference on Communications Technology and Computer Science (ACCTCS) (2023)

Abstract
Existing named entity recognition (NER) work in cyber security focuses on the threat intelligence field, where two problems have appeared: 1) NER datasets for threat intelligence are mainly in English; 2) the definition of named entities in threat intelligence is unclear. To address these two issues, we collected 244 APT analysis reports, analyzed the flow of threat intelligence, and predefined 7 entities related to APT tools (which are the base of threat intelligence). Two experts labeled the reports twice to ensure a consistent understanding. The result is a Chinese dataset for APT tool entities with 31,423 predefined entity tags and 424,930 O tags, which was validated on a regular NER model.
Keywords
Chinese NER dataset,APT tool dataset
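
The tag counts reported in the abstract imply a strongly imbalanced label distribution, which matters when reading NER scores on this dataset. The following is a minimal sketch of that arithmetic, not the authors' code: the function names are hypothetical, and the flat label list with a `B-TOOL`/`I-TOOL` scheme in the usage note is an assumption for illustration only, since the abstract does not state the tagging scheme.

```python
from collections import Counter

# Tag counts as reported in the abstract.
ENTITY_TAGS = 31423
O_TAGS = 424930


def tag_shares(entity_tags: int, o_tags: int) -> tuple[float, float]:
    """Return (entity share, O share) of all tags."""
    total = entity_tags + o_tags
    return entity_tags / total, o_tags / total


def count_tags(labels: list[str]) -> tuple[int, int]:
    """Count (entity tags, O tags) in a flat label sequence.

    Assumption: every label other than "O" marks part of an entity.
    """
    counts = Counter("O" if label == "O" else "ENTITY" for label in labels)
    return counts["ENTITY"], counts["O"]


entity_share, o_share = tag_shares(ENTITY_TAGS, O_TAGS)
print(f"entity tags: {entity_share:.2%}, O tags: {o_share:.2%}")
```

With the reported counts, roughly 7% of tags are entity tags and about 93% are O tags. On a locally loaded label sequence, `count_tags(["O", "B-TOOL", "I-TOOL", "O"])` would give the two counts to pass to `tag_shares`.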