Research on the construction of event corpus with document-level causal relations for social security

Inf. Process. Manag.(2023)

Abstract
Event corpora are essential for training event extraction models. Most existing event corpora are available only in English, and their construction is limited by high annotation costs. This paper constructs a Chinese corpus of causal events in the social security domain and proposes a faster and less expensive construction method. The contributions are as follows: (i) SSECau, a Chinese event corpus for the social security field, is constructed from 2,235 web texts and microblogs, with event causality annotated at the document level. (ii) A corpus construction method that combines manual annotation with machine pre-tagging is proposed to improve accuracy and speed. (iii) A pre-tagging method based on BiLSTM-CRF (bidirectional long short-term memory and conditional random field) is deployed to extract events automatically. The experimental results show that the consistency between automatic pre-tagging and manual annotation reaches up to 82%, and that the dynamic tagging process improves both labeling speed and accuracy. The SSECau corpus can aid the development and evaluation of event extraction models for the social security field; annotated cause-effect relationships at the document level can potentially enhance the training of complex extraction models; and the proposed dynamic process with pre-tagging can serve as a reference for future corpus construction.
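The abstract reports consistency between automatic pre-tagging and manual annotation but does not define the measure. As a rough illustration only, the sketch below assumes consistency is the fraction of tokens on which the two label sequences agree. The function name `token_agreement`, the BIO label names, and the sample sentence are all hypothetical and are not taken from the paper or from SSECau.

```python
from typing import List


def token_agreement(pre_tags: List[str], manual_tags: List[str]) -> float:
    """Fraction of token positions where pre-tagger and annotator agree.

    Assumption: this token-level ratio stands in for the paper's
    "consistency"; the paper's actual definition may differ.
    """
    if len(pre_tags) != len(manual_tags):
        raise ValueError("tag sequences must have the same length")
    if not pre_tags:
        return 0.0
    matches = sum(p == m for p, m in zip(pre_tags, manual_tags))
    return matches / len(pre_tags)


# Hypothetical BIO labels for one six-token sentence (not real corpus data).
pre_tags = ["B-CAUSE", "I-CAUSE", "O", "B-EFFECT", "O", "O"]
manual_tags = ["B-CAUSE", "I-CAUSE", "O", "B-EFFECT", "I-EFFECT", "O"]

agreement = token_agreement(pre_tags, manual_tags)
print(f"token-level agreement: {agreement:.2%}")
```

A span-level or chance-corrected measure (for example Cohen's kappa) would be stricter than this token-level ratio; which one the authors used is not stated in the abstract.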
Keywords
Event extraction,Causal relation,Tagging process,Event corpus,Social security