Leveraging Dependency Regularization for Event Extraction

FLAIRS Conference (2016)

Abstract
Event Extraction (EE) is a challenging Information Extraction task which aims to discover event triggers with specific types and their arguments. Most recent research on Event Extraction relies on pattern-based or feature-based approaches, trained on annotated corpora, to recognize combinations of event triggers, arguments, and other contextual information. These combinations may each appear in a variety of linguistic forms, and not all of these event expressions will have appeared in the training data, which adversely affects EE performance. In this paper, we demonstrate the overall effectiveness of Dependency Regularization techniques in generalizing the patterns extracted from the training data to boost EE performance. We present experimental results on the ACE 2005 corpus, showing improvement over the baseline system, and consider the impact of the individual regularization rules.

Introduction

Event Extraction (EE) involves identifying instances of specified types of events and their corresponding arguments in text, an important but difficult Information Extraction (IE) task. Associated with each event mention is a phrase, the event trigger (most often a single verb or nominalization), which evokes that event. More precisely, our task involves identifying event triggers, associating them with their corresponding arguments, and classifying them into specific event types. For instance, according to the ACE 2005 annotation guidelines (https://www.ldc.upenn.edu/sites/www.ldc.upenn.edu/files/english-events-guidelines-v5.4.3.pdf), in the sentence “[She] was killed by [an automobile] [yesterday]”, an event extraction system should recognize the word “killed” as a trigger for the event DIE, and discover “an automobile” and “yesterday” as the Agent and Time Arguments. This task is quite challenging, as the same event may appear in the form of various trigger expressions, and the same expression may represent different events in different contexts.

Most recent research on Automatic Content Extraction (ACE) Event Extraction relies on pattern-based or feature-based approaches to building classifiers for event trigger and argument labeling. Although the training corpus is quite large (300,000 words), the test data will inevitably contain some event expressions that never occur in the training data. To address this problem, we propose several Dependency Regularization methods that help generalize the syntactic patterns extracted from the training data in order to boost EE performance. Among syntactic representations, dependency relations serve as important features and as building blocks of pattern-based frameworks in IE systems, and thus play a significant role in IE approaches. The proposed regularization rules are applied either to the dependency parse outputs of the candidate sentences or to the patterns themselves to facilitate detecting event instances. The experimental results demonstrate that our pattern-based system with the expanded patterns achieves a substantial improvement over the baseline, advancing over state-of-the-art systems.

The paper is organized as follows: we first describe the role of dependency analysis in event extraction and how dependency regularization methods can improve EE performance.
In the sections that follow, we describe our EE systems, including the baseline and the enhanced system that makes use of dependency regularization; we then present experimental results and discuss related work.

Dependency Regularization

The ACE 2005 Event Guidelines specify a set of 33 types of events, and these have been widely used for research on event extraction over the past decade. Some trigger words are unambiguous indicators of particular types of events; for example, the word murder indicates an event of type DIE. However, most words have multiple senses and so may be associated with multiple types of events. Many of these cases can be disambiguated based on the semantic types of the trigger arguments:

• fire can be either an ATTACK event (“fire a weapon”) or an END-POSITION event (“fire a person”), with the two cases distinguishable by the semantic type of the direct object; discharge has the same ambiguity and the same disambiguation rule.

• leave can be either a TRANSPORT event (“he left the building”) or an END-POSITION event (“he left the administration”), again generally distinguishable by the type of the direct object.

Given a training corpus annotated with triggers and event arguments, we can assemble a set of frames and link them to particular event types. Each frame records the event arguments and their syntactic (dependency) relations to the trigger. When decoding new text, we parse it with a dependency parser, look for a matching frame, and tag the trigger candidate with the corresponding event type.

One complication is that the frames may be embedded in different syntactic structures: verbal and nominal forms, relative clauses, active and passive voice, etc. Because of the limited size of the training corpus, some triggers will appear in the test data with frames not seen in training. To fill these gaps, we employ a set of dependency regularization rules which transform the syntactic structure of the input to reduce variation. We describe here three of the regularization rules we use:

1. Verb Chain Regularization
2. Passive Voice Regularization
3. Relative Clause Regularization

Verb Chain Regularization

We use a fast dependency parser (Tratz and Hovy 2011) that analyzes multi-word verb groups (with auxiliaries) into chains with the first word at the head of the chain. Verb Chain (vch) Regularization reverses these chains to place the main (final) verb at the top of the dependency parse tree. This reduces the variation in dependency paths from trigger to arguments caused by differences in tense, aspect, and modality. Here is an example sentence containing a verb chain:

Kobe has defeated Michael. (1)
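For concreteness, the following Python sketch shows how such a verb chain reversal could be implemented over a toy token representation, and how it lets a frame learned from the simple-tense sentence “Kobe defeated Michael” match example (1). The dictionary-based token format, the “vch” and “aux” relation labels, and the frame shown here are simplified assumptions for illustration, not the actual output format of the Tratz-Hovy parser or the authors’ implementation.

# Minimal sketch of Verb Chain (vch) regularization on a toy parse.
# Assumptions: one dict per token; "vch" links a main verb to the
# auxiliary that heads its chain; frames store the trigger and the
# dependency relations of its arguments.

def regularize_verb_chain(tokens):
    # Reverse each verb chain so the main (final) verb heads the chain:
    #   before: has --vch--> defeated   (auxiliary at the top)
    #   after:  defeated --aux--> has   (main verb at the top),
    # with the auxiliary's other dependents re-attached to the main verb.
    by_id = {t["id"]: t for t in tokens}
    for tok in tokens:
        if tok["deprel"] == "vch":      # tok is the main verb; its head is the auxiliary
            aux = by_id[tok["head"]]
            tok["head"], tok["deprel"] = aux["head"], aux["deprel"]  # main verb takes over the attachment
            aux["head"], aux["deprel"] = tok["id"], "aux"            # auxiliary is demoted under the main verb
            for dep in tokens:                                       # move the auxiliary's other dependents
                if dep["head"] == aux["id"] and dep is not tok:
                    dep["head"] = tok["id"]
    return tokens

# Example (1), "Kobe has defeated Michael.", parsed with the auxiliary
# "has" at the head of the verb chain.
sentence = [
    {"id": 1, "form": "Kobe",     "head": 2, "deprel": "nsubj"},
    {"id": 2, "form": "has",      "head": 0, "deprel": "root"},
    {"id": 3, "form": "defeated", "head": 2, "deprel": "vch"},
    {"id": 4, "form": "Michael",  "head": 3, "deprel": "dobj"},
]
regularize_verb_chain(sentence)

# A hypothetical frame learned from the simple-tense training sentence
# "Kobe defeated Michael": the trigger plus the dependency relations of
# its arguments.
frame = {"trigger": "defeated", "relations": {"nsubj", "dobj"}}
trigger = next(t for t in sentence if t["form"] == frame["trigger"])
observed = {t["deprel"] for t in sentence if t["head"] == trigger["id"]}
print(frame["relations"] <= observed)   # True: the regularized parse matches the frame

On the regularized parse, the path from the trigger “defeated” to each argument is a single dependency edge, exactly as it would be for the simple-tense sentence “Kobe defeated Michael”, so a frame extracted from the simpler form also covers the tensed variant.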