Unsupervised relation learning for event-focused question-answering and domain modelling (2008)
Abstract
In this thesis, we investigate the problem of identifying, within a text, relations that capture the information important for event-focused document collections. The presented solutions work with events of varying granularity, and we show how to use these relations to improve the performance of several natural language processing applications.

For a set of related event-focused documents, we introduce the notion of a shallow semantic network based on the relations between the important elements discovered in these documents. This shallow semantic network captures the most important relations among the objects, people, and other elements involved in the events described in the input document collection. We present experimental evidence that, for the task of information selection, such a relation-based representation of event-focused documents is superior to techniques that rely on term frequencies.

For a set of document collections describing similar events within the same domain, we design and implement a completely automatic, data-driven procedure for inducing domain templates. These domain templates capture facts that are important for most domain instances. We then devise a procedure for identifying commonalities across different subdomains. We experiment with a special case of a domain, the biography domain, and identify commonalities across the activities used to describe people belonging to different occupations. We also propose a methodology for creating domain hierarchies.

We apply our methods for identifying relations to the question-answering task. We design and implement a two-pronged approach for answering open-ended event-related questions. The first approach relies on automatically created domain templates and is used when the event mentioned can be identified as one of a particular class of events (e.g., earthquakes, presidential elections). The second, complementary approach is based on a shallow semantic network, which we extract from the documents relevant to the question.

We also suggest a formal model for efficient information packaging that is based on mapping the information selection task onto the set cover problem. Using this mapping, we outline and implement information selection algorithms that are provably optimal polynomial-time approximations for information selection tasks that have a limit on the output size.
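To make the set cover mapping concrete, the following is a minimal sketch and not the thesis's own algorithm. It assumes each candidate text unit (for example, a sentence) is represented by the set of information units it conveys, and that the output limit is a maximum number of selected candidates. Under those assumptions the task becomes the budgeted maximum coverage problem, for which the textbook greedy heuristic is a polynomial-time approximation. The function name `greedy_select` and the toy data are hypothetical.

```python
from typing import Dict, List, Set


def greedy_select(candidates: Dict[str, Set[str]], budget: int) -> List[str]:
    """Pick at most `budget` candidates, greedily maximising newly covered units.

    `candidates` maps a candidate id (hypothetical, e.g. a sentence id) to the
    set of information units it conveys. This is the standard greedy heuristic
    for maximum coverage, used here only as an illustration.
    """
    covered: Set[str] = set()
    chosen: List[str] = []
    remaining = dict(candidates)

    while remaining and len(chosen) < budget:
        # Take the candidate adding the most uncovered units.
        # Iterating in sorted order makes ties resolve to the smallest id.
        best = max(sorted(remaining), key=lambda name: len(remaining[name] - covered))
        gain = remaining[best] - covered
        if not gain:
            break  # nothing new can be covered, so stop early
        chosen.append(best)
        covered |= gain
        del remaining[best]

    return chosen


# Toy data (invented for illustration): sentence id -> information units.
sentences = {
    "s1": {"quake", "magnitude", "location"},
    "s2": {"quake", "casualties"},
    "s3": {"magnitude"},
    "s4": {"casualties", "aid", "damage"},
}

print(greedy_select(sentences, budget=2))
```

With a budget of two, the sketch selects `s1` and then `s4`, which together cover all six information units in the toy data. The greedy choice is attractive here because each step only needs set differences, so the whole selection runs in time polynomial in the number of candidates and units.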
Keywords
information selection algorithm, domain template, event-focused question-answering, unsupervised relation, domain instance, domain hierarchy, shallow semantic network, efficient information packaging, capture information, domain modelling, information selection, biography domain, information selection task