Domain Cartridge: Unsupervised Framework for Shallow Domain Ontology Construction from Corpus.

CIKM '14: 2014 ACM Conference on Information and Knowledge Management, Shanghai, China, November 2014

Abstract
In this work we propose an unsupervised framework to construct a shallow domain ontology from a corpus. Information Retrieval systems, Question-Answering systems, Dialogue systems, etc. need to identify the important concepts in a domain and the relationships between them. We identify important domain terms, of which multi-words form an important component. We show that incorporating multi-words improves parser output, which in turn improves the performance of an existing Question-Answering system by up to 7%. On a manually annotated smartphone dataset, the proposed system identifies 40.87% of the domain terms, compared to a recall of 22% obtained using WordNet, 43.77% using Yago and 53.74% using BabelNet. Unlike the compared systems, however, it does not use any manually annotated resource. Thereafter, we propose a framework to construct a shallow ontology from the discovered domain terms by identifying four domain relations, namely Synonyms ('similar-to'), Type-Of ('is-a'), Action-On ('methods') and Feature-Of ('attributes'), where we achieve significant performance improvement over WordNet, BabelNet and Yago without using any mode of supervision or manual annotation.
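To make the output of such a framework concrete, the following is a minimal sketch of how a shallow ontology with the four relations named in the abstract might be stored, together with the recall measure used to compare discovered domain terms against a manually annotated list. It is not the authors' implementation: the class `ShallowOntology`, the enum `Relation`, the function `recall` and the edge direction chosen for each relation are assumptions made purely for illustration.

```python
from collections import defaultdict
from enum import Enum


class Relation(Enum):
    """The four relation labels listed in the abstract."""
    SIMILAR_TO = "similar-to"   # Synonyms
    IS_A = "is-a"               # Type-Of
    METHODS = "methods"         # Action-On
    ATTRIBUTES = "attributes"   # Feature-Of


class ShallowOntology:
    """Hypothetical container: maps (concept, relation) to a set of terms.

    Assumed reading of an edge (concept, relation, term):
      SIMILAR_TO  -> term is a synonym of concept (stored symmetrically)
      IS_A        -> term is a parent type of concept
      METHODS     -> term is an action performed on concept
      ATTRIBUTES  -> term is a feature of concept
    """

    def __init__(self):
        self._edges = defaultdict(set)

    def add(self, concept, relation, term):
        self._edges[(concept, relation)].add(term)
        if relation is Relation.SIMILAR_TO:
            # Synonymy is symmetric, so store the reverse edge as well.
            self._edges[(term, relation)].add(concept)

    def related(self, concept, relation):
        # .get avoids creating empty entries for unknown concepts.
        return sorted(self._edges.get((concept, relation), set()))


def recall(discovered_terms, gold_terms):
    """Fraction of gold (annotated) terms that were discovered."""
    gold = set(gold_terms)
    if not gold:
        return 0.0
    return len(gold & set(discovered_terms)) / len(gold)
```

A structure like this could be filled with edges such as `("smartphone", Relation.ATTRIBUTES, "battery life")`; the terms here are invented examples, not entries from the paper's dataset.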