A Methodology for Semantically Annotating a Corpus Using a Domain Ontology and Machine Learning

Recent Advances in Natural Language Processing - RANLP(2003)

Abstract
In this paper we present a methodology for the semantic annotation of domain-specific corpora. The method relies on a domain ontology, which is first used to identify and annotate domain-specific instances within the corpus. A machine learning-based information extraction system is then trained on the annotated corpus. The final result of this process is a model that is used to annotate new corpora in that domain. We applied the proposed methodology to a Web corpus, examining different ontology sizes and using hidden Markov models. The paper presents the proposed methodology together with some first experimental results.
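
As an illustration only, the three stages described in the abstract (ontology-driven annotation, training on the annotated corpus, annotating new text) might be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' system: the toy ontology, the token-level lookup, the add-k smoothing, and the function names `annotate`, `train_hmm`, and `viterbi` are all hypothetical and do not come from the paper.

```python
from collections import Counter, defaultdict
import math

# Hypothetical toy ontology (concept -> known instances); not taken from the paper.
ONTOLOGY = {
    "MANUFACTURER": {"acme", "globex"},
    "PROCESSOR": {"pentium", "athlon"},
}

def annotate(tokens, ontology=ONTOLOGY):
    """Stage 1: label each token with its ontology concept, or 'O' if none matches."""
    lookup = {inst: concept for concept, insts in ontology.items() for inst in insts}
    return [lookup.get(tok.lower(), "O") for tok in tokens]

def train_hmm(tagged_sentences, k=1.0):
    """Stage 2: collect transition and emission counts from the annotated corpus."""
    trans, emit, vocab = defaultdict(Counter), defaultdict(Counter), set()
    for sent in tagged_sentences:
        prev = "<s>"  # sentence-start state
        for tok, tag in sent:
            word = tok.lower()
            trans[prev][tag] += 1
            emit[tag][word] += 1
            vocab.add(word)
            prev = tag
    # vocab_size reserves one extra slot for unseen words
    return {"tags": sorted(emit), "trans": trans, "emit": emit,
            "vocab_size": len(vocab) + 1, "k": k}

def _logp(counter, key, n_outcomes, k):
    """Add-k smoothed log-probability of `key` under `counter`."""
    return math.log((counter[key] + k) / (sum(counter.values()) + k * n_outcomes))

def viterbi(tokens, model):
    """Stage 3: annotate new text with the most likely tag sequence."""
    tags, k = model["tags"], model["k"]
    trans, emit = model["trans"], model["emit"]
    n_tags, n_words = len(tags), model["vocab_size"]
    words = [t.lower() for t in tokens]
    if not words:
        return []
    best = {t: _logp(trans["<s>"], t, n_tags, k) + _logp(emit[t], words[0], n_words, k)
            for t in tags}
    backptrs = []
    for w in words[1:]:
        new_best, ptr = {}, {}
        for t in tags:
            prev = max(tags, key=lambda p: best[p] + _logp(trans[p], t, n_tags, k))
            new_best[t] = (best[prev] + _logp(trans[prev], t, n_tags, k)
                           + _logp(emit[t], w, n_words, k))
            ptr[t] = prev
        best = new_best
        backptrs.append(ptr)
    tag = max(tags, key=best.get)
    path = [tag]
    for ptr in reversed(backptrs):  # follow back-pointers to recover the path
        tag = ptr[tag]
        path.append(tag)
    return path[::-1]

# Usage: annotate a tiny corpus with the ontology, train, then tag a new sentence.
corpus = [["Acme", "ships", "Pentium", "laptops"],
          ["Globex", "sells", "Athlon", "desktops"]]
tagged = [list(zip(sent, annotate(sent))) for sent in corpus]
model = train_hmm(tagged)
predicted = viterbi(["Acme", "sells", "Athlon", "laptops"], model)
```

Varying the number of instances in `ONTOLOGY` before retraining is one simple way to mimic the ontology-size comparison the abstract mentions, although the paper's actual experimental setup is not described here.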
Keywords
hidden Markov model, machine learning, information extraction