A probabilistic learning method for XML annotation of documents

International Joint Conference on Artificial Intelligence (2005)

Abstract
We consider the problem of semantic annotation of semi-structured documents according to a target XML schema. The task is to annotate a document in a tree-like manner, where the annotation tree is an instance of a tree class defined by a DTD or W3C XML Schema description. In the probabilistic setting, we cast the tree annotation problem as generalized probabilistic context-free parsing of an observation sequence, where each observation comes with a probability distribution over terminals supplied by a probabilistic classifier associated with the document content. We determine the most probable tree annotation by maximizing the joint probability of selecting a terminal sequence for the observation sequence and the most probable parse for the selected terminal sequence.
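
The joint maximization described in the abstract can be illustrated with a small Viterbi-style CYK parser in which each position carries a distribution over terminals instead of a single known terminal. This is only a minimal sketch under stated assumptions, not the authors' implementation: the grammar is assumed to be in Chomsky normal form, and the function name `best_annotation`, the toy grammar, and the classifier probabilities are all hypothetical.

```python
from collections import defaultdict


def best_annotation(obs_dists, lexical, binary, start):
    """Return (probability, tree) of the most probable annotation, or None.

    obs_dists: list of dicts; obs_dists[i][t] is the (assumed) classifier
               probability of terminal t for observation i
    lexical:   dict {(A, t): p} for rules A -> t
    binary:    dict {(A, B, C): p} for rules A -> B C
    start:     root nonterminal of the annotation tree
    """
    n = len(obs_dists)
    # chart[(i, j)][A] = (prob, tree) of the best A-rooted subtree over [i, j)
    chart = defaultdict(dict)

    # Base case: choose the terminal jointly with the lexical rule, weighting
    # the rule probability by the classifier's probability for that terminal.
    for i, dist in enumerate(obs_dists):
        for (A, t), p_rule in lexical.items():
            p = p_rule * dist.get(t, 0.0)
            if p > chart[(i, i + 1)].get(A, (0.0, None))[0]:
                chart[(i, i + 1)][A] = (p, (A, t))

    # Recursive case: standard Viterbi CYK over increasingly wide spans.
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for (A, B, C), p_rule in binary.items():
                    if B in chart[(i, k)] and C in chart[(k, j)]:
                        p_left, t_left = chart[(i, k)][B]
                        p_right, t_right = chart[(k, j)][C]
                        p = p_rule * p_left * p_right
                        if p > chart[(i, j)].get(A, (0.0, None))[0]:
                            chart[(i, j)][A] = (p, (A, t_left, t_right))

    return chart[(0, n)].get(start)


# Hypothetical toy schema: a Doc is a Title followed by a Body.
lexical = {("Title", "title"): 1.0, ("Body", "para"): 1.0}
binary = {("Doc", "Title", "Body"): 1.0}

# Hypothetical classifier output for two observations. Picking the most
# likely terminal per observation gives (para, para), which has no parse.
obs = [{"title": 0.4, "para": 0.6}, {"title": 0.1, "para": 0.9}]

print(best_annotation(obs, lexical, binary, "Doc"))
```

In this toy case the classifier alone would label both observations `para`, a sequence the grammar cannot parse. The joint search instead selects `title` followed by `para`, because that is the terminal sequence with the highest product of classifier and parse probabilities.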
Keywords
XML annotation, selected terminal sequence, probable tree annotation, semantic annotation, annotation tree, probabilistic classifier, tree class, tree annotation problem, generalized probabilistic context-free, observation sequence, terminal sequence