Parameterless Information Extraction Using (k,l)-Contextual Tree Languages

msra(2005)

Abstract
Recently, several wrapper induction algorithms for structured documents have been introduced. They are based on contextual tree languages and learn from positive examples only, but have the disadvantage that they need parameters. To obtain the optimal parameter setting, they use precision and recall, which in fact goes beyond learning from positive examples only. In this paper, a parameter estimation method for a wrapper based on (k,l)-contextual tree languages is introduced that is based solely on a few positive examples. Experiments show that the quality of the resulting wrappers is very close to that of wrappers with the optimal parameter setting.
Keywords
information extraction, parameter estimation