Vertical Classification of Web Pages for Structured Data Extraction.

AIRS (2012)

Abstract
We propose a general hierarchical vertical classification framework that automatically discovers the inherent hierarchical relationships among verticals from flat datasets and then builds a hierarchical classifier. We ran a set of comparison experiments to evaluate it, contrasting flat and hierarchical structures of relationships as well as different feature selection and classification methods. The results show that hierarchical classifiers built with the proposed framework improve substantially over flat classifiers when classifying unseen web pages. Among the configurations tested, the Support Vector Machine using Odds Ratio to select discriminative features performs best. © Springer-Verlag 2012.
Keywords
automatic hierarchy, hierarchical classifiers, structured data extracting, vertical classification