TAPON: A two-phase machine learning approach for semantic labelling.

Knowledge-Based Systems(2019)

Abstract
Semantic labelling enriches structured information from sources such as HTML pages, tables, or JSON files with labels so that it can be integrated into a local ontology. The process involves measuring features of the information and then finding the classes that best describe it. Current techniques do not model relationships between classes, so their features fall short when some classes have very similar structures or textual formats. To deal with this problem, we have devised TAPON: a new semantic labelling technique that computes novel features that take the relationships between classes into account. TAPON computes these features by means of a two-phase approach. In the first phase, we compute simple features and obtain a preliminary set of labels (hints). In the second phase, we inject our novel features, which are derived from those hints, and obtain a refined set of labels. Our experimental results show that our technique, thanks to its rich feature catalogue and novel modelling, achieves higher accuracy than other state-of-the-art techniques.
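The two-phase idea can be illustrated with a minimal Python sketch. It is not TAPON's implementation: the feature set (text length, digit fraction), the nearest-centroid classifier, the sibling-hint counts used as relational features, and all names (`TwoPhaseLabeller`, `relational_features`, the toy labels) are assumptions chosen for illustration. The toy data contains two classes with identical formats (`birth_year` and `pub_year`), which phase 1 cannot separate and phase 2 separates by looking at the hints of the other fields in the same record.

```python
from collections import defaultdict
from math import dist


def simple_features(value):
    """Phase-1 features (assumed): text length and fraction of digit characters."""
    n = max(len(value), 1)
    return [float(len(value)), sum(c.isdigit() for c in value) / n]


class NearestCentroid:
    """Tiny stand-in classifier: predict the label whose mean vector is closest."""

    def fit(self, X, y):
        groups = defaultdict(list)
        for x, label in zip(X, y):
            groups[label].append(x)
        # Sorted insertion makes tie-breaking in predict() deterministic.
        self.centroids = {
            label: [sum(col) / len(rows) for col in zip(*rows)]
            for label, rows in sorted(groups.items())
        }
        return self

    def predict(self, X):
        return [min(self.centroids, key=lambda l: dist(x, self.centroids[l]))
                for x in X]


def flatten(nested):
    return [item for group in nested for item in group]


def relational_features(records, hints, labels):
    """Phase-2 features (assumed): simple features plus, for each label,
    how many sibling fields in the same record received that hint."""
    out, start = [], 0
    for record in records:
        record_hints = hints[start:start + len(record)]
        for j, value in enumerate(record):
            siblings = record_hints[:j] + record_hints[j + 1:]
            counts = [float(siblings.count(l)) for l in labels]
            out.append(simple_features(value) + counts)
        start += len(record)
    return out


class TwoPhaseLabeller:
    def fit(self, records, record_labels):
        y = flatten(record_labels)
        self.labels = sorted(set(y))
        X1 = [simple_features(v) for v in flatten(records)]
        self.phase1 = NearestCentroid().fit(X1, y)
        # Simplification: hints come from predictions on the training data itself.
        hints = self.phase1.predict(X1)
        X2 = relational_features(records, hints, self.labels)
        self.phase2 = NearestCentroid().fit(X2, y)
        return self

    def predict(self, records):
        X1 = [simple_features(v) for v in flatten(records)]
        hints = self.phase1.predict(X1)                      # preliminary labels
        X2 = relational_features(records, hints, self.labels)
        return hints, self.phase2.predict(X2)                # refined labels


train_records = [
    ["Ada Lovelace", "1815"],
    ["Alan Turing", "1912"],
    ["A Treatise On Calculating Machines", "1843"],
    ["On Computable Numbers And Decisions", "1936"],
]
train_labels = [
    ["name", "birth_year"],
    ["name", "birth_year"],
    ["title", "pub_year"],
    ["title", "pub_year"],
]
test_records = [
    ["Grace Hopper", "1906"],
    ["The Art Of Computer Programming", "1968"],
]

hints, refined = TwoPhaseLabeller().fit(train_records, train_labels).predict(test_records)
print("hints:  ", hints)
print("refined:", refined)
```

In this sketch both year fields receive the same hint in phase 1 because their simple features are identical; the sibling-hint counts are what let phase 2 tell them apart. A more careful setup would derive training hints from held-out predictions rather than from the training data itself.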
Keywords
Semantic labelling, Information integration, Machine learning