Categorizing relational facts from the web with fuzzy rough sets

Knowledge and Information Systems(2018)

引用 5|浏览13
暂无评分
摘要
Significant advances have been made in automatically constructing knowledge bases of relational facts derived from web corpora. These relational facts are linguistic in nature and are represented as ordered pairs of nouns (Winnipeg, Canada) belonging to a category (City_Country). One major problem is that these facts are abundant but mostly unlabeled. Hence, semi-supervised learning approaches have been successful in building knowledge bases where a small number of labeled examples are used as seed (training) instances and a large number of unlabeled instances are learnt in an iterative fashion. In this paper, we propose a novel fuzzy rough set-based semi-supervised learning algorithm (FRL) for categorizing relational facts derived from a given corpus. The proposed FRL algorithm is compared with a tolerance rough set-based learner (TPL) and the coupled pattern learner (CPL). The same ontology derived from a subset of corpus from never ending language learner system was used in all of the experiments. This paper has demonstrated that the proposed FRL outperforms both TPL and CPL in terms of precision. The paper also addresses the concept drift problem by using mutual exclusion constraints. The contributions of this paper are: (i) introduction of a formal fuzzy rough model for relations, (ii) a semi-supervised learning algorithm, (iii) experimental comparison with other machine learning algorithms: TPL and CPL, and (iv) a novel application of fuzzy rough sets.
更多
查看译文
关键词
Text categorization,Relational facts,Semi-supervised learning,Fuzzy rough sets,Web mining
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要