Document Categorization Using Graph Structuring

ADVANCED COMPUTATIONAL AND COMMUNICATION PARADIGMS, VOL 2 (2018)

Abstract
This paper proposes a document classification model that uses a feature learning approach (Coates, Demystifying unsupervised feature learning, 2012) [5] based on the semantics of the documents. In the learning phase, a basic vocabulary (BV) consisting of nouns is created for each document class using a novel approach. The classification phase searches for unique words in the BVs; if one is found, the sentence containing it becomes a basic sentence (BS). A tree built from the unique words of the BS is inserted into the forest of the respective class. The words associated with the child nodes are then used to extend the tree until no new node is generated. Finally, the test document is assigned to the class whose forest contains a clearly dominant percentage of the document's sentences. The proposed algorithm is compared with various feature-based classification models and performs satisfactorily.
Keywords
Document categorization, Basic vocabulary, Importance function
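
The classification phase described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation; the abstract does not specify how words are associated or what counts as "clearly dominant", so the following points are assumptions: each sentence is already reduced to a set of words, two words are associated when they occur in the same sentence, and a class is dominant when its share of covered sentences beats the runner-up by a hypothetical `margin`. The names `grow_forest` and `classify` are illustrative only, and the importance function named in the keywords is not modeled.

```python
from collections import deque


def grow_forest(sentences, vocabulary):
    """Return the indices of sentences absorbed into one class's forest.

    sentences:  list of sets of words (one set per sentence).
    vocabulary: set of words, standing in for the class's basic vocabulary.
    Assumption: a word is "associated" with a node when both occur in the
    same sentence of the test document.
    """
    covered = set()      # sentences already placed in some tree
    seen_words = set()   # words that are already nodes in the forest
    for i, words in enumerate(sentences):
        if i in covered or not (words & vocabulary):
            continue
        # Sentence i contains a vocabulary word: treat it as a basic
        # sentence and root a new tree at its words.
        covered.add(i)
        queue = deque(words - seen_words)
        seen_words |= words
        # Extend the tree until no new node is generated.
        while queue:
            word = queue.popleft()
            for j, other in enumerate(sentences):
                if j not in covered and word in other:
                    covered.add(j)
                    new_words = other - seen_words
                    seen_words |= new_words
                    queue.extend(new_words)
    return covered


def classify(sentences, vocabularies, margin=0.1):
    """Pick the class whose forest covers a clearly larger share of sentences.

    vocabularies: dict mapping class label -> set of vocabulary words.
    margin:       hypothetical threshold for "clearly dominant".
    Returns (label or None, per-class coverage scores).
    """
    if not sentences:
        return None, {}
    scores = {
        label: len(grow_forest(sentences, bv)) / len(sentences)
        for label, bv in vocabularies.items()
    }
    ranked = sorted(scores, key=scores.get, reverse=True)
    best = ranked[0]
    if len(ranked) > 1 and scores[best] - scores[ranked[1]] < margin:
        return None, scores  # no clearly dominant class
    return best, scores
```

Returning `None` when no class leads by the margin is one way to read "clearly dominant"; a different tie-breaking rule would change only the last few lines of `classify`.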