An Improved Document Clustering Approach With Multi-Viewpoint Based On Different Similarity Measures

Aniali Gunta, Rahul Dubey

PROCEEDINGS OF THE 2018 SECOND INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING AND CONTROL SYSTEMS (ICICCS) (2018)

Abstract
Electronic information such as online newspapers, journals, conference proceedings, websites, and e-mail is growing very fast and in extremely large volumes. At this scale, controlling, indexing, or searching all of this information is not feasible for humans, nor for search engines. Automatic document organization has therefore become a critical issue. Document clustering methods help us understand how the data is distributed, or preprocess it for other applications. For instance, a search engine can produce results more effectively and efficiently if it searches over documents that have already been clustered. Document clustering is an automatic, unsupervised learning technique. It groups related documents into the same cluster and unrelated documents into different clusters, so that the documents within a cluster are related to one another and unrelated to the documents in other clusters. Applying any clustering method requires a similarity measure, which quantifies the degree of closeness between the target objects. In this paper, we introduce document clustering based on a multi-viewpoint similarity measure, together with two related document clustering methods. Existing similarity and dissimilarity measures for document clustering use only a single viewpoint, the origin, as their one reference point, whereas ours uses many different viewpoints as references.
Keywords
text mining, document clustering, information extraction
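
The contrast the abstract draws between a single viewpoint at the origin and many viewpoints can be illustrated with a small sketch. The abstract does not state the paper's formula, so the multi-viewpoint function below is an assumption: it averages the dot product of the difference vectors (u - h) and (v - h) over a set of reference documents h, which is one common way to express the idea. The function names and the toy term-count vectors are hypothetical.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale v to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v] if norm > 0 else list(v)


def cosine_similarity(u, v):
    """Single-viewpoint similarity: both documents are seen from the origin."""
    return dot(normalize(u), normalize(v))


def multi_viewpoint_similarity(u, v, viewpoints):
    """Assumed formulation: mean over viewpoints h of (u - h) . (v - h).

    All vectors are unit-normalized first. In a clustering setting the
    viewpoints would typically be documents outside the cluster that
    contains u and v.
    """
    if not viewpoints:
        raise ValueError("at least one viewpoint is required")
    u, v = normalize(u), normalize(v)
    total = 0.0
    for h in viewpoints:
        h = normalize(h)
        du = [a - b for a, b in zip(u, h)]  # u as seen from h
        dv = [a - b for a, b in zip(v, h)]  # v as seen from h
        total += dot(du, dv)
    return total / len(viewpoints)


# Hypothetical term-count vectors over a four-term vocabulary.
doc_a = [2, 1, 0, 0]
doc_b = [1, 2, 0, 0]
other_docs = [[0, 0, 3, 1], [0, 1, 0, 4]]

single = cosine_similarity(doc_a, doc_b)
multi = multi_viewpoint_similarity(doc_a, doc_b, other_docs)
```

Under this formulation, passing the origin as the only viewpoint reduces the measure to cosine similarity, which matches the abstract's description of existing measures as the single-viewpoint case.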