Automatic Creation Technologies of Declarative Tools for Clustering Media Documents

2019 International Conference on Engineering Technologies and Computer Science (EnT)(2019)

Abstract
The article describes methods for identifying the conceptual content structure of a document dataset for clustering. The automatic extraction of key text concepts is found to require criteria of semantic significance for words and phrases, derived from syntactic, statistical and semantic methods. The syntactic criteria are based on the syntactic role of words and phrases in the text dataset, with emphasis on those sentence elements that form its semantic (predicate-actant) structure. Four methods for the automatic identification of key text concepts are developed and compared, and a technology for automatically creating declarative tools for clustering media texts is developed. The precision of document clustering with and without declarative methods is assessed on a test dataset.
Keywords
document clustering, automated text processing, formal text description, linguistic software, declarative tools, key text concepts
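The abstract names statistical criteria as one source of semantic significance but does not specify them. As a rough illustration only, the sketch below ranks the words of each document by TF-IDF, a common statistical weighting that is swapped in here as an assumption and is not claimed to be one of the four methods in the paper. The function names `tokenize` and `key_terms` and the sample documents are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def key_terms(docs, top_n=3):
    """Return the top_n words of each document ranked by TF-IDF.

    TF-IDF is an assumed stand-in for a statistical significance
    criterion; the paper's own criteria are not given in the abstract.
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each word occurs.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    result = []
    for tokens in tokenized:
        tf = Counter(tokens)
        total = len(tokens) or 1
        scores = {
            word: (count / total) * math.log(n_docs / df[word])
            for word, count in tf.items()
        }
        # Highest score first; ties are broken alphabetically.
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        result.append(ranked[:top_n])
    return result


# Hypothetical media-style sentences used only for illustration.
docs = [
    "The parliament passed the budget law",
    "The team won the football match",
    "The court reviewed the budget law",
]

for terms in key_terms(docs, top_n=2):
    print(terms)
```

Words that occur in every document, such as "the", receive a score of zero and drop out of the ranking, while words specific to one document rise to the top. A syntactic or semantic criterion of the kind the abstract describes would need a parser and is outside the scope of this sketch.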