Short Text Similarity Calculation Using Semantic Information

2017 3rd International Conference on Big Data Computing and Communications (BIGCOM) (2017)

Abstract
Text similarity is an important tool in text data analysis and is often used in text clustering and classification. Social media is a popular class of online applications that contains a great deal of valuable information. Short text is common in social media, and short text similarity is often used for social media data mining. Because short texts contain few words and therefore few features, similarity calculation on them has low accuracy, so a common improvement is to compute short text similarity from word semantic similarity. This paper proposes a short text semantic similarity calculation method that combines a knowledge-based method and a corpus-based method. The method builds on an improved word semantic similarity calculation method and a general short text semantic similarity calculation method. The word similarity calculation method combines two word semantic similarity measures according to a set of strategies. It draws on the advantages of both methods to overcome the disadvantages of each alone, finds more semantic associations among the words in texts, and improves the accuracy of word similarity calculation. This paper uses a large amount of corpus data to compare and analyze several word and text semantic similarity algorithms; the improved method produces results closer to human ratings than the other methods for both word and text similarity.
Keywords
short text, semantic similarity, knowledge-based, corpus-based
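
The abstract does not spell out the combination strategies or the text-level formula, so the following Python sketch is only an illustration of the general idea, not the authors' method. It assumes a hypothetical fallback-plus-weighted-average rule for merging a knowledge-based score with a corpus-based score, and a symmetric best-match average for the short text similarity. The lookup tables, the function names, and the `alpha` weight are all invented for the example.

```python
from typing import Callable, Dict, FrozenSet, List

WordSim = Callable[[str, str], float]


def table_sim(table: Dict[FrozenSet[str], float]) -> WordSim:
    """Wrap a toy lookup table as a word-similarity function (hypothetical data)."""
    def sim(a: str, b: str) -> float:
        if a == b:
            return 1.0
        return table.get(frozenset((a, b)), 0.0)
    return sim


def combined_word_sim(kb: WordSim, corpus: WordSim, alpha: float = 0.5) -> WordSim:
    """Assumed strategy: fall back to whichever source has a score;
    when both do, take a weighted average with weight alpha on the
    knowledge-based score."""
    def sim(a: str, b: str) -> float:
        k, c = kb(a, b), corpus(a, b)
        if k == 0.0:
            return c
        if c == 0.0:
            return k
        return alpha * k + (1.0 - alpha) * c
    return sim


def directed_score(src: List[str], dst: List[str], sim: WordSim) -> float:
    """Average, over words in src, of the best match found in dst."""
    if not src or not dst:
        return 0.0
    return sum(max(sim(w, v) for v in dst) for w in src) / len(src)


def text_sim(t1: str, t2: str, sim: WordSim) -> float:
    """Symmetric best-match similarity between two whitespace-tokenized texts."""
    a, b = t1.lower().split(), t2.lower().split()
    return 0.5 * (directed_score(a, b, sim) + directed_score(b, a, sim))


# Toy scores standing in for a knowledge-based and a corpus-based measure.
KB_TABLE = {frozenset(("car", "automobile")): 0.9}
CORPUS_TABLE = {
    frozenset(("car", "automobile")): 0.7,
    frozenset(("fast", "quick")): 0.8,
}

word_sim = combined_word_sim(table_sim(KB_TABLE), table_sim(CORPUS_TABLE))

if __name__ == "__main__":
    print(word_sim("fast", "quick"))
    print(text_sim("fast car", "quick automobile", word_sim))
```

In this toy setup the pair ("fast", "quick") has no knowledge-based score, so the combined measure falls back to the corpus-based one. That mirrors the abstract's point that combining the two sources can surface semantic associations that a single source misses.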