Hybrid approach for text similarity detection in Vietnamese based on Sentence-BERT and WordNet

2022 4th International Conference on Information Technology and Computer Communications (ITCC) (2022)

Abstract
In this paper, we explore the task of similarity detection, which determines whether two sentences have the same meaning. Although the task has been shown to be important in many natural language processing applications, little work has been done for Vietnamese. We present an approach based on the Sentence-BERT (SBERT) model. We leverage the pre-trained model, combine it with linguistic knowledge (WordNet), and test the result on two popular Vietnamese datasets: vnPara and VNPC. Our best model achieves a 97.62% F1 score on vnPara and a 95.31% F1 score on VNPC.
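The abstract does not say how the SBERT output and the WordNet knowledge are combined, so the following is only a minimal sketch of one plausible arrangement, not the authors' method. It assumes a weighted sum of an embedding cosine similarity and a synonym-aware token overlap. The function names, the weight `alpha`, the toy two-dimensional vectors (standing in for SBERT sentence embeddings), and the small synonym table (standing in for a Vietnamese WordNet lookup) are all hypothetical. An F1 helper is included because that is the metric the abstract reports.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def synonym_overlap(tokens_a, tokens_b, synsets):
    """Jaccard overlap after mapping each token to a synonym-set id.

    `synsets` is a hypothetical stand-in for a WordNet lookup: tokens that
    share a meaning map to the same id; unknown tokens map to themselves.
    """
    ids_a = {synsets.get(t, t) for t in tokens_a}
    ids_b = {synsets.get(t, t) for t in tokens_b}
    union = ids_a | ids_b
    return len(ids_a & ids_b) / len(union) if union else 0.0


def hybrid_score(emb_a, emb_b, tokens_a, tokens_b, synsets, alpha=0.7):
    """Weighted sum of embedding and lexical similarity (assumed combination)."""
    semantic = cosine(emb_a, emb_b)
    lexical = synonym_overlap(tokens_a, tokens_b, synsets)
    return alpha * semantic + (1 - alpha) * lexical


def f1_score(y_true, y_pred):
    """F1 for binary labels, with 1 meaning 'same meaning'."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy synonym table: "ô tô" and "xe hơi" both mean "car".
SYNSETS = {"ô_tô": "car", "xe_hơi": "car"}

if __name__ == "__main__":
    a = ["tôi", "mua", "ô_tô"]
    b = ["tôi", "mua", "xe_hơi"]
    # Toy embeddings; a real pipeline would take these from an SBERT encoder.
    print(hybrid_score([1.0, 0.2], [0.9, 0.3], a, b, SYNSETS))
    print(f1_score([1, 1, 0, 0], [1, 0, 0, 1]))
```

In a setup like this, a sentence pair would be labelled a paraphrase when the hybrid score exceeds a threshold tuned on held-out data; the threshold and `alpha` would both need to be chosen empirically.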