DSS: Text Similarity Using Lexical Alignments of Form, Distributional Semantics and Grammatical Relations.

SemEval '12: Proceedings of the First Joint Conference on Lexical and Computational Semantics - Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation (2012)

Abstract
In this paper we present our systems for the STS task. Our systems are all based on a simple process of identifying the components that correspond between two sentences. Currently we use words (that is, word forms), lemmas, distributionally similar words, and grammatical relations identified with a dependency parser. We submitted three systems, all of which use only open-class words. Our first system (alignheuristic) tries to obtain a mapping between every open-class token using all the above sources of information. Our second system (wordsim) uses a different algorithm and, unlike alignheuristic, does not use the dependency information. The third system (average) simply takes the average of the scores from the other two systems for each item, to take advantage of the merits of both; since it merely combines their outputs, we provide only a brief description of it. The results are promising, with Pearson correlation coefficients on the individual datasets ranging from .3765 to .7761 for our relatively simple heuristics-based systems, which do not require training on different datasets. We provide some analysis of the results, and also report results using Spearman's coefficient, which, as a nonparametric measure, we argue better reflects the merits of the different systems (average is ranked between the other two).
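To illustrate the combination step and the two evaluation measures the abstract contrasts, here is a minimal, self-contained sketch. The scores and gold values are invented for illustration; `pearson`, `ranks`, and `spearman` are straightforward textbook implementations, not the authors' code.

```python
# Hedged sketch: the "average" system simply averages the per-item scores
# of the other two systems; evaluation compares system scores against gold
# similarity judgements with Pearson (linear) and Spearman (rank-based).

def pearson(xs, ys):
    # Standard Pearson product-moment correlation.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def ranks(xs):
    # Assign ranks (1-based), averaging ranks over ties.
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(xs, ys):
    # Spearman = Pearson computed on the ranks; being nonparametric,
    # it depends only on the ordering of the scores, not their scale.
    return pearson(ranks(xs), ranks(ys))

# Illustrative (invented) per-item scores from the two submitted systems.
alignheuristic = [0.9, 0.4, 0.7, 0.2, 0.6]
wordsim = [0.8, 0.5, 0.6, 0.3, 0.4]

# The third system: per-item average of the other two.
average = [(a + b) / 2 for a, b in zip(alignheuristic, wordsim)]

# Invented gold-standard similarity judgements (STS uses a 0-5 scale).
gold = [5.0, 2.0, 4.0, 1.0, 3.0]

print("Pearson: ", pearson(average, gold))
print("Spearman:", spearman(average, gold))
```

On these toy values the averaged scores rank the items exactly as the gold does, so Spearman is 1.0 while Pearson is slightly below 1, showing how the rank-based measure ignores non-linear differences in score magnitude.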
Keywords
different algorithm,different datasets,different system,dependency information,dependency parser,open class,open class word,simple heuristics,simple process,STS task,distributional semantics,grammatical relation,lexical alignment,text similarity