The Ngram Statistics Package (Text::NSP): A Flexible Tool for Identifying Ngrams, Collocations, and Word Associations

MWE '11: Proceedings of the Workshop on Multiword Expressions: from Parsing and Generation to the Real World (2011)

Abstract
The Ngram Statistics Package (Text::NSP) is freely available open-source software that identifies ngrams, collocations and word associations in text. It is implemented in Perl and takes advantage of regular expressions to provide very flexible tokenization and to allow for the identification of non-adjacent ngrams. It includes a wide range of measures of association that can be used to identify collocations.
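
The ideas in the abstract (regular-expression tokenization, non-adjacent ngrams, and a measure of association) can be illustrated with a minimal sketch. This is not Text::NSP's Perl code or its interface: the function names `tokenize`, `count_bigrams`, and `pmi`, the `window` parameter, and the choice of pointwise mutual information as the example measure are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text, token_pattern=r"\w+"):
    """Split text into lowercase tokens using a regular expression.

    The pattern is a hypothetical default; changing it changes what counts
    as a token (for example r"[a-z]+" would drop digits).
    """
    return re.findall(token_pattern, text.lower())


def count_bigrams(tokens, window=2):
    """Count ordered word pairs that fall inside a sliding window.

    In this sketch, window=2 yields adjacent pairs only; a larger window
    also pairs each token with later tokens up to window-1 positions away,
    which gives non-adjacent bigrams.
    """
    pairs = Counter()
    for i, first in enumerate(tokens):
        for j in range(i + 1, min(i + window, len(tokens))):
            pairs[(first, tokens[j])] += 1
    return pairs


def pmi(pairs, pair):
    """Pointwise mutual information (log base 2) of one pair.

    Marginal counts are taken from the pair table itself: how often the
    first word opens any pair and how often the second word closes one.
    """
    joint = pairs[pair]
    if joint == 0:
        raise ValueError("pair was not observed; PMI is undefined")
    total = sum(pairs.values())
    left = sum(c for (a, _), c in pairs.items() if a == pair[0])
    right = sum(c for (_, b), c in pairs.items() if b == pair[1])
    return math.log2(joint * total / (left * right))


tokens = tokenize("New York is big. New York is busy.")
adjacent = count_bigrams(tokens, window=2)
gapped = count_bigrams(tokens, window=3)
```

With `window=2` the pair `("new", "is")` is never counted, while `window=3` picks it up across the intervening word. Pairs can then be ranked by `pmi` or by any other association score computed from the same counts.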
Keywords
non-adjacent ngrams, Ngram Statistics Package, available open-source software, flexible tokenization, regular expression, wide range, word association, flexible tool