Keyword Detection in Natural Language Based on Statistical Mechanics of Words in Written Texts

Clinical Orthopaedics and Related Research (2009)

Abstract
In this work, we suggest a parameterized statistical model (the gamma distribution) for the frequency of word occurrences in long strings of English text and use this model to build a corresponding thermodynamic picture by constructing the partition function. We then use our partition function to compute thermodynamic quantities such as the free energy and the specific heat. In this approach, the parameters of the word frequency model vary from word to word so that each word has a different corresponding thermodynamics and we suggest that differences in the specific heat reflect differences in how the words are used in language, differentiating keywords from common and function words. Finally, we apply our thermodynamic picture to the problem of retrieval of texts based on keywords and suggest some advantages over traditional information retrieval methods.
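The abstract does not spell out how the partition function is built, so the following is only a minimal sketch under one plausible assumption: the gamma density (shape `k`, scale `theta`) is treated as a density of states over an energy-like variable `x`, giving Z(beta) = integral of p(x) exp(-beta x) dx = (1 + beta * theta)^(-k), with Boltzmann's constant set to 1. The function names and the per-word parameter values are hypothetical and are not taken from the paper.

```python
import math

# ASSUMPTION: Z(beta) is the Laplace transform of a gamma density with
# shape k and scale theta, so ln Z = -k * ln(1 + beta * theta).
# This is an illustrative construction, not necessarily the authors'.

def log_partition(beta, k, theta):
    """ln Z(beta) under the assumed gamma density of states."""
    return -k * math.log1p(beta * theta)

def free_energy(beta, k, theta):
    """F = -(1/beta) * ln Z, with k_B = 1."""
    return -log_partition(beta, k, theta) / beta

def mean_energy(beta, k, theta):
    """U = -d(ln Z)/d(beta) = k * theta / (1 + beta * theta)."""
    return k * theta / (1.0 + beta * theta)

def specific_heat(beta, k, theta):
    """C = beta^2 * d^2(ln Z)/d(beta)^2 = k * (beta*theta / (1 + beta*theta))^2."""
    r = beta * theta / (1.0 + beta * theta)
    return k * r * r

if __name__ == "__main__":
    # Hypothetical (k, theta) pairs standing in for two different words;
    # real values would have to be fitted to a corpus.
    params = {"word_a": (1.0, 50.0), "word_b": (0.3, 2000.0)}
    beta = 0.01
    for word, (k, theta) in params.items():
        print(word, free_energy(beta, k, theta), specific_heat(beta, k, theta))
```

Because `k` and `theta` differ from word to word, each word gets its own specific-heat curve in this sketch, which is the kind of per-word difference the abstract proposes to use for separating keywords from common and function words.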
Keywords
natural language