Estimating The Prevalence And Diversity Of Words In Written Language

QUARTERLY JOURNAL OF EXPERIMENTAL PSYCHOLOGY (2020)

Abstract
Recently, a new crowd-sourced language metric, word prevalence, has been introduced; it estimates the proportion of the population that knows a given word. This measure has been shown to account for unique variance in large sets of lexical performance data. This article aims to build on the work of Brysbaert et al. and Keuleers et al. by introducing new corpus-based metrics that estimate how likely a word is to be an active member of the natural language environment, and hence to be known by a larger subset of the general population. The metrics are derived from an analysis of a newly collected corpus of over 25,000 fiction and non-fiction books and are shown to account for significantly more variance than past corpus-based measures.
Keywords
Lexical organisation, semantic diversity, big data, corpus studies