FLAG: Fast Large-Scale Graph Construction for NLP

semanticscholar(2012)

Abstract
Many natural language processing (NLP) problems involve constructing large nearest-neighbor graphs over words by computing distributional similarity between word pairs from large corpora. In this paper, we first describe a system called FLAG that constructs such graphs approximately from large data sets. To handle the large amount of data in a memory- and time-efficient manner, FLAG maintains approximate counts based on sketching algorithms using commodity clusters. To find the approximate nearest neighbors quickly, FLAG exploits fast approximate nearest neighbor search algorithms. Second, we describe an extension of FLAG that models the context-sensitive meaning of words. We propose an approximate Clustering by Committee (CBC) algorithm to induce hard clusters of words. These hard clusters are mapped to words in context to infer their context-sensitive meaning.
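To illustrate what maintaining approximate counts with a sketch can look like, the following is a minimal sketch in Python. It assumes a count-min sketch; the abstract does not say which sketching algorithm FLAG uses, so this choice, along with the class name `CountMinSketch`, the helper `count_pairs`, and the `width` and `depth` parameters, is hypothetical and only meant to show the general idea of counting (word, context) pairs in fixed memory.

```python
import hashlib


class CountMinSketch:
    """Fixed-size table of counters (assumed technique, not confirmed as FLAG's)."""

    def __init__(self, width=2048, depth=4):
        self.width = width    # counters per row
        self.depth = depth    # number of independent hash rows
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, item, row):
        # Derive one hash function per row by salting BLAKE2b with the row number.
        digest = hashlib.blake2b(
            item.encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, item, count=1):
        # Increment one counter in every row.
        for row in range(self.depth):
            self.table[row][self._index(item, row)] += count

    def estimate(self, item):
        # Collisions can only inflate counters, so the minimum across rows
        # is an upper bound on the true count and never an underestimate.
        return min(
            self.table[row][self._index(item, row)]
            for row in range(self.depth)
        )


def count_pairs(pairs, sketch):
    """Add (word, context) co-occurrences to the sketch (hypothetical helper)."""
    for word, context in pairs:
        sketch.add(f"{word}\t{context}")
    return sketch


pairs = [("bank", "river"), ("bank", "money"), ("bank", "river"), ("shore", "river")]
sketch = count_pairs(pairs, CountMinSketch())
print(sketch.estimate("bank\triver"))  # at least 2, the true count
```

Memory use depends only on `width` and `depth`, not on the number of distinct pairs, which is the property that makes sketches attractive for corpus-scale counting. Because counters are simply added, sketches built on separate machines can be merged by summing their tables element-wise, which fits a commodity-cluster setting.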