Space Efficiencies in Discourse Modeling via Conditional Random Sampling.

NAACL HLT '12: Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (2012)

Abstract
Recent exploratory efforts in discourse-level language modeling have relied heavily on calculating Pointwise Mutual Information (PMI), which involves significant computation when done over large collections. Prior work has required aggressive pruning or independence assumptions to compute scores on large collections. We show the method of Conditional Random Sampling, thus far an underutilized technique, to be a space-efficient means of representing the sufficient statistics in discourse that underlie recent PMI-based work. This is demonstrated in the context of inducing Schankian script-like structures over news articles.
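The abstract's core idea, Conditional Random Sampling (Li and Church), replaces full postings lists with small per-word sketches from which co-occurrence counts, and hence PMI, can be estimated. The following is a minimal illustrative sketch of that technique, not the paper's implementation; all names, the sketch size `K`, and the synthetic data are assumptions.

```python
import math
import random

D = 10000   # total number of documents (illustrative)
K = 200     # sketch size per word (illustrative)

# One global random permutation of document IDs, shared by all words.
random.seed(0)
perm = list(range(D))
random.shuffle(perm)

def sketch(postings):
    """Keep the K smallest permuted IDs of the documents containing a word."""
    return sorted(perm[d] for d in postings)[:K]

def est_cooccurrence(s1, s2):
    """Estimate |postings1 ∩ postings2| from two sketches.

    Both sketches are complete (contain every qualifying document)
    below ds = min of their maxima, so the intersection restricted
    to that range is an unbiased conditional sample.
    """
    ds = min(s1[-1], s2[-1], D)
    a = len({x for x in s1 if x < ds} & {x for x in s2 if x < ds})
    return a * D / ds

def pmi(c12, c1, c2, n):
    """PMI from raw counts: log( P(w1,w2) / (P(w1) P(w2)) )."""
    return math.log((c12 * n) / (c1 * c2))
```

Usage: two words occurring in 1000 documents each with 500 in common can be compared via `est_cooccurrence(sketch(p1), sketch(p2))`, which returns an estimate near 500 while storing only `K` integers per word instead of full postings.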
Keywords
conditional random sampling, pointwise mutual information, discourse modeling, discourse-level language modeling, space efficiency, Schankian script-like structures, aggressive pruning, independence assumptions, large collections