Linguistic Redundancy in Twitter.

EMNLP '11: Proceedings of the Conference on Empirical Methods in Natural Language Processing (2011)

Abstract
In the last few years, the research community's interest in micro-blogs and social media services, such as Twitter, has grown exponentially. Yet, so far, little attention has been paid to a key characteristic of micro-blogs: the high level of information redundancy. The aim of this paper is to approach this problem systematically by providing an operational definition of redundancy. We cast redundancy in the framework of Textual Entailment Recognition. We also provide quantitative evidence on the pervasiveness of redundancy in Twitter, and describe a dataset of redundancy-annotated tweets. Finally, we present a general-purpose system for identifying redundant tweets. An extensive quantitative evaluation shows that our system successfully solves the redundancy detection task, improving over baseline systems with statistical significance.
Keywords
information redundancy, redundancy detection task, baseline system, extensive quantitative evaluation, general purpose system, quantitative evidence, Textual Entailment Recognition, high level, key characteristic, operational definition, Linguistic redundancy