Spiga - a multilingual news aggregator

Leonhard Hennig, Danuta Ploch, Daniel Prawdzik, Benjamin Armbruster, H Düwiger, Ernesto William De Luca, S Albayrak

Proceedings of GSCL (2011)

Abstract
News aggregation web sites collect and group news articles from a multitude of sources in order to help users navigate and consume large amounts of news material. In this context, Topic Detection and Tracking (TDT) methods address the challenges of identifying new events in streams of news articles, and of threading together related articles. We propose a novel model for a multilingual news aggregator that groups together news articles in different languages, and thus allows users to get an overview of important events and their reception in different countries. Our model combines a vector space model representation of documents based on a multilingual lexicon of Wikipedia-derived concepts with named entity disambiguation and multilingual clustering methods for TDT. We describe an implementation of our approach on a large-scale, real-life data stream of English and German newswire sources, and present an evaluation of the Named Entity Disambiguation module, which achieves state-of-the-art performance on a German and an English evaluation dataset.
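As a rough illustration of the idea of grouping articles across languages through a shared concept space, the following minimal sketch maps language-specific terms to language-independent concept identifiers, builds concept count vectors, and groups articles with a simple single-pass threshold clustering over cosine similarity. Everything here is an assumption made for illustration: the toy `LEXICON`, the concept IDs, the function names, the whitespace tokenization, and the threshold value are hypothetical, and single-pass clustering is used as a generic TDT-style stand-in, not as the clustering method of the paper. Named entity disambiguation is omitted entirely.

```python
import math
from collections import Counter

# Hypothetical toy lexicon: surface term per language -> shared concept ID.
# A real system would derive such a mapping from Wikipedia; this one is invented.
LEXICON = {
    "en": {"chancellor": "C_Chancellor", "election": "C_Election",
           "earthquake": "C_Earthquake", "japan": "C_Japan"},
    "de": {"kanzlerin": "C_Chancellor", "wahl": "C_Election",
           "erdbeben": "C_Earthquake", "japan": "C_Japan"},
}


def concept_vector(text, lang):
    """Map whitespace tokens to concept IDs and count them; unknown tokens are dropped."""
    lexicon = LEXICON[lang]
    return Counter(lexicon[t] for t in text.lower().split() if t in lexicon)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def single_pass_cluster(docs, threshold=0.5):
    """Assign each (doc_id, text, lang) to the most similar cluster, or start a new one."""
    clusters = []  # each cluster: {"centroid": Counter, "members": [doc_id, ...]}
    for doc_id, text, lang in docs:
        vec = concept_vector(text, lang)
        best, best_sim = None, 0.0
        for cluster in clusters:
            sim = cosine(vec, cluster["centroid"])
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is not None and best_sim >= threshold:
            best["centroid"].update(vec)  # accumulate concept counts in the centroid
            best["members"].append(doc_id)
        else:
            clusters.append({"centroid": Counter(vec), "members": [doc_id]})
    return [c["members"] for c in clusters]


docs = [
    ("en1", "chancellor wins election", "en"),
    ("de1", "kanzlerin gewinnt wahl", "de"),
    ("en2", "earthquake hits japan", "en"),
    ("de2", "erdbeben in japan", "de"),
]
clusters = single_pass_cluster(docs)
```

Because both languages are projected onto the same concept IDs, the English and German articles about the same event end up with overlapping vectors even though they share almost no surface vocabulary. In this toy setup, the two election articles and the two earthquake articles would each be expected to fall into one cluster.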