Creating RSS for News Archives, Beyond

FLAIRS Conference (2006)

Abstract
RSS, or Rich Site Summary, is becoming an invaluable format and tool for news feeds. More and more news publishing organizations are realizing its benefits, and content publishers are joining the already crowded RSS club. In the era of information explosion and peer-to-peer sharing, RSS is a great format for content publishing, archiving, sharing, and much more. However, it came late. It should have started when the Internet became popular and news organizations were making their on-line debut. During the last decade, an enormous number of news articles were published and, at the same time, improperly archived for lack of a flexible and widely accepted archival format. Still, better late than never. As we now explore the possibilities of RSS, this is the time to make the transition smooth for old unformatted news articles and to make the format uniform across all news articles, new and old. Extracting the metadata of old news articles is one way to create their RSS versions. In this paper we report our progress in extracting news metadata using support vector classifiers and show that applying the classifiers in a particular order is more useful than applying them in random order. We also show preliminary results on applying TIMEX tags to extract news events, which can be very useful for going beyond RSS to create individual event lines instead of placing the whole story under a single timeline.