Where Did the Political News Event Happen? Primary Focus Location Extraction in Different Languages

2019 IEEE 5th International Conference on Collaboration and Internet Computing (CIC), 2019

Abstract
Political news reports are published all over the world in many languages. Automatically detecting the geolocation of the events they describe is valuable for understanding those events. Although various open-source and commercial tools exist for identifying geolocation, they fail to resolve locations at a granular level, such as locality or city, and they do not support most languages. Most techniques treat the problem as Named Entity Recognition (NER) and identify geolocation only at the country level for a given text. In this paper, we consider English, Spanish, and Arabic news articles from different publishers. We define the primary focus location as the location where the event actually occurred, as distinct from the other focus locations mentioned in the report. Our aim is to extract the primary focus location, regardless of language, from articles published by different news agencies. We propose a mechanism that uses NER to identify candidate sentences containing focus locations. We then compute sentence embeddings over words from different languages and employ a supervised classifier to predict the primary focus location. We also apply bias correction to the training data, using a suitable adaptation mechanism, to reduce sampling bias. Our method trains a classifier on bias-corrected training data from news articles published by one agency in one language and tests the model on news articles published by another agency in a different language. Empirical results on real-world English, Spanish, and Arabic news articles show that our method outperforms baseline approaches.
Keywords
Focus Location Extraction, Sentence Embedding, Bias Correction, Political Event News
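
The three stages named in the abstract (candidate sentences via NER, sentence embedding, supervised classification) can be illustrated with a toy sketch. This is not the authors' implementation, and every name in it is hypothetical: a small gazetteer lookup stands in for a real NER model, a bag-of-words count vector with the location masked as `<LOC>` stands in for a multilingual sentence embedding, and a nearest-centroid rule stands in for the supervised classifier. The bias-correction step is omitted because the abstract does not specify the adaptation mechanism.

```python
import math
import re
from collections import Counter

# Hypothetical toy gazetteer; a real system would use an NER model instead.
GAZETTEER = {"cairo", "madrid", "london", "paris"}


def tokenize(text):
    """Lowercase word tokens."""
    return re.findall(r"\w+", text.lower())


def candidate_sentences(article):
    """Return (sentence, location) pairs for sentences that mention a known location."""
    pairs = []
    for sentence in re.split(r"(?<=[.!?])\s+", article.strip()):
        for location in sorted(set(tokenize(sentence)) & GAZETTEER):
            pairs.append((sentence, location))
    return pairs


def embed(sentence, location):
    """Bag-of-words counts with the location masked; a stand-in for a sentence embedding."""
    return Counter("<LOC>" if t == location else t for t in tokenize(sentence))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def train_centroids(examples):
    """Sum the vectors of primary (True) and non-primary (False) training sentences."""
    centroids = {True: Counter(), False: Counter()}
    for sentence, location, is_primary in examples:
        centroids[is_primary].update(embed(sentence, location))
    return centroids


def predict_primary(article, centroids):
    """Pick the candidate location whose sentence looks most like a primary-location sentence."""
    best, best_score = None, float("-inf")
    for sentence, location in candidate_sentences(article):
        vec = embed(sentence, location)
        score = cosine(vec, centroids[True]) - cosine(vec, centroids[False])
        if score > best_score:
            best, best_score = location, score
    return best


# Made-up labelled sentences, for illustration only.
TRAIN = [
    ("Protesters clashed with police in Madrid on Sunday.", "madrid", True),
    ("A bomb exploded in Cairo near the parliament.", "cairo", True),
    ("The minister had earlier visited London for talks.", "london", False),
    ("Officials in Paris issued a statement about the events.", "paris", False),
]
ARTICLE = (
    "Protesters clashed with police in Paris on Monday. "
    "Officials in London issued a statement."
)

centroids = train_centroids(TRAIN)
print(predict_primary(ARTICLE, centroids))
```

Masking the location token means the classifier scores the sentence context rather than memorizing place names, which is one plausible way to let a model trained on one agency's articles transfer to another's.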