A Domain-Independent and Multilingual Approach for Crisis Event Detection and Understanding

Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval (2019)

Abstract
Most existing approaches that use social media to detect and characterize emerging crisis events analyze messages retrieved from social platforms with a predetermined set of keywords [2, 3]. In addition to keyword filtering, messages must commonly be post-processed with supervised classification models to determine whether they refer to a real-time crisis situation. However, keyword-based approaches have shortcomings: on the one hand, they require domain knowledge of different crisis events to choose the keywords that filter relevant data about an emerging crisis; on the other hand, they require supervised methods to determine whether the identified data actually corresponds to a new real-time crisis event. Keyword-independent methods could therefore also help generalize existing approaches to cross-lingual events, since each language and culture can have its own terms for the same event. The majority of these works also explain phenomena only for English messages. This limitation prevents the replication of methodologies in other languages and countries where emergency events often occur. For this reason, researchers have recently focused on creating domain-independent and multilingual approaches for detecting and classifying social media messages during crisis events [1, 4]. These approaches have exploited low-level lexical features with the goal of achieving domain transfer across different crisis events and languages. Nonetheless, most studies have focused on crisis-related messages without testing on messages unrelated to crises, such as those about sporting events or music festivals.
The main objective of this work is to study and exploit cross-lingual, domain-independent patterns for detecting and characterizing social media messages generated by collective activity around unexpected, high-impact real-world events, specifically emergency situations. The expected contribution is the development of novel techniques that provide multilingual and domain-independent detection and characterization of emergency situations. Such techniques should help us better understand social media behavior during crises in affected locations around the world, independent of language, domain and type of event. Our hypothesis is that there are patterns in the self-organized activity of Web and social media users that emerge when a crisis situation starts to unfold in the physical world. Some of these patterns arise independently of the particular type or domain of the crisis event, as well as of the location, language and culture of the participating users. We then propose the following research questions:
RQ1: Can we characterize collective patterns during crisis situations independently of their language and domain based on non-textual and low-level lexical features?
RQ2: Are there differences among types of emergency situations (instantaneous, progressive, focalized and diffused) in the social media messages posted during these events?
RQ3: Are non-textual and low-level lexical features sufficient to reduce the number of events unrelated to emergencies that are detected as crises in the Web and social media?
Keywords
crisis informatics, emergency situations, social media