Data Collection and Annotation Pipeline for Social Good Projects.

AI4SG@AAAI Fall Symposium (2020)

Abstract
Vast amounts of data are generated during crisis events through both formal and informal sources, and this data can be used to make a positive impact in all phases of a crisis. However, collecting and annotating data quickly and effectively during a crisis is a challenging task. Responding to unfolding events requires annotation that is quick, robust, and efficient. Data must be accessed and aggregated across different platforms and sources, and annotation tools must be able to use this data effectively. This work describes an architecture for rapid collection and annotation of data from multiple sources, which can then feed machine learning and data analysis models. We extract social media data via multiple systems for Twitter data collection, and we build an architecture for collecting news articles from diverse sources. These data can then be input into the INCEpTION annotation framework, which has been adapted to allow easy management of multiple annotators, with the aim of facilitating the application of citizen science. This allows us to rapidly prototype new annotation schemas across a diverse array of data sources, which can then be deployed for machine learning. As a use case, we explore annotation of COVID-19-related Tweets and news articles for case prediction.
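
The aggregation step described in the abstract, in which Tweets and news articles are brought into one form before annotation, might look roughly like the sketch below. It is an illustration under stated assumptions, not the authors' implementation: the Document schema, the input field names ("id", "text", "url", "title", "body"), and the exact-duplicate filter are all hypothetical, and no INCEpTION or Twitter API is called.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Document:
    """Common record handed to the annotation step (hypothetical schema)."""
    doc_id: str
    source: str  # "twitter" or "news"
    text: str


def from_tweet(tweet: dict) -> Document:
    # Assumes a dict with "id" and "text" keys (illustrative field names).
    # Runs of whitespace are collapsed so near-identical copies compare equal.
    text = " ".join(tweet["text"].split())
    return Document(f"twitter-{tweet['id']}", "twitter", text)


def from_article(article: dict) -> Document:
    # Assumes a dict with "url", "title" and "body" keys (illustrative).
    text = f"{article['title']}\n\n{article['body'].strip()}"
    return Document(f"news-{article['url']}", "news", text)


def aggregate(tweets: Iterable[dict], articles: Iterable[dict]) -> List[Document]:
    """Merge both sources, dropping exact duplicate texts in first-seen order."""
    seen = set()
    merged = []
    for doc in [*map(from_tweet, tweets), *map(from_article, articles)]:
        if doc.text not in seen:
            seen.add(doc.text)
            merged.append(doc)
    return merged


# Usage with made-up records: the second tweet equals the first once
# whitespace is normalized, so two documents remain.
docs = aggregate(
    tweets=[
        {"id": 1, "text": "Cases rising  in the region"},
        {"id": 2, "text": "Cases rising in the region"},
    ],
    articles=[
        {"url": "example.org/a", "title": "Update", "body": " Officials reported new cases. "},
    ],
)
```

Keeping a single record type means the annotation side only needs one import path regardless of where a text came from; the source field preserves provenance for later analysis.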