Classifying the Mexican epidemiological semaphore colour from the Covid-19 text Spanish news

JOURNAL OF INFORMATION SCIENCE

Abstract
This work aims to generate classification models that help determine the colour of an epidemiological semaphore (ES) by analysing online news, in order to be better prepared for changes in the evolution of the pandemic. To accomplish this, we introduce the Cov-NES-Mex corpus, a collection of 77,983 news items related to Covid-19, labelled with the Mexican ES system and covering the 32 regions of Mexico. We also report measures showing that the corpus is imbalanced and has a high vocabulary overlap between classes. In addition, we propose measures for evaluating the pandemic by region. A classification model based on a transformer architecture specialised for the Spanish language achieved an F-measure of up to 0.83. This work thus provides evidence that the news contains essential information that can be used to determine the colour of the ES up to 4 weeks in advance. Finally, the presented results could be applied to other Spanish-speaking countries that do not have an ES system, allowing their situation to be inferred and compared with the Mexican ES.
Keywords
Covid-19, epidemiological semaphore, Mexican corpus, Spanish news, text classification