Disease discovery-based emotion lexicon: a heuristic approach to characterise sicknesses in microblogs
Network Modeling and Analysis in Health Informatics and Bioinformatics (2020)
Abstract
The analysis of microblogging data has been widely used to discover valuable resources for the timely identification of critical illness-related incidents and serious epidemics. Despite numerous efforts in this field, accurately and promptly predicting incidents and outbreaks from specific clinical symptoms remains a great challenge. Hence, an investigative method can be crucial in characterising a disease state. This study proposes a heuristic mechanism that uses an unsupervised learning technique to efficiently detect disease incidents and outbreaks from tweet content. We categorised the types of emotions that are highly linked to a specific disease and its related terminologies. Emotions (anger, fear, sadness, and joy) and diabetes-related terminologies were extracted using the NRC Affect Intensity Lexicon and a part-of-speech tagging tool. A two-cluster solution was established and validated. The classification results showed that K-means clustering with two centroids had the highest classification accuracy (96.53%). The relationship between diabetes-related terms (in the form of tweets) and emotions was established and assessed using the association rules mining technique. The results showed that diabetes-related terms were exclusively associated with fear emotions. This study offers a novel mechanism for disease recognition and outbreak detection in microblogs that can be useful in making informed decisions about a disease state.
Keywords
Diabetes, Emotion lexicon, Disease detection, Part-of-speech tagging, Association rules mining, Twitter
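
As a rough illustration of the lexicon-plus-clustering idea described in the abstract, the following minimal Python sketch scores tweets against an emotion lexicon and groups them with a two-centroid K-means. It is not the authors' implementation: the lexicon entries, their intensity values, the example tweets, and the helper names `emotion_vector` and `two_means` are all hypothetical, and the part-of-speech tagging and association rules mining steps are left out.

```python
from math import dist

EMOTIONS = ("anger", "fear", "sadness", "joy")

# Hypothetical toy lexicon: word -> (emotion, intensity in [0, 1]).
# These values are invented for illustration and are NOT taken from
# the NRC Affect Intensity Lexicon.
TOY_LEXICON = {
    "scared": ("fear", 0.8),
    "worried": ("fear", 0.6),
    "afraid": ("fear", 0.7),
    "happy": ("joy", 0.8),
    "glad": ("joy", 0.6),
    "angry": ("anger", 0.8),
    "sad": ("sadness", 0.7),
}


def emotion_vector(text, lexicon=TOY_LEXICON):
    """Sum lexicon intensities per emotion for one tweet."""
    scores = dict.fromkeys(EMOTIONS, 0.0)
    for token in text.lower().split():
        word = token.strip(".,!?#@:;\"'")
        if word in lexicon:
            emotion, intensity = lexicon[word]
            scores[emotion] += intensity
    return tuple(scores[e] for e in EMOTIONS)


def two_means(points, iterations=20):
    """Plain K-means with k=2 and a deterministic initialisation."""
    # Assumption: seed with the first point and the point farthest from it.
    first = points[0]
    second = max(points, key=lambda p: dist(p, first))
    centroids = [first, second]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [min((0, 1), key=lambda k: dist(p, centroids[k])) for p in points]
        # Update step: move each centroid to the mean of its members.
        for k in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids


# Made-up example tweets, used only to exercise the sketch.
tweets = [
    "So scared about my diabetes diagnosis",
    "Worried and afraid my insulin will run out",
    "Happy my blood sugar is stable today",
    "Glad and happy after the checkup",
]
vectors = [emotion_vector(t) for t in tweets]
labels, centroids = two_means(vectors)
```

On this toy input the fear-dominated tweets and the joy-dominated tweets fall into separate clusters. A real pipeline would presumably load the published lexicon, filter disease terms with a part-of-speech tagger, and validate the cluster count before drawing any conclusion.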