Classifying Vietnamese Disease Outbreak Reports with Important Sentences and Rich Features

Son Doan, Nguyen Thi Ngoc Vinh, Tu Minh Phuong

SoICT '12: Proceedings of the 3rd Symposium on Information and Communication Technology (2019)

Abstract
Text classification has been an important field of research since the mid-1990s. One of its many applications is in Web-based biosurveillance systems, which identify and summarize online disease outbreak reports. In this paper we focus on classifying Vietnamese disease outbreak reports. We investigate important properties of disease outbreak reports, e.g., sentences containing names of outbreak diseases, and locations. Evaluation with ten repetitions of 10-fold cross-validation using the Support Vector Machine algorithm shows that using sentences containing disease outbreak names, together with their preceding and following sentences and in combination with location features, achieves the best F-score of 86.67. The results suggest that using important sentences and rich features can improve the performance of Vietnamese disease outbreak text classification.
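The sentence-selection idea in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the two small lexicons, the function names, the naive punctuation-based sentence splitter, the substring matching, and the `LOC=` feature prefix are all hypothetical. The sketch keeps each sentence that mentions a disease name plus its preceding and following sentence, then appends one feature per location found in the kept text.

```python
import re

# Hypothetical lexicons for illustration; the paper's actual lists are not given here.
DISEASE_NAMES = {"cúm", "sốt xuất huyết"}
LOCATION_NAMES = {"hà nội", "đà nẵng", "huế"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def contains_term(sentence, lexicon):
    """True if any lexicon entry occurs as a substring (case-insensitive)."""
    lowered = sentence.lower()
    return any(term in lowered for term in lexicon)


def select_important(sentences, window=1):
    """Keep sentences naming a disease plus `window` neighbours on each side."""
    keep = set()
    for i, sentence in enumerate(sentences):
        if contains_term(sentence, DISEASE_NAMES):
            lo = max(0, i - window)
            hi = min(len(sentences), i + window + 1)
            keep.update(range(lo, hi))
    return [sentences[i] for i in sorted(keep)]


def extract_features(text, window=1):
    """Bag of word tokens from the important sentences, plus location features."""
    important = select_important(split_sentences(text), window)
    tokens = []
    for sentence in important:
        tokens.extend(re.findall(r"\w+", sentence.lower()))
    joined = " ".join(important).lower()
    locations = sorted(loc for loc in LOCATION_NAMES if loc in joined)
    tokens.extend("LOC=" + loc for loc in locations)
    return tokens
```

The resulting token lists could then be vectorized and passed to an SVM classifier evaluated with repeated cross-validation, as the abstract describes; that step is omitted here. A real pipeline for Vietnamese would likely also need proper word segmentation, which this sketch does not attempt.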
Keywords
important property, raw text, important field, online disease outbreak report, vietnamese disease outbreak text, disease outbreak report, outbreak disease, rich feature, disease outbreak name, vietnamese disease outbreak report, important sentence, support vector machine, machine learning, disease outbreak, cross validation