An Efficient Framework By Topic Model For Multi-Label Text Classification

2019 International Joint Conference on Neural Networks (IJCNN), 2019

Abstract
Most existing multi-label text classification (MLTC) approaches exploit label correlations only through label pairs or label chains. In practice, however, the features of each instance are also important for classification. In this paper, we propose a simple but efficient framework for MLTC called Hybrid Latent Dirichlet Allocation Multi-Label (HLDAML). Specifically, before any multi-label classifier is built, a topic model is applied to the training data to obtain the topics of the text features (a concrete description of each document) and the topics of the label sets (a summarization of each document). These hybrid topics can then be used in existing approaches to improve MLTC performance. Experiments on several benchmark datasets demonstrate that the proposed framework is general and effective when text features and label sets are taken into consideration simultaneously. We also construct a new multi-label dataset, Parkinson, on diagnosing Parkinson's disease with Traditional Chinese Medicine.
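The hybrid-topic idea can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the two topic representations are combined or how label topics are obtained for unlabeled documents, so the code below assumes simple concatenation on training data only. The function names (`lda_doc_topics`, `hybrid_topics`), the hyperparameters, and the toy documents are all hypothetical, and LDA is fitted with a plain collapsed Gibbs sampler from the standard library.

```python
import random

def lda_doc_topics(docs, n_topics, n_iters=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; return one topic distribution per doc.

    docs is a list of token lists: the words of a text, or the labels of a label set.
    """
    rng = random.Random(seed)
    vocab = {tok: i for i, tok in enumerate(sorted({t for d in docs for t in d}))}
    V = len(vocab)
    n_dk = [[0] * n_topics for _ in docs]      # topic counts per document
    n_kw = [[0] * V for _ in range(n_topics)]  # token counts per topic
    n_k = [0] * n_topics                       # total tokens per topic
    z = []                                     # topic assignment of every token
    for d, doc in enumerate(docs):
        z_d = []
        for tok in doc:
            k = rng.randrange(n_topics)        # random initial assignment
            z_d.append(k)
            n_dk[d][k] += 1
            n_kw[k][vocab[tok]] += 1
            n_k[k] += 1
        z.append(z_d)
    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, tok in enumerate(doc):
                w, k = vocab[tok], z[d][i]
                # Remove the token's current assignment from the counts.
                n_dk[d][k] -= 1
                n_kw[k][w] -= 1
                n_k[k] -= 1
                # Conditional probability of each topic for this token.
                weights = [
                    (n_dk[d][j] + alpha) * (n_kw[j][w] + beta) / (n_k[j] + V * beta)
                    for j in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                z[d][i] = k
                n_dk[d][k] += 1
                n_kw[k][w] += 1
                n_k[k] += 1
    # Smoothed document-topic proportions; each row sums to 1.
    return [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in counts]
        for doc, counts in zip(docs, n_dk)
    ]

def hybrid_topics(texts, label_sets, n_text_topics=2, n_label_topics=2):
    """Concatenate text-topic and label-set-topic proportions (an assumed design)."""
    text_topics = lda_doc_topics([t.lower().split() for t in texts], n_text_topics)
    label_topics = lda_doc_topics([sorted(ls) for ls in label_sets],
                                  n_label_topics, seed=1)
    return [t + l for t, l in zip(text_topics, label_topics)]

# Toy training data, invented purely for illustration.
texts = [
    "stock market prices fall",
    "team wins the football match",
    "market reacts to football club sale",
]
label_sets = [{"finance"}, {"sports"}, {"finance", "sports"}]
features = hybrid_topics(texts, label_sets)
```

Each row of `features` could then be appended to a document's original feature vector before training an existing multi-label classifier. At prediction time the label set is unknown, so its topic proportions would have to be estimated from the text in some way; that step is left out here because the abstract does not describe it.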
Keywords
multi-label text classification, topic model, label correlations