TLC-XML: Transformer with Label Correlation for Extreme Multi-label Text Classification

Neural Processing Letters(2024)

引用 0|浏览5
暂无评分
摘要
Extreme multi-label text classification (XMTC) annotates related labels for unknown text from large-scale label sets. Transformer-based methods have become the dominant approach for solving the XMTC task due to their effective text representation capabilities. However, the existing Transformer-based methods fail to effectively exploit the correlation between labels in the XMTC task. To address this shortcoming, we propose a novel model called TLC-XML, i.e., a Transformer with label correlation for extreme multi-label text classification. TLC-XML comprises three modules: Partition, Matcher and Ranker. In the Partition module, we exploit the semantic and co-occurrence information of labels to construct the label correlation graph, and further partition the strongly correlated labels into the same cluster. In the Matcher module, we propose cluster correlation learning, which uses the graph convolutional network (GCN) to extract the correlation between clusters. We then introduce these valuable correlations into the classifier to match related clusters. In the Ranker module, we propose label interaction learning, which aggregates the raw label prediction with the information of the neighboring labels. The experimental results on benchmark datasets show that TLC-XML significantly outperforms state-of-the-art XMTC methods.
更多
查看译文
关键词
Extreme multi-label text classification,Label correlation,Graph convolutional network,Transformer model
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要