A novel Chinese–Tibetan mixed-language rumor detector with multi-extractor representations

Computer Speech & Language(2024)

引用 0|浏览0
暂无评分
摘要
Rumors can easily propagate through social media, posing potential threats to both individual and public health. Most existing approaches focus on single-language rumor detection, which leads to unsatisfying performance when these are applied to mixed-language rumor detection. Meanwhile, the type of mixed-language (mixture of word-level or sentence-level) is a great challenge for mixed-language rumor detection. In this paper, focusing on a mixed scene of Chinese and Tibetan, the research first provides a Chinese–Tibetan mixed-language rumor detection dataset (Weibo_Ch_Ti) that comprises 1,617 non-rumor tweets and 1,456 rumor tweets in two mixed-language types. Then, the research proposes an effective model with multi-extractors, named ”MER-CTRD” for short. This model mainly consists of three extractors. The Multi-task Extractor helps the model to extract feature representations of different mixed-language types adaptively. The Rich-semantic Extractor enriches the semantic features representations of Tibetan in the Chinese–Tibetan-mixed language. The Fusion-feature Extractor fuses the mean and disparity semantic features of Chinese and Tibetan to complement feature representations of the mixed language. Finally, the research conducts experiments on Weibo_Ch_Ti. The results show that the proposed model improves accuracy by about 3%–12% over the baseline models, indicating its effectiveness in the Chinese–Tibetan mixed-language rumor detection scenario.
更多
查看译文
关键词
Rumor detection,Chinese Tibetan mixed language,Dataset,Multi extractor,Feature representations
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要