A multi-label emoji classification method using balanced pointwise mutual information-based feature selection

COMPUTER SPEECH AND LANGUAGE(2022)

引用 7|浏览6
暂无评分
摘要
The availability of social media such as twitter allows users to express their feeling, emotions and opinions toward a topic. Emojis are graphic symbols that are regarded as the new generation of emoticons and an effective way of conveying feelings and emotions in social media. With the surging popularity of Emojis, the researchers in the area of Emotion Classification strive to understand the emotion correlated to each Emoji. Two of the most the successful approaches in emoji analysis rely on: 1) official Unicode description and 2) manually built emoji lexicons. Since the use of emoji is socially determined, the former approach is not aligned with intended semantic and usage, which leads researchers to opt for emoji lexicons. To overcome problem of lexiconbased approach, we proposed a method to classify emojis automatically. Therefore, we present a modified Pointwise Mutual Information (PMI) method, called Balanced Pointwise Mutual Information-Based (B-PMI), to develop a balanced weighted emoji classification based on the semantic similarity. Further, deep neural network is used to represent emoji in vector form (emoji embedding) to extend the pre-trained word embeddings. We carefully evaluated the proposed method in multiple twitter datasets that are employed in sentiment and emotion classification using machine learning (ML) and deep learning (DL) approaches. In both approaches, extending word embedding with the proposed emoji embedding improved results. The DL-based approach achieved the highest f1-score of 70.01% for sentiment classification, and accuracy score of 56.36% for emotion classification. ML-based approach obtained accuracy score of 52.17% in emotion classification.
更多
查看译文
关键词
Multi -label classification, Emotion classification, Emoji classification, Emoji lexicon, Sentiment analysis, Pointwise Mutual Information, Natural Language Processing
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要