A Comparative Study on Two Ground Truth Inference Algorithms based on Manually Labeled Social Media Data

2019 IEEE 16th International Conference on Networking, Sensing and Control (ICNSC), 2019

Abstract
In the information era, smart devices such as smartphones accompany people's lives at all times. Social media platforms give users uninterrupted communication and access to information, including posting their feelings and sharing ideas. This study focuses on short texts posted by users. The true meaning of a short text is defined as its ground truth. However, acquiring it directly from the users is extremely difficult and time-consuming, so in many cases short texts have no known ground truth. Thus, we deal with a no-ground-truth problem. In this work, we ask labelers to label short texts based entirely on their own judgment of these texts. Two ground truth inference approaches, majority voting (MV) and positive label frequency threshold (PLAT), integrate the labels from different labelers and infer the ground truth. We then analyze which of the two is better suited to labeling unlabeled short texts. The work helps in obtaining useful knowledge from massive social media data.
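The label-integration step described in the abstract can be illustrated with a minimal sketch. It assumes binary labels (1 for positive, 0 for negative) and a list of labels per short text. The function names are hypothetical, and `positive_frequency_vote` is only a simplified stand-in for PLAT: it uses a fixed, caller-supplied threshold, whereas the abstract does not say how PLAT derives its threshold.

```python
from collections import Counter


def majority_vote(labels):
    """Return the label given by most labelers for one short text.

    Ties are broken by whichever tied label appears first in `labels`
    (an arbitrary choice made for this sketch).
    """
    counts = Counter(labels)
    return counts.most_common(1)[0][0]


def positive_frequency_vote(labels, threshold=0.5, positive=1, negative=0):
    """Simplified stand-in for a positive-label-frequency rule.

    The text is labeled positive when the fraction of positive labels
    reaches `threshold`. The fixed threshold is an assumption of this
    sketch, not the method described in the paper.
    """
    if not labels:
        raise ValueError("at least one label is required")
    frequency = sum(1 for label in labels if label == positive) / len(labels)
    return positive if frequency >= threshold else negative


# One short text labeled by four labelers, only one of whom marked it positive.
labels_for_text = [1, 0, 0, 0]
mv_result = majority_vote(labels_for_text)
freq_result = positive_frequency_vote(labels_for_text, threshold=0.2)
```

With these inputs the two rules disagree: majority voting follows the three negative labels, while the frequency rule with a threshold of 0.2 accepts a single positive label out of four. This kind of disagreement is what a comparison of the two approaches has to resolve.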
Keywords
Social media data, short text classification, ground truth inference algorithms