Learning from Noisy Similar and Dissimilar Data

Machine Learning and Knowledge Discovery in Databases, ECML PKDD 2021: Research Track, Part II (2021)

Abstract

With the widespread use of machine learning for classification, it is becoming increasingly important to be able to use weaker kinds of supervision for tasks in which it is hard to obtain standard labeled data. One such kind of supervision is provided pairwise in the form of Similar (S) pairs (if two examples belong to the same class) and Dissimilar (D) pairs (if two examples belong to different classes). This kind of supervision is realistic in privacy-sensitive domains. Although the basic version of this problem has been studied recently, it is still unclear how to learn from such supervision under label noise, which is very common when the supervision is, for instance, crowd-sourced. In this paper, we close this gap and demonstrate how to learn a classifier from noisy S and D labeled pairs. We perform a detailed investigation of this problem under two realistic noise models and propose two algorithms to learn from noisy SD data. We also show important connections between learning from such pairwise supervision data and learning from ordinary class-labeled data. Finally, we perform experiments on synthetic and real-world datasets and show that our noise-informed algorithms outperform existing baselines in learning from noisy pairwise data.
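
To make the pairwise setting concrete, the following is a minimal sketch of how Similar/Dissimilar pairs might be derived from hidden class labels and then corrupted. The function name `make_sd_pairs`, the parameter `flip_prob`, and the independent pair-flipping noise are illustrative assumptions; the abstract does not specify its two noise models, so this is not necessarily either of them.

```python
import random

def make_sd_pairs(labels, n_pairs, flip_prob=0.0, seed=0):
    """Sample index pairs and tag each as 'S' (same class) or 'D' (different class).

    flip_prob is a hypothetical noise rate: each pair label is flipped
    independently with this probability (an assumed noise model).
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_pairs):
        # Draw two distinct example indices.
        i, j = rng.sample(range(len(labels)), 2)
        clean = "S" if labels[i] == labels[j] else "D"
        noisy = clean
        if rng.random() < flip_prob:
            noisy = "D" if clean == "S" else "S"
        pairs.append((i, j, clean, noisy))
    return pairs

# Toy binary class labels; a learner would see only the noisy pair tags.
labels = [0, 0, 1, 1, 0, 1]
pairs = make_sd_pairs(labels, n_pairs=200, flip_prob=0.2)
observed_flip_rate = sum(c != n for _, _, c, n in pairs) / len(pairs)
```

Here the underlying class labels never reach the learner, which reflects why this kind of supervision suits privacy-sensitive settings; `observed_flip_rate` only checks that the simulated noise level is roughly what was requested.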

Keywords

Classification, Pairwise supervision, Noisy supervision
