CLCIFAR: CIFAR-Derived Benchmark Datasets with Human Annotated Complementary Labels

Hsiu-Hsuan Wang, Wei-I Lin, Hsuan-Tien Lin

CoRR (2023)

Abstract
As a weakly supervised learning paradigm, complementary label learning (CLL) aims to learn a multi-class classifier from only complementary labels, that is, classes to which an instance does not belong. Although various studies have addressed how to learn from complementary labels, those methods typically rely on distributional assumptions about the complementary labels and are benchmarked only on synthetic datasets. It remains unclear how the noise or bias arising from the human annotation process would affect those CLL algorithms. To fill this gap, we design a protocol to collect complementary labels annotated by humans. Two datasets, CLCIFAR10 and CLCIFAR20, based on CIFAR10 and CIFAR100, respectively, are collected. We analyze the empirical transition matrices of the collected datasets and observe that they are noisy and biased. We then perform extensive benchmark experiments on the collected datasets with various CLL algorithms to validate whether the existing algorithms can learn from real-world complementary datasets. The datasets can be accessed at the following link: https://github.com/ntucllab/complementary_cifar.
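The empirical transition matrix mentioned in the abstract can be illustrated with a minimal sketch. It assumes the usual reading, in which entry T[i][j] estimates the probability that an instance of true class i receives complementary label j; the function names and the toy labels below are hypothetical and are not taken from the CLCIFAR repository. Under that reading, mass on the diagonal indicates noise (the annotator picked the true class), and uneven off-diagonal mass indicates bias.

```python
def empirical_transition_matrix(true_labels, comp_labels, num_classes):
    """Estimate T[i][j] = P(complementary label = j | true label = i)."""
    counts = [[0] * num_classes for _ in range(num_classes)]
    for y, cl in zip(true_labels, comp_labels):
        counts[y][cl] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # Rows for classes with no samples stay all-zero (avoids division by zero).
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


def noise_rate(true_labels, comp_labels):
    """Fraction of complementary labels that coincide with the true label."""
    wrong = sum(1 for y, cl in zip(true_labels, comp_labels) if y == cl)
    return wrong / len(true_labels)


# Toy data, made up for illustration only (not CLCIFAR annotations).
true_labels = [0, 0, 0, 0, 1, 1, 2, 2]
comp_labels = [1, 1, 2, 0, 0, 2, 0, 1]

T = empirical_transition_matrix(true_labels, comp_labels, num_classes=3)
rate = noise_rate(true_labels, comp_labels)

for row in T:
    print(row)
print("noise rate:", rate)
```

In this toy example, the first row has a nonzero diagonal entry (one noisy label) and puts twice as much mass on class 1 as on class 2 (a biased choice), which is the kind of pattern the abstract describes for the human-annotated labels.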
Keywords
benchmark datasets, complementary labels, human annotated