Self-Supervised Learning in the Twilight of Noisy Real-World Datasets.

ICMLA (2022)

Abstract
Despite efforts to benchmark self-supervised learning (SSL) methods on transfer learning tasks for image recognition, our understanding of their performance on noisy real-world datasets remains limited. This paper presents an extensive analysis of various types of SSL methods on real-world datasets containing noisy images of wildlife. These uncurated images are captured automatically by motion-activated cameras, or camera traps, installed in the wild. The camera-trap datasets exhibit different types of biases typically present in practical tasks. Using a set of biased datasets of varying sizes, we compare the supervised learning (SL) method to two types of SSL methods: instance discrimination and cluster discrimination. Our results reveal nuances in SSL’s performance. For example, we show that although SSL methods often generalize better than the SL method, the performance gain of some SSL methods diminishes as the size of the target dataset decreases. The effectiveness of the two types of SSL methods also varies significantly. In addition, we show that, unlike SL, both types of SSL benefit from increased model capacity.
Keywords
self-supervised learning, generalizability, representation learning, real-world datasets, biased datasets