Does resistance to style-transfer equal Global Shape Bias? Measuring network sensitivity to global shape configuration
CoRR (2023)

Abstract
Deep learning models are known to exhibit a strong texture bias, while humans
tend to rely heavily on global shape structure for object recognition. The
current benchmark for evaluating a model's global shape bias is a set of
style-transferred images, built on the assumption that resistance to the attack
of style transfer reflects the development of global structure sensitivity in
the model. In this work, we show that networks trained with style-transferred
images indeed learn to ignore style, but their shape bias arises primarily from
local detail. We provide a Disrupted Structure Testbench (DiST) as a direct
measurement of global structure sensitivity. Our test includes 2400 original
images from ImageNet-1K, each accompanied by two images in which the global
shape of the original is disrupted while its texture is preserved via a texture
synthesis program. We found that (1) models that performed well on the previous
cue-conflict dataset do not fare well on the proposed DiST; (2) the
supervised-trained Vision Transformer (ViT) loses the global spatial
information from its positional embedding, leading to no significant advantage
over Convolutional Neural Networks (CNNs) on DiST, while self-supervised
learning methods, especially the masked autoencoder, significantly improve the
global structure sensitivity of ViT; (3) improving global structure sensitivity
is orthogonal to resistance to style transfer, indicating that the relationship
between global shape structure and local texture detail is not either/or.
Training with DiST images and training with style-transferred images are
complementary, and the two can be combined to enhance both the global shape
sensitivity and the robustness of local features. Our code will be hosted on
GitHub: https://github.com/leelabcnbc/DiST
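The DiST evaluation described above pairs each original image with two
shape-disrupted, texture-preserving counterparts; a model that relies on
global shape should lose accuracy on the disrupted copies. The sketch below
is a minimal illustration of that idea as an accuracy-gap score. The function
name and the exact scoring rule are our assumptions for illustration, not the
paper's published metric.

```python
def dist_sensitivity(preds_original, preds_disrupted, labels):
    """Accuracy drop from original images to their shape-disrupted versions.

    preds_original: one predicted class per image.
    preds_disrupted: a pair of predictions per image, one for each of the
        two structure-disrupted copies described in the abstract.
    labels: the ground-truth class per image.
    A larger gap suggests stronger reliance on global shape configuration.
    """
    n = len(labels)
    acc_orig = sum(p == y for p, y in zip(preds_original, labels)) / n
    acc_dist = sum(p == y
                   for pair, y in zip(preds_disrupted, labels)
                   for p in pair) / (2 * n)
    return acc_orig - acc_dist


# Toy example with hard-coded predictions for three images:
labels = [0, 1, 2]
preds_original = [0, 1, 2]                   # correct on all originals
preds_disrupted = [(0, 3), (4, 5), (6, 7)]   # mostly wrong once shape is broken
print(dist_sensitivity(preds_original, preds_disrupted, labels))
```

In practice `preds_original` and `preds_disrupted` would come from running a
classifier over the 2400 DiST triplets; the toy values here only demonstrate
the arithmetic.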