Another scale-guided parallel transformer for image aesthetic assessment

Journal of Electronic Imaging (2023)

Abstract
Image aesthetic assessment (IAA) is a challenging task in computer vision that aims to automatically evaluate image beauty by simulating human perception of image aesthetics. Although convolutional neural network (CNN)-based IAA approaches have made considerable progress with the development of deep learning, CNNs have difficulty capturing long-distance relationships among visual elements. This matters because image aesthetics depend on a strong correlation between image layout and image semantic information. To address this problem, another scale-guided parallel transformer is proposed, consisting of a multiscale local feature extractor (ME), a feature projection (FP), and another scale-guided parallel feature fusion transformer (AST). The ME captures primary local features at multiple scales with a classic ResNet. The FP performs a dimension transformation on the feature maps of each scale to obtain feature tokens and an aesthetic token. The AST, which has two parallel transformer encoders, highlights the significant regions in the holistic image: the feature tokens of one scale are grouped with the aesthetic token from another scale to obtain interscale guidance. The final score distribution is obtained by weighting the multiple aesthetic tokens with learnable parameters for unified aesthetics assessment. Extensive experiments on two public datasets, Aesthetic Visual Analysis (AVA) and the Aesthetics and Attributes Database (AADB), demonstrate that the proposed method outperforms state-of-the-art methods across three different tasks.
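The token grouping and the weighted fusion described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the transformer encoders are omitted entirely, and the function names (`group_tokens`, `fuse_scores`), the token dimension, the token counts, the 10-bin score head, and the softmax normalization of the fusion weights are all assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)

def group_tokens(feature_tokens, aesthetic_token_other):
    """Prepend the aesthetic token of the *other* scale to this scale's
    feature tokens (hypothetical helper).

    feature_tokens:        (N, D) tokens of one scale
    aesthetic_token_other: (D,)   aesthetic token of the other scale
    returns:               (N + 1, D) sequence for one encoder
    """
    return np.concatenate([aesthetic_token_other[None, :], feature_tokens], axis=0)

def fuse_scores(aesthetic_tokens, head_weight, fusion_logits):
    """Weight per-scale score distributions with learnable parameters
    (hypothetical helper; softmax-normalized weights are an assumption).

    aesthetic_tokens: (S, D)    one aesthetic token per scale
    head_weight:      (D, bins) shared linear score head (assumed)
    fusion_logits:    (S,)      learnable fusion parameters
    returns:          (bins,)   fused score distribution
    """
    per_scale = softmax(aesthetic_tokens @ head_weight, axis=-1)  # (S, bins)
    weights = softmax(fusion_logits)                              # (S,)
    return weights @ per_scale                                    # (bins,)

# Toy inputs: two scales with different token counts (all sizes assumed).
rng = np.random.default_rng(0)
dim, bins = 64, 10
tokens_a, tokens_b = rng.normal(size=(49, dim)), rng.normal(size=(196, dim))
aes_a, aes_b = rng.normal(size=dim), rng.normal(size=dim)

# Each branch receives its own feature tokens plus the other scale's aesthetic token.
seq_a = group_tokens(tokens_a, aes_b)   # shape (50, 64)
seq_b = group_tokens(tokens_b, aes_a)   # shape (197, 64)

# Encoders omitted: the first token stands in for the updated aesthetic token.
updated = np.stack([seq_a[0], seq_b[0]])
dist = fuse_scores(updated, rng.normal(size=(dim, bins)), np.zeros(2))
```

Because the fusion weights are normalized here, the fused output is a convex combination of valid distributions and therefore sums to one. In a real model, each grouped sequence would pass through its own transformer encoder before the aesthetic tokens are read out.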
Keywords
parallel transformer,image,scale-guided