Fusion that matters: convolutional fusion networks for visual recognition

Multimedia Tools Appl.(2018)

引用 19|浏览52
暂无评分
摘要
In recent years, deep learning has been successfully applied to diverse multimedia research areas, with the aim of learning powerful and informative representations for a variety of visual recognition tasks. In this work, we propose convolutional fusion networks (CFN) to integrate multi-level deep features and fuse a richer visual representation. Despite recent advances in deep fusion networks, they still have limitations due to expensive parameters and weak fusion modules. Instead, CFN uses 1 × 1 convolutional layers and global average pooling to generate side branches with few parameters, and employs a locally-connected fusion module, which can learn adaptive weights for different side branches and form a better fused feature. Specifically, we introduce three key components of the proposed CFN, and discuss its differences from other deep models. Moreover, we propose fully convolutional fusion networks (FCFN) that are an extension of CFN for pixel-level classification applied to several tasks, such as semantic segmentation and edge detection. Our experiments demonstrate that CFN (and FCFN) can achieve promising performance by consistent improvements for both image-level and pixel-level classification tasks, compared to a plain CNN. We release our codes on https://github.com/yuLiu24/CFN . Also, we make a live demo ( goliath.liacs.nl ) using a CFN model trained on the ImageNet dataset.
更多
查看译文
关键词
Deep fusion networks,Locally-connected fusion,Image-level recognition,Pixel-level recognition,Transferring deep features
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要