A Fully Progressive Approach to Single-Image Super-Resolution

CVPR Workshops, pp. 864-873, 2018.

Keywords:
dense compression units, convolutional network, upsampling factor, deep neural network, image super-resolution

Abstract:

Recent deep learning approaches to single image super-resolution have achieved impressive results in terms of traditional error measures and perceptual quality. However, in each case it remains challenging to achieve high quality results for large upsampling factors. To this end, we propose a method (ProSR) that is progressive both in architecture and training: the network upsamples an image in intermediate steps, while the learning process is organized from easy to hard, as in curriculum learning.

Introduction
  • The widespread availability of high resolution displays and rapid advancements in deep learning based image processing have driven approaches to single image super resolution (SISR), which achieve impressive results by learning the mapping from low-resolution (LR) to high-resolution (HR) images from data.

    (Alexander Sorkine-Hornung is at Oculus.)
  • While the first class of approaches, which upscales the image before processing, has a large memory footprint and a high computational cost, the second class, which upsamples only at the end of the network, is more prone to checkerboard artifacts [27] due to the simple concatenation of upsampling layers.
  • It remains challenging to achieve high quality results for large upsampling factors.
Highlights
  • In order to enable multi-scale generative adversarial network-enhanced single image super resolution, we propose a modular and progressive discriminator network similar to the generator network proposed in the previous section
  • The benchmark datasets Set5 [5], Set14 [41], BSD100 [1], Urban100 [17], and the DIV2K validation set [34] are used. As is commonly done in single image super resolution, all evaluations are conducted on the luminance channel.
  • We extend the 4-Dense Compression Unit asymmetric pyramid model to 8× upsampling to quantify the benefit of curriculum learning over simultaneous multi-scale training
  • In this work we propose a progressive approach to address single image super resolution
  • We leverage asymmetric pyramid design and Dense Compression Units in the architecture, both of which lead to improved memory efficiency and reconstruction accuracy
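The asymmetric pyramid design mentioned above can be pictured as a layer-allocation rule: each pyramid level upsamples by 2×, and earlier (lower-resolution) levels receive more layers, since every target scale's output passes through them. The sketch below is purely illustrative; `base_depth` and `decay` are hypothetical hyper-parameters, not the paper's actual configuration.

```python
import math

def asymmetric_pyramid_depths(max_scale, base_depth=8, decay=2):
    """Illustrative layer allocation for an asymmetric pyramid.

    Each level doubles the spatial resolution; lower levels get more
    layers because all target scales reuse them, and computation there
    is cheap.  An 8x target needs log2(8) = 3 pyramid levels.
    """
    levels = int(math.log2(max_scale))
    return [max(base_depth - decay * i, 1) for i in range(levels)]
```

For an 8× model this yields three levels with depths [8, 6, 4], concentrating capacity at the low-resolution end of the network.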
Results
  • Before comparing with popular state-of-the-art approaches, the authors first discuss the benefits of each proposed component using a small 24-layer model.

    All presented models are trained with the DIV2K [34] training set, which contains 800 high-resolution images.
  • The benchmark datasets Set5 [5], Set14 [41], BSD100 [1], Urban100 [17], and the DIV2K validation set [34] are used.
  • As is commonly done in SISR, all evaluations are conducted on the luminance channel.
Conclusion
  • A matching pyramidal discriminator is proposed, which enables optimizing for perceptual quality at multiple scales simultaneously.
  • The authors' model sets a new state-of-the-art benchmark in both traditional error measures and perceptual quality.
  • [Figure legend residue: comparison of models with < 5M parameters (VDSR, DRRN [33], LapSRN, MsLapSRN [22], SRDenseNet, ProSRs) and > 5M parameters (EDSR [24], ProSRl, ours), evaluated at 2×, 4×, and 8× on Set14, BSD100, and Urban100.]
Tables
  • Table 1: Overview of experiments in the ablation study. The introduction of DCUs, block division, an asymmetric pyramid layout, and curriculum learning consistently increases reconstruction quality. Reported PSNR values refer to 4× results on Set14. Runtime is measured for 4× upscaling of a 128 × 128 image.
  • Table 2: Gain of simultaneous training and curriculum learning w.r.t. single-scale training on all datasets. The average is weighted by the number of images in each dataset. Curriculum learning improves training for all scales, while simultaneous training hampers training of the lowest scale.
  • Table 3: Comparison with other progressive approaches.
  • Table 4: Comparison with state-of-the-art approaches. For clarity, we highlight the best approach in blue.
Related work
  • Single image super-resolution (SISR) techniques have been an active area of investigation for more than a decade [12]. The ill-posed nature of the problem has typically been tackled with statistical techniques, most notably image priors such as heavy-tailed gradient distributions [10, 29], gradient profiles [32], multi-scale recurrence [13], self-examples [11], and total variation [26]. In contrast, exemplar-based approaches such as nearest-neighbor [12] and sparse dictionary learning [36, 38, 40] have exploited the inherent redundancy of large-scale image datasets. Recently, Dong et al. [6] showed the superiority of a simple three-layer convolutional neural network (CNN) over sparse coding techniques. Since then, deep convolutional architectures have consistently pushed the state of the art forward.

    Direct vs. Progressive Reconstruction. Direct reconstruction techniques [7, 20, 23, 24, 33, 37] upscale the image to the desired spatial resolution in a single step. Early approaches [7, 20, 33] upscale the LR image in a preprocessing step. Thus, the CNN learns to deblur the input image. However, this requires the network to learn a feature representation for a high-resolution image which is computationally expensive [30]. To overcome this limitation, many approaches opt for operating on the low dimensional features and perform upsampling at the end of the network via sub-pixel convolution [30] or transposed convolution.
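The sub-pixel convolution [30] mentioned above defers upsampling to the last layer via a channel-to-space rearrangement (often called pixel shuffle): a convolution produces C·r² channels at low resolution, which are then rearranged into C channels at r× the resolution. A minimal NumPy sketch of just the rearrangement step (the function name and single-image layout are illustrative):

```python
import numpy as np

def pixel_shuffle(x, r):
    """Rearrange a (C*r^2, H, W) feature map into a (C, H*r, W*r) image.

    Each group of r^2 channels at spatial position (h, w) fills the
    r x r output block starting at (h*r, w*r), so all heavy computation
    can stay at low resolution until this final step.
    """
    c_r2, h, w = x.shape
    assert c_r2 % (r * r) == 0, "channel count must be divisible by r^2"
    c = c_r2 // (r * r)
    # (C, r, r, H, W) -> (C, H, r, W, r) -> (C, H*r, W*r)
    x = x.reshape(c, r, r, h, w)
    x = x.transpose(0, 3, 1, 4, 2)
    return x.reshape(c, h * r, w * r)
```

Transposed convolution achieves the same output size but overlapping kernel strides are what make it prone to the checkerboard artifacts [27] noted earlier; pixel shuffle avoids the overlap by construction.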
Funding
  • Proposes a method that is progressive both in architecture and training: the network upsamples an image in intermediate steps, while the learning process is organized from easy to hard, as is done in curriculum learning
  • Improves the reconstruction accuracy by simplifying the information propagation within the network; proposes to use an asymmetric pyramidal structure with more layers in the lower levels to enable high upsampling ratios while remaining efficient
  • Proposes a progressive solution to learn the upscaling function u
  • Evaluates the progressive multi-scale approach against the state of the art on a variety of datasets, and demonstrates improved performance in terms of traditional error measures as well as perceptual quality for larger upsampling ratios
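The easy-to-hard training described above can be sketched as a phase schedule: training starts with the 2× sub-network alone, and higher scales are unlocked one at a time while the easier scales stay in the mix. This is a hedged illustration of the curriculum idea, not the paper's exact schedule; `steps_per_phase` and the dict layout are assumptions.

```python
def curriculum_phases(max_scale, steps_per_phase):
    """Sketch of a curriculum-learning schedule for progressive SR.

    Phase k trains scales [2, 4, ..., 2^k]; each phase adds the next
    pyramid level on top of the scales already being trained.
    """
    phases = []
    scale, active = 2, []
    while scale <= max_scale:
        active = active + [scale]  # newly unlocked scale joins the mix
        phases.append({"train_scales": list(active),
                       "steps": steps_per_phase})
        scale *= 2
    return phases
```

For an 8× model this produces three phases, ending with all of 2×, 4×, and 8× trained jointly, in contrast to simultaneous multi-scale training from the start, which (per Table 2) hampers the lowest scale.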
References
  • [1] P. Arbelaez, M. Maire, C. Fowlkes, and J. Malik. Contour detection and hierarchical image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(5):898–916, 2011.
  • [2] M. Arjovsky, S. Chintala, and L. Bottou. Wasserstein GAN. arXiv preprint arXiv:1701.07875, 2017.
  • [3] D. Balduzzi, M. Frean, L. Leary, J. P. Lewis, K. W.-D. Ma, and B. McWilliams. The shattered gradients problem: If resnets are the answer, then what is the question? In Proceedings of the 34th International Conference on Machine Learning, volume 70 of Proceedings of Machine Learning Research, pages 342–350. PMLR, 2017.
  • [4] Y. Bengio, J. Louradour, R. Collobert, and J. Weston. Curriculum learning. In Proceedings of the 26th Annual International Conference on Machine Learning, pages 41–48. ACM, 2009.
  • [5] M. Bevilacqua, A. Roumy, C. Guillemot, and M. L. Alberi-Morel. Low-complexity single-image super-resolution based on nonnegative neighbor embedding. In British Machine Vision Conference (BMVC), 2012.
  • [6] C. Dong, C. C. Loy, K. He, and X. Tang. Learning a deep convolutional network for image super-resolution. In European Conference on Computer Vision (ECCV), pages 184–199, 2014.
  • [7] C. Dong, C. C. Loy, K. He, and X. Tang. Image super-resolution using deep convolutional networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 38(2):295–307, 2016.
  • [8] C. Dong, C. C. Loy, and X. Tang. Accelerating the super-resolution convolutional neural network. In European Conference on Computer Vision (ECCV), pages 391–407. Springer, 2016.
  • [9] Y. Fan, H. Shi, J. Yu, D. Liu, W. Han, H. Yu, Z. Wang, X. Wang, and T. S. Huang. Balanced two-stage residual networks for image super-resolution. In IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pages 1157–1164, 2017.
  • [10] C. Fernandez-Granda and E. J. Candes. Super-resolution via transform-invariant group-sparse regularization. In IEEE International Conference on Computer Vision (ICCV), pages 3336–3343, 2013.
  • [11] G. Freedman and R. Fattal. Image and video upscaling from local self-examples. ACM Transactions on Graphics, 30(2), 2011.
  • [12] W. T. Freeman, T. R. Jones, and E. C. Pasztor. Example-based super-resolution. IEEE Computer Graphics and Applications, 22(2):56–65, 2002.
  • [13] D. Glasner, S. Bagon, and M. Irani. Super-resolution from a single image. In IEEE International Conference on Computer Vision (ICCV), pages 349–356, 2009.
  • [14] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio. Generative adversarial nets. In Advances in Neural Information Processing Systems, pages 2672–2680, 2014.
  • [15] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 770–778, 2016.
  • [16] G. Huang, Z. Liu, K. Q. Weinberger, and L. van der Maaten. Densely connected convolutional networks. arXiv preprint arXiv:1608.06993, 2016.
  • [17] J.-B. Huang, A. Singh, and N. Ahuja. Single image super-resolution from transformed self-exemplars. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 5197–5206, 2015.
  • [18] P. Isola, J.-Y. Zhu, T. Zhou, and A. A. Efros. Image-to-image translation with conditional adversarial networks. arXiv preprint arXiv:1611.07004, 2016.
  • [19] T. Karras, T. Aila, S. Laine, and J. Lehtinen. Progressive growing of GANs for improved quality, stability, and variation. arXiv preprint arXiv:1710.10196, 2017.
  • [20] J. Kim, J. Kwon Lee, and K. Mu Lee. Accurate image super-resolution using very deep convolutional networks. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 1646–1654, 2016.
  • [21] W.-S. Lai, J.-B. Huang, N. Ahuja, and M.-H. Yang. Deep Laplacian pyramid networks for fast and accurate super-resolution. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
  • [22] W.-S. Lai, J.-B. Huang, N. Ahuja, and M.-H. Yang. Fast and accurate image super-resolution with deep Laplacian pyramid networks. arXiv preprint arXiv:1710.01992, 2017.
  • [23] C. Ledig, L. Theis, F. Huszar, J. Caballero, A. Cunningham, A. Acosta, A. Aitken, A. Tejani, J. Totz, Z. Wang, et al. Photo-realistic single image super-resolution using a generative adversarial network. arXiv preprint arXiv:1609.04802, 2016.
  • [24] B. Lim, S. Son, H. Kim, S. Nah, and K. M. Lee. Enhanced deep residual networks for single image super-resolution. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2017.
  • [25] X. Mao, Q. Li, H. Xie, R. Y. Lau, Z. Wang, and S. P. Smolley. Least squares generative adversarial networks. arXiv preprint arXiv:1611.04076, 2016.
  • [26] A. Marquina and S. Osher. Image super-resolution by TV-regularization and Bregman iteration. Journal of Scientific Computing, 37(3):367–382, 2008.
  • [27] A. Odena, V. Dumoulin, and C. Olah. Deconvolution and checkerboard artifacts. Distill, 2016.
  • [28] M. S. M. Sajjadi, B. Scholkopf, and M. Hirsch. EnhanceNet: Single image super-resolution through automated texture synthesis. arXiv preprint arXiv:1612.07919, 2016.
  • [29] Q. Shan, Z. Li, J. Jia, and C. Tang. Fast image/video upsampling. ACM Transactions on Graphics, 27(5):153:1–153:7, 2008.
  • [30] W. Shi, J. Caballero, F. Huszar, J. Totz, A. P. Aitken, R. Bishop, D. Rueckert, and Z. Wang. Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 1874–1883, 2016.
  • [31] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.
  • [32] J. Sun, Z. Xu, and H. Shum. Image super-resolution using gradient profile prior. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2008.
  • [33] Y. Tai, J. Yang, and X. Liu. Image super-resolution via deep recursive residual network. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
  • [34] R. Timofte, E. Agustsson, L. Van Gool, M.-H. Yang, L. Zhang, et al. NTIRE 2017 challenge on single image super-resolution: Methods and results. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2017.
  • [35] R. Timofte, S. Gu, J. Wu, L. Van Gool, L. Zhang, M.-H. Yang, et al. NTIRE 2018 challenge on single image super-resolution: Methods and results. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2018.
  • [36] R. Timofte, V. D. Smet, and L. J. V. Gool. Anchored neighborhood regression for fast example-based super-resolution. In IEEE International Conference on Computer Vision (ICCV), pages 1920–1927, 2013.
  • [37] T. Tong, G. Li, X. Liu, and Q. Gao. Image super-resolution using dense skip connections. In IEEE International Conference on Computer Vision (ICCV), 2017.
  • [38] J. Yang, J. Wright, T. S. Huang, and Y. Ma. Image super-resolution as sparse representation of raw image patches. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2008.
  • [39] Z. Yang, K. Zhang, Y. Liang, and J. Wang. Single image super-resolution with a parameter economic residual-like convolutional neural network. In International Conference on Multimedia Modeling, pages 353–364. Springer, 2017.
  • [40] R. Zeyde, M. Elad, and M. Protter. On single image scale-up using sparse-representations. In Curves and Surfaces: 7th International Conference, pages 711–730, 2010.
  • [41] R. Zeyde, M. Elad, and M. Protter. On single image scale-up using sparse-representations. In International Conference on Curves and Surfaces, pages 711–730. Springer, 2010.