TextureGAN: Controlling Deep Image Synthesis with Texture Patches

Wenqi Xian
Patsorn Sangkloy
Varun Agrawal
Amit Raj

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 8456–8465, 2018.

Cited by: 92
Keywords:
local texture, Variational Autoencoders, texture synthesis, traditional 3d, texture patch

Abstract:

In this paper, we investigate deep image synthesis guided by sketch, color, and texture. Previous image synthesis methods can be controlled by sketch and color strokes but we are the first to examine texture control. We allow a user to place a texture patch on a sketch at arbitrary locations and scales to control the desired output texture…
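The control mechanism described in the abstract, placing a texture patch on a sketch at a chosen location and scale, can be made concrete with a small input-assembly sketch. The PyTorch snippet below is illustrative only: the function name place_texture_patch, the channel layout, and the use of a separate placement mask are assumptions for exposition, not the paper's exact input format.

```python
import torch
import torch.nn.functional as F

def place_texture_patch(sketch, texture, center_yx, scale):
    """sketch: (1, H, W) edge map in [0, 1]; texture: (1, h, w) grayscale
    texture crop in [0, 1]; center_yx: (row, col) chosen by the user;
    scale: resize factor applied to the texture patch."""
    _, H, W = sketch.shape
    # Resize the texture patch to the user-chosen scale.
    patch = F.interpolate(texture.unsqueeze(0), scale_factor=scale,
                          mode="bilinear", align_corners=False).squeeze(0)
    # Crop the patch if it exceeds the canvas, then clamp the placement.
    h, w = min(patch.shape[1], H), min(patch.shape[2], W)
    patch = patch[:, :h, :w]
    y0 = max(0, min(H - h, center_yx[0] - h // 2))
    x0 = max(0, min(W - w, center_yx[1] - w // 2))

    texture_plane = torch.zeros_like(sketch)  # texture values where placed
    mask_plane = torch.zeros_like(sketch)     # 1 where the user placed texture
    texture_plane[:, y0:y0 + h, x0:x0 + w] = patch
    mask_plane[:, y0:y0 + h, x0:x0 + w] = 1.0

    # Channel-concatenate sketch, placed texture, and placement mask into a
    # single conditioning tensor for a sketch-guided generator.
    return torch.cat([sketch, texture_plane, mask_plane], dim=0)  # (3, H, W)
```

A generator (not shown) would then map such a conditioning tensor to an output image; because the mapping is feed-forward, moving or rescaling the patch only requires re-running this assembly step and one forward pass.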

Introduction
  • One of the “Grand Challenges” of computer graphics is to allow anyone to author realistic visual content.
  • The traditional 3d rendering pipeline can produce astonishing and realistic imagery, but only in the hands of talented and trained artists.
  • The idea of short-circuiting the traditional 3d modeling and rendering pipeline dates back to image-based rendering [33].
  • In the last two years, the idea of direct image synthesis without using the traditional rendering pipeline has attracted significant interest because of promising results from deep network architectures such as Variational Autoencoders (VAEs) [21] and Generative Adversarial Networks (GANs) [11].
  • There has been little investigation of fine-grained texture control in deep image synthesis.
Highlights
  • One of the “Grand Challenges” of computer graphics is to allow anyone to author realistic visual content
  • In this paper we introduce TextureGAN, the first deep image synthesis method which allows users to control object texture
  • We explore novel losses for training deep image synthesis
  • We have presented an approach for controlling deep image synthesis with input sketch and texture patches
  • TextureGAN is feed-forward which allows users to see the effect of their edits in real time
  • By training TextureGAN with local texture constraints, we demonstrate its effectiveness on sketch and texture-based image synthesis
Results
  • In Figure 4, given the input sketch, texture patch and color patch, the network trained with the complete objective function correctly propagates the color and texture to the entire handbag.
  • If the texture loss is turned off, texture details within the area of the input patch are preserved, but difficult textures cannot be fully propagated to the rest of the bag.
  • The authors' ablation experiment confirms that style loss alone is not sufficient to encourage texture propagation, motivating the local patch-based texture loss (Section 3.2.1); a hedged sketch of one such loss follows this list.
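To make the ablation discussion concrete, below is a minimal sketch of a local, patch-based texture loss in the spirit of Section 3.2.1: Gram-matrix (style) statistics are matched between small patches cropped from the generated output and from the source texture, rather than over the whole image. The VGG layer choice, patch size, and random sampling scheme are assumptions for illustration; the paper's full local texture objective also includes adversarial and pixel terms on these patches, which are omitted here.

```python
import torch
import torch.nn.functional as F
import torchvision.models as models

# Frozen VGG-19 feature extractor (layers up to relu2_2) for texture statistics.
# Inputs are assumed to be ImageNet-normalized RGB tensors.
vgg_features = models.vgg19(weights=models.VGG19_Weights.DEFAULT).features[:9].eval()
for p in vgg_features.parameters():
    p.requires_grad_(False)

def gram(feat):
    """Channel-by-channel correlation (Gram) matrix of a feature map."""
    b, c, h, w = feat.shape
    f = feat.view(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

def local_texture_loss(output, texture, patch_size=64, n_patches=4):
    """output: (B, 3, H, W) generated image; texture: (B, 3, Ht, Wt) source
    texture. Randomly crops patches from each and penalizes the difference
    of their Gram matrices, so texture statistics are enforced locally."""
    loss = 0.0
    H, W = output.shape[-2:]
    Ht, Wt = texture.shape[-2:]
    for _ in range(n_patches):
        y = torch.randint(0, H - patch_size + 1, (1,)).item()
        x = torch.randint(0, W - patch_size + 1, (1,)).item()
        ty = torch.randint(0, Ht - patch_size + 1, (1,)).item()
        tx = torch.randint(0, Wt - patch_size + 1, (1,)).item()
        g_out = gram(vgg_features(output[..., y:y + patch_size, x:x + patch_size]))
        g_tex = gram(vgg_features(texture[..., ty:ty + patch_size, tx:tx + patch_size]))
        loss = loss + F.mse_loss(g_out, g_tex)
    return loss / n_patches
```

Sampling patches (rather than computing a single global Gram matrix) is what pushes the texture to spread beyond the user-placed patch, which is the behavior the ablation above attributes to the local loss.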
Conclusion
  • The authors have presented an approach for controlling deep image synthesis with input sketch and texture patches.
  • With this system, a user can sketch the object structure and precisely control the generated details with texture patches.
  • TextureGAN is feed-forward which allows users to see the effect of their edits in real time.
  • By training TextureGAN with local texture constraints, the authors demonstrate its effectiveness on sketch and texture-based image synthesis.
  • The authors hope to apply the network to more complex scenes.
Related work
  • Image Synthesis. Synthesizing natural images has been one of the most intriguing and challenging tasks in graphics, vision, and machine learning research. Existing approaches can be grouped into non-parametric and parametric methods. On one hand, non-parametric approaches have a long-standing history: they are typically data-driven or example-based, i.e., they directly exploit and borrow existing image pixels for the desired task [1, 3, 6, 13, 33]. As a result, non-parametric approaches often excel at generating realistic results but have limited generalization ability, since they are restricted by the available data and examples, e.g., by data bias and incomplete coverage of long-tail distributions. On the other hand, parametric approaches, especially deep learning based approaches, have achieved promising results in recent years. Unlike non-parametric methods, these approaches use image datasets as training data to fit deep parametric models, and have shown superior modeling power and generalization ability in image synthesis [11, 21], e.g., hallucinating diverse and relatively realistic images that differ from the training data.
Funding
  • This work is supported by a Royal Thai Government Scholarship to Patsorn Sangkloy and NSF award 1561968
References
  • [1] C. Barnes, E. Shechtman, A. Finkelstein, and D. Goldman. PatchMatch: A randomized correspondence algorithm for structural image editing. ACM Transactions on Graphics (TOG), 28(3):24, 2009.
  • [2] U. Bergmann, N. Jetchev, and R. Vollgraf. Learning texture manifolds with the periodic spatial GAN. arXiv preprint arXiv:1705.06566, 2017.
  • [3] T. Chen, M.-M. Cheng, P. Tan, A. Shamir, and S.-M. Hu. Sketch2Photo: Internet image montage. ACM SIGGRAPH Asia, 2009.
  • [4] A. Dosovitskiy, J. T. Springenberg, and T. Brox. Learning to generate chairs with convolutional neural networks. CoRR, abs/1411.5928, 2014.
  • [5] A. A. Efros and W. T. Freeman. Image quilting for texture synthesis and transfer. In Proceedings of the 28th Annual Conference on Computer Graphics and Interactive Techniques, pages 341–346. ACM, 2001.
  • [6] A. A. Efros and T. K. Leung. Texture synthesis by non-parametric sampling. In Proceedings of the Seventh IEEE International Conference on Computer Vision, volume 2, pages 1033–1038. IEEE, 1999.
  • [7] H. Fang and J. C. Hart. Textureshop: Texture synthesis as a photograph editing tool. ACM Transactions on Graphics, 23(3):354–359, 2004.
  • [8] L. Gatys, A. S. Ecker, and M. Bethge. Texture synthesis using convolutional neural networks. In Advances in Neural Information Processing Systems, pages 262–270, 2015.
  • [9] L. A. Gatys, A. S. Ecker, and M. Bethge. Image style transfer using convolutional neural networks. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 2414–2423, 2016.
  • [10] L. A. Gatys, A. S. Ecker, M. Bethge, A. Hertzmann, and E. Shechtman. Controlling perceptual factors in neural style transfer. arXiv preprint arXiv:1611.07865, 2016.
  • [11] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio. Generative adversarial nets. In Advances in Neural Information Processing Systems, pages 2672–2680, 2014.
  • [12] Y. Gucluturk, U. Guclu, R. van Lier, and M. A. van Gerven. Convolutional sketch inversion. In Proceedings of the ECCV Workshop on VISART: Where Computer Vision Meets Art, 2016.
  • [13] J. Hays and A. A. Efros. Scene completion using millions of photographs. In ACM Transactions on Graphics (TOG), volume 26, page 4. ACM, 2007.
  • [14] A. Hertzmann, C. E. Jacobs, N. Oliver, B. Curless, and D. H. Salesin. Image analogies. In Proceedings of the 28th Annual Conference on Computer Graphics and Interactive Techniques, pages 327–340. ACM, 2001.
  • [15] X. Huang and S. Belongie. Arbitrary style transfer in real-time with adaptive instance normalization. arXiv preprint arXiv:1703.06868, 2017.
  • [16] S. Iizuka, E. Simo-Serra, and H. Ishikawa. Globally and locally consistent image completion. ACM Transactions on Graphics (Proc. of SIGGRAPH 2017), 36(4):107:1–107:14, 2017.
  • [17] P. Isola, J.-Y. Zhu, T. Zhou, and A. A. Efros. Image-to-image translation with conditional adversarial networks. arXiv preprint arXiv:1611.07004, 2016.
  • [18] N. Jetchev, U. Bergmann, and R. Vollgraf. Texture synthesis with spatial generative adversarial networks. arXiv preprint arXiv:1611.08207, 2016.
  • [19] J. Johnson, A. Alahi, and L. Fei-Fei. Perceptual losses for real-time style transfer and super-resolution. In European Conference on Computer Vision, pages 694–711. Springer, 2016.
  • [20] D. Kingma and J. Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
  • [21] D. P. Kingma and M. Welling. Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114, 2013.
  • [22] J.-F. Lalonde, D. Hoiem, A. A. Efros, C. Rother, J. Winn, and A. Criminisi. Photo clip art. ACM Transactions on Graphics (SIGGRAPH 2007), 26(3):3, 2007.
  • [23] G. Larsson, M. Maire, and G. Shakhnarovich. Learning representations for automatic colorization. In European Conference on Computer Vision (ECCV), 2016.
  • [24] C. Lassner, G. Pons-Moll, and P. V. Gehler. A generative model for people in clothing. In Proceedings of the IEEE International Conference on Computer Vision, 2017.
  • [25] C. Li and M. Wand. Precomputed real-time texture synthesis with Markovian generative adversarial networks. In European Conference on Computer Vision, pages 702–716. Springer, 2016.
  • [26] Y. Li, C. Fang, J. Yang, Z. Wang, X. Lu, and M.-H. Yang. Diversified texture synthesis with feed-forward networks. arXiv preprint arXiv:1703.01664, 2017.
  • [27] X. Liang, S. Liu, X. Shen, J. Yang, L. Liu, J. Dong, L. Lin, and S. Yan. Deep human parsing with active template regression. IEEE Transactions on Pattern Analysis and Machine Intelligence, 37(12):2402–2414, 2015.
  • [28] X. Liang, C. Xu, X. Shen, J. Yang, S. Liu, J. Tang, L. Lin, and S. Yan. Human parsing with contextualized convolutional neural network. In Proceedings of the IEEE International Conference on Computer Vision, pages 1386–1394, 2015.
  • [29] Y. Liu, Z. Qin, Z. Luo, and H. Wang. Auto-painter: Cartoon image generation from sketch by using conditional generative adversarial networks. arXiv preprint arXiv:1705.01908, 2017.
  • [30] Z. Liu, P. Luo, S. Qiu, X. Wang, and X. Tang. DeepFashion: Powering robust clothes recognition and retrieval with rich annotations. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
  • [31] Z. Liu, S. Yan, P. Luo, X. Wang, and X. Tang. Fashion landmark detection in the wild. In European Conference on Computer Vision (ECCV), 2016.
  • [32] X. Mao, Q. Li, H. Xie, R. Y. Lau, Z. Wang, and S. P. Smolley. Least squares generative adversarial networks. arXiv preprint arXiv:1611.04076, 2016.
  • [33] L. McMillan and G. Bishop. Plenoptic modeling: An image-based rendering system. In Proceedings of the 22nd Annual Conference on Computer Graphics and Interactive Techniques, pages 39–46. ACM, 1995.
  • [34] A. Odena, C. Olah, and J. Shlens. Conditional image synthesis with auxiliary classifier GANs. arXiv preprint arXiv:1610.09585, 2016.
  • [35] A. Radford, L. Metz, and S. Chintala. Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434, 2015.
  • [36] S. Reed, Z. Akata, X. Yan, L. Logeswaran, B. Schiele, and H. Lee. Generative adversarial text to image synthesis. In Proceedings of the 33rd International Conference on Machine Learning (ICML), 2016.
  • [37] P. Sangkloy, J. Lu, C. Fang, F. Yu, and J. Hays. Scribbler: Controlling deep image synthesis with sketch and color. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
  • [38] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. In Proceedings of the International Conference on Learning Representations (ICLR), 2015.
  • [39] D. Ulyanov, V. Lebedev, A. Vedaldi, and V. Lempitsky. Texture networks: Feed-forward synthesis of textures and stylized images. In International Conference on Machine Learning (ICML), 2016.
  • [40] L.-Y. Wei and M. Levoy. Fast texture synthesis using tree-structured vector quantization. In Proceedings of the 27th Annual Conference on Computer Graphics and Interactive Techniques, pages 479–488. ACM Press/Addison-Wesley, 2000.
  • [41] H. Winnemöller, J. E. Kyprianidis, and S. C. Olsen. XDoG: An extended difference-of-Gaussians compendium including advanced image stylization. Computers & Graphics, 36(6):740–753, 2012.
  • [42] S. Xie and Z. Tu. Holistically-nested edge detection. In Proceedings of the IEEE International Conference on Computer Vision, 2015.
  • [43] C. Yang, X. Lu, Z. Lin, E. Shechtman, O. Wang, and H. Li. High-resolution image inpainting using multi-scale neural patch synthesis. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
  • [44] D. Yoo, N. Kim, S. Park, A. S. Paek, and I. S. Kweon. Pixel-level domain transfer. In European Conference on Computer Vision, pages 517–532. Springer, 2016.
  • [45] A. Yu and K. Grauman. Fine-grained visual comparisons with local learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 192–199, 2014.
  • [46] H. Zhang and K. Dana. Multi-style generative network for real-time transfer. arXiv preprint arXiv:1703.06953, 2017.
  • [47] R. Zhang, P. Isola, and A. A. Efros. Colorful image colorization. In European Conference on Computer Vision (ECCV), 2016.
  • [48] R. Zhang, J.-Y. Zhu, P. Isola, X. Geng, A. S. Lin, T. Yu, and A. A. Efros. Real-time user-guided image colorization with learned deep priors. ACM Transactions on Graphics (TOG), 36(4), 2017.
  • [49] J.-Y. Zhu, P. Krahenbuhl, E. Shechtman, and A. A. Efros. Generative visual manipulation on the natural image manifold. In Proceedings of the European Conference on Computer Vision (ECCV), 2016.