A Structured Latent Variable Recurrent Network with Stochastic Attention for Generating Weibo Comments

IJCAI, pp. 3962-3968, 2020.

Keywords:
hierarchical structured latent variable; structured latent variable; word level; discourse level
Weibo:
We propose a Structured latent variable Recurrent Network with Stochastic Attention (SARN), a probabilistic model that exploits both hierarchical-structured latent variables and stochastic attention to promote multi-level diversity of comments.

Abstract:

Building intelligent agents that generate realistic Weibo comments is challenging. For such comments, the key criterion is improving diversity while maintaining coherence. Considering that the variability of linguistic comments arises from multi-level sources, including both discourse-level properties and word-level selections…

Introduction
  • Generating realistic comments for social applications is attracting considerable attention [Shen et al., 2019; Holtzman et al., 2019; Zhang et al., 2018].
  • As shown in Figure 1a, traditional Seq2Seq models directly maximize the likelihood of the output given the input.
  • They tend to generate high-frequency, "safe" but meaningless comments, and such comments lack diversity.
  • For example, given the post "What is your hobby?", such a model may produce a generic reply like "I have no hobby :p" (Figure 1).
  • As with response generation, generating diverse and coherent comments is important for improving the user experience of intelligent agents.
Highlights
  • Generating realistic comments for social applications is attracting considerable attention [Shen et al., 2019; Holtzman et al., 2019; Zhang et al., 2018].
  • We propose a Structured latent variable Recurrent Network with Stochastic Attention (SARN), a probabilistic model that exploits both hierarchical-structured latent variables and stochastic attention to promote multi-level diversity of comments.
  • We reveal an inherent multi-level hierarchy in comment generation and propose a probabilistic model that exploits latent variables and a stochastic attention mechanism.
  • A series of word-level latent variables z^w = {z_t^w}_{t=1}^T is introduced to characterize the variation in word-level selections, generating diverse comments by focusing on different input keywords.
  • Exploiting Stochastic Gradient Variational Bayes (SGVB) [Kingma and Welling, 2013], we introduce an auxiliary distribution q to approximate the true posterior, factorized over the T word-level steps as q(z^d, z^w | x, y) = q(z^d | x, y) ∏_{t=1}^T q(z_t^w | z^d, z_{<t}^w, x, y); see the reparameterization sketch after this list.
  • Experiments show that our model generates more diverse and realistic comments than other methods.
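To make the SGVB step concrete, here is a minimal NumPy sketch of the reparameterization trick and the Gaussian KL term that appears in an ELBO. The diagonal-Gaussian parameterization, the sizes, and the variable names are illustrative assumptions, not the authors' released code.

    import numpy as np

    rng = np.random.default_rng(0)

    def reparameterize(mu, log_var):
        """Sample z = mu + sigma * eps with eps ~ N(0, I), so gradients can
        flow through mu and log_var (the SGVB / reparameterization trick)."""
        eps = rng.standard_normal(mu.shape)
        return mu + np.exp(0.5 * log_var) * eps

    def gaussian_kl(mu_q, log_var_q, mu_p, log_var_p):
        """KL(N(mu_q, var_q) || N(mu_p, var_p)), summed over dimensions; in an
        ELBO this pulls the approximate posterior q toward the prior p."""
        var_q, var_p = np.exp(log_var_q), np.exp(log_var_p)
        return 0.5 * np.sum(log_var_p - log_var_q
                            + (var_q + (mu_q - mu_p) ** 2) / var_p - 1.0)

    # Toy usage: one discourse-level latent z_d and T word-level latents z_w[t].
    d, T = 16, 5
    mu_d, log_var_d = rng.standard_normal(d), np.zeros(d)
    z_d = reparameterize(mu_d, log_var_d)  # drawn once per comment
    z_w = [reparameterize(rng.standard_normal(d), np.zeros(d)) for _ in range(T)]
    print(z_d.shape, len(z_w),
          round(gaussian_kl(mu_d, log_var_d, np.zeros(d), np.zeros(d)), 3))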
Methods
  • The authors reveal an inherent multi-level hierarchy in comment generation and propose a probabilistic model that exploits latent variables and a stochastic attention mechanism.

    Hierarchical structure.
  • To induce a two-level hierarchy, the authors endow the latent variables z^d and z^w with structured dependencies by defining p(y, z^w, z^d | x) = p(y, z^w | z^d, x) p(z^d | x).
  • Each word-level variable z_t^w is conditioned on x and z^d, as well as on the previous latent trajectory z_{<t}^w.
  • These structured dependencies increase the model's capacity to capture multi-level diversity (a minimal sampling sketch follows this list).
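As a schematic of this factorization, the NumPy sketch below ancestrally samples a discourse-level latent z^d given x, then each word-level latent z_t^w given x, z^d, and the previous latent trajectory. The mixing weights and sizes are placeholders, not the paper's parameterization.

    import numpy as np

    rng = np.random.default_rng(1)
    H = 8  # hypothetical latent size

    def sample_comment_latents(x_enc, T):
        """Ancestral sampling mirroring
        p(y, z^w, z^d | x) = p(z^d | x) * prod_t p(z_t^w | z^d, z_{<t}^w, x) * p(y | ...),
        where x_enc stands in for an encoding of the input post x."""
        z_d = x_enc + rng.standard_normal(H)  # discourse-level draw, once per comment
        z_w, prev = [], np.zeros(H)
        for _ in range(T):
            # Word-level draw: conditioned on x, z_d, and the previous trajectory.
            z_t = 0.5 * x_enc + 0.3 * z_d + 0.2 * prev + rng.standard_normal(H)
            z_w.append(z_t)
            prev = z_t
        return z_d, z_w

    z_d, z_w = sample_comment_latents(rng.standard_normal(H), T=4)
    print(z_d.shape, len(z_w))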
Results
  • The authors sample 5 comments for each model for evaluation.
  • The authors adopt beam search in the decoding process for Seq2Seq and Seq2Seq-MMI, with the beam size set to 10 (a toy beam-search sketch follows this list).
  • For VRNN, CVRNN, and SARN, we…
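For reference, here is a toy beam-search sketch over a bigram next-token model (the 4-token vocabulary and its probabilities are placeholders); it illustrates the decoding procedure used for the Seq2Seq baselines, not the authors' implementation.

    import numpy as np

    # Toy next-token log-probabilities; row = previous token, column = next token.
    log_p = np.log(np.array([
        [0.10, 0.50, 0.30, 0.10],
        [0.20, 0.20, 0.50, 0.10],
        [0.40, 0.10, 0.10, 0.40],
        [0.25, 0.25, 0.25, 0.25],
    ]))

    def beam_search(start, steps, beam_size):
        """Keep the beam_size highest-scoring partial sequences at each step."""
        beams = [([start], 0.0)]
        for _ in range(steps):
            candidates = [
                (seq + [tok], score + log_p[seq[-1], tok])
                for seq, score in beams
                for tok in range(log_p.shape[1])
            ]
            candidates.sort(key=lambda c: c[1], reverse=True)
            beams = candidates[:beam_size]
        return beams

    for seq, score in beam_search(start=0, steps=3, beam_size=10)[:3]:
        print(seq, round(score, 3))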
Conclusion
  • The proposed SARN exploits both hierarchical-structured latent variables and stochastic attention to promote multi-level diversity of comments.
  • SARN is closely related to encoder-decoder models; the main difference is that the authors inject multi-level stochastic variations into the generation process, with both hierarchical and temporal dependencies.
  • Experiments show that the model generates more diverse and realistic comments than other methods.
Tables
  • Table 1: Evaluation results from human judgements, BLEU-4 scores, and embedding-based scores (Average, Extrema, and Greedy); a sketch of the embedding metrics follows this list
  • Table 2: Pairwise comparisons from human judgements
  • Table 3: Parameter analysis for hyper-parameter α
  • Table 4: Parameter analysis for hyper-parameter β
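The embedding-based Average, Extrema, and Greedy scores in Table 1 are standard word-embedding similarity metrics. The NumPy sketch below follows their common definitions, with random vectors standing in for real word embeddings; details may differ from the paper's exact implementation.

    import numpy as np

    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

    def average_score(hyp, ref):
        """Cosine between the mean word embeddings of hypothesis and reference."""
        return cos(hyp.mean(axis=0), ref.mean(axis=0))

    def extrema_score(hyp, ref):
        """Cosine between 'extrema' vectors: per dimension, the value with the
        largest magnitude across the sentence's word embeddings."""
        def extrema(e):
            idx = np.abs(e).argmax(axis=0)
            return e[idx, np.arange(e.shape[1])]
        return cos(extrema(hyp), extrema(ref))

    def greedy_score(hyp, ref):
        """Greedy matching: each word is matched to its most similar word on the
        other side; the two directions are averaged."""
        sims = np.array([[cos(h, r) for r in ref] for h in hyp])
        return 0.5 * (sims.max(axis=1).mean() + sims.max(axis=0).mean())

    rng = np.random.default_rng(2)
    hyp, ref = rng.standard_normal((5, 50)), rng.standard_normal((7, 50))  # toy embeddings
    print(round(average_score(hyp, ref), 3),
          round(extrema_score(hyp, ref), 3),
          round(greedy_score(hyp, ref), 3))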
Funding
  • This work was supported in part by the National Key R&D Program of China under Grant 2018AAA0102003; in part by the National Natural Science Foundation of China under Grants 61771457, 61732007, 61672497, 61772494, 61931008, U1636214, 61622211, and 61702491; and in part by the Key Research Program of Frontier Sciences, CAS, under Grant QYZDJ-SSW-SYS013.
References
  • [Bowman et al., 2015] Samuel R. Bowman, Luke Vilnis, Oriol Vinyals, Andrew M. Dai, Rafal Jozefowicz, and Samy Bengio. Generating sentences from a continuous space. arXiv preprint arXiv:1511.06349, 2015.
  • [Cao and Clark, 2017] Kris Cao and Stephen Clark. Latent variable dialogue models and their diversity. In Conference of the European Chapter of the Association for Computational Linguistics, pages 182–187, 2017.
  • [Chung et al., 2015] Junyoung Chung, Kyle Kastner, Laurent Dinh, Kratarth Goel, Aaron Courville, and Yoshua Bengio. A recurrent latent variable model for sequential data. In Advances in Neural Information Processing Systems, 2015.
  • [Eric and Manning, 2017] Mihail Eric and Christopher D Manning. Key-value retrieval networks for task-oriented dialogue. arXiv preprint arXiv:1705.05414, 2017.
  • [Gao et al., 2019] Xiang Gao, Sungjin Lee, Yizhe Zhang, Chris Brockett, Michel Galley, Jianfeng Gao, and Bill Dolan. Jointly optimizing diversity and relevance in neural response generation. In NAACL, 2019.
  • [Goddeau et al., 1996] D. Goddeau, H. Meng, J. Polifroni, and S. Seneff. A form-based dialogue manager for spoken language applications. In International Conference on Spoken Language Processing, volume 2, pages 701–704, 1996.
  • [Holtzman et al., 2019] Ari Holtzman, Jan Buys, Maxwell Forbes, and Yejin Choi. The curious case of neural text degeneration. arXiv preprint arXiv:1904.09751, 2019.
  • [Jieba, 2018] Jieba. https://pypi.org/project/jieba/, 2018.
  • [Kingma and Welling, 2013] Diederik P. Kingma and Max Welling. Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114, 2013.
  • [Li et al., 2016] Jiwei Li, Michel Galley, Chris Brockett, Jianfeng Gao, and Bill Dolan. A diversity-promoting objective function for neural conversation models. In NAACL, 2016.
  • [Liu et al., 2018] Daqing Liu, Zheng-Jun Zha, Hanwang Zhang, Yongdong Zhang, and Feng Wu. Context-aware visual policy network for sequence-level image captioning. In ACM Multimedia, 2018.
  • [Liu et al., 2019] Xuejing Liu, Liang Li, Shuhui Wang, Zheng-Jun Zha, Dechao Meng, and Qingming Huang. Adaptive reconstruction network for weakly supervised referring expression grounding. In ICCV, 2019.
  • [Mou et al., 2016] Lili Mou, Yiping Song, Rui Yan, Ge Li, Lu Zhang, and Zhi Jin. Sequence to backward and forward sequences: A content-introducing approach to generative short-text conversation. arXiv preprint arXiv:1607.00970, 2016.
  • [Park et al., 2018] Yookoon Park, Jaemin Cho, and Gunhee Kim. A hierarchical latent structure for variational conversation modeling. In NAACL, 2018.
  • [Serban et al., 2017] Iulian Vlad Serban, Alessandro Sordoni, Ryan Lowe, Laurent Charlin, Joelle Pineau, Aaron C Courville, and Yoshua Bengio. A hierarchical latent variable encoder-decoder model for generating dialogues. In AAAI, pages 3295–3301, 2017.
  • [Shao et al., 2017] Louis Shao, Stephan Gouws, Denny Britz, Anna Goldie, Brian Strope, and Ray Kurzweil. Generating long and diverse responses with neural conversation models. arXiv preprint arXiv:1701.03185, 2017.
  • [Shen et al., 2017] Xiaoyu Shen, Hui Su, Yanran Li, Wenjie Li, Shuzi Niu, Yang Zhao, Akiko Aizawa, and Guoping Long. A conditional variational framework for dialog generation. arXiv preprint arXiv:1705.00316, 2017.
  • [Shen et al., 2019] Dinghan Shen, Asli Celikyilmaz, Yizhe Zhang, Liqun Chen, Xin Wang, Jianfeng Gao, and Lawrence Carin. Towards generating long and coherent text with multi-level latent variable models. In ACL, 2019.
  • [Sutskever et al., 2014] Ilya Sutskever, Oriol Vinyals, and Quoc V. Le. Sequence to sequence learning with neural networks. In Advances in Neural Information Processing Systems, pages 3104–3112, 2014.
  • [Vijayakumar et al., 2016] Ashwin K Vijayakumar, Michael Cogswell, Ramprasath R Selvaraju, Qing Sun, Stefan Lee, David Crandall, and Dhruv Batra. Diverse beam search: Decoding diverse solutions from neural sequence models. arXiv preprint arXiv:1610.02424, 2016.
  • [Xing et al., 2017] Chen Xing, Wei Wu, Yu Wu, Jie Liu, Yalou Huang, Ming Zhou, and Wei-Ying Ma. Topic aware neural response generation. In AAAI, volume 17, pages 3351–3357, 2017.
  • [Yang et al., 2019] Tianhao Yang, Zheng-Jun Zha, and Hanwang Zhang. Making history matter: History-advantage sequence training for visual dialog. In ICCV, 2019.
  • [Yao et al., 2016] Kaisheng Yao, Baolin Peng, Geoffrey Zweig, and Kam-Fai Wong. An attentional neural conversation model with improved specificity. arXiv preprint arXiv:1606.01292, 2016.
  • [Zha et al., 2019] Zheng-Jun Zha, Daqing Liu, Hanwang Zhang, Yongdong Zhang, and Feng Wu. Context-aware visual policy network for fine-grained image captioning. IEEE Trans. on PAMI, 2019.
  • [Zhang et al., 2018] Yizhe Zhang, Michel Galley, Jianfeng Gao, Zhe Gan, Xiujun Li, Chris Brockett, and Bill Dolan. Generating informative and diverse conversational responses via adversarial information maximization. In Advances in Neural Information Processing Systems, pages 1810–1820, 2018.
  • [Zhao et al., 2017] Tiancheng Zhao, Ran Zhao, and Maxine Eskenazi. Learning discourse-level diversity for neural dialog models using conditional variational autoencoders. In ACL, pages 654–664, 2017.
  • [Zhou et al., 2017a] Ganbin Zhou, Ping Luo, Rongyu Cao, Fen Lin, Bo Chen, and Qing He. Mechanism-aware neural machine for dialogue response generation. In AAAI, 2017.
  • [Zhou et al., 2017b] Hao Zhou, Minlie Huang, Tianyang Zhang, Xiaoyan Zhu, and Bing Liu. Emotional chatting machine: emotional conversation generation with internal and external memory. arXiv preprint arXiv:1704.01074, 2017.