Commonsense Knowledge Aware Conversation Generation with Graph Attention

IJCAI, pp. 4623-4629, 2018.

Keywords:
semantic information, large-scale commonsense knowledge, commonsense knowledge aware conversational model, processing task, conversation generation
Weibo:
We present a commonsense knowledge aware conversational model to demonstrate how commonsense knowledge can facilitate language understanding and generation in open-domain conversational systems

Abstract:

Commonsense knowledge is vital to many natural language processing tasks. In this paper, we present a novel open-domain conversation generation model to demonstrate how large-scale commonsense knowledge can facilitate language understanding and generation. Given a user post, the model retrieves relevant knowledge graphs from a knowledge base and then encodes the graphs with a static graph attention mechanism, which augments the semantic information of the post and thus supports better understanding of the post. Then, during word generation, the model attentively reads the retrieved knowledge graphs and the knowledge triples within each graph to facilitate better generation through a dynamic graph attention mechanism. This is the first attempt that uses large-scale commonsense knowledge in conversation generation. Furthermore, unlike existing models that use knowledge triples (entities) separately and independently, our model treats each knowledge graph as a whole, which encodes more structured, connected semantic information in the graphs. Experiments show that the proposed model can generate more appropriate and informative responses than state-of-the-art baselines.

Introduction
  • Semantic understanding, particularly when facilitated by commonsense knowledge or world facts, is essential to many natural language processing tasks [Wang et al, 2017; Lin et al, 2017], and undoubtedly, it is a key factor in the success of dialogue or conversational systems, as conversational interaction is a semantic activity [Eggins and Slade, 2005].
  • A variety of neural models have been proposed for conversation generation [Ritter et al, 2011; Shang et al, 2015]
  • These models tend to generate generic responses, which fail to respond appropriately and informatively in most cases, because it is challenging to learn semantic interactions merely from conversational data [Ghazvininejad et al, 2017] without a deep understanding of the user input, the background knowledge, and the context of the conversation.
Highlights
  • Semantic understanding, particularly when facilitated by commonsense knowledge or world facts, is essential to many natural language processing tasks [Wang et al, 2017; Lin et al, 2017], and undoubtedly, it is a key factor to the success of dialogue or conversational systems, as conversational interaction is a semantic activity [Eggins and Slade, 2005]
  • In open-domain conversational systems, commonsense knowledge is important for establishing effective interactions, since socially shared commonsense knowledge is the set of background information people are assumed to know and use during conversation [Minsky, 1991; Markova et al., 2007; Speer and Havasi, 2012; Souto, 2015]
  • To address the two issues, we propose a commonsense knowledge aware conversational model (CCM) to facilitate language understanding and generation in open-domain conversational systems
  • Commonsense Knowledge Base: ConceptNet is used as the commonsense knowledge base
  • We present a commonsense knowledge aware conversational model (CCM) to demonstrate how commonsense knowledge can facilitate language understanding and generation in open-domain conversational systems
  • Automatic and manual evaluation show that the commonsense knowledge aware conversational model can generate more appropriate and informative responses than state-of-the-art baselines
Methods
  • Commonsense Knowledge Base: ConceptNet is used as the commonsense knowledge base
  • It contains not only world facts such as “Paris is the capital of France” that are always true, but also informal relations between common concepts that are part of everyday knowledge, such as “A dog is a pet”.
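As a concrete illustration of the bullet above, here is a minimal Python sketch (an assumption about representation, not the authors' code) of how ConceptNet-style triples can be indexed so that each word of a post retrieves its one-hop knowledge graph; the sample triples and the helper name `retrieve_graphs` are hypothetical.

```python
# Hypothetical sketch: indexing ConceptNet-style triples so a post's
# words can pull in their one-hop knowledge graphs. Not the CCM code.
from collections import defaultdict

# Each fact is a (head, relation, tail) triple, as in ConceptNet.
triples = [
    ("paris", "CapitalOf", "france"),
    ("dog", "IsA", "pet"),
    ("dog", "CapableOf", "bark"),
]

# Index every triple under both of its concepts.
index = defaultdict(list)
for h, r, t in triples:
    index[h].append((h, r, t))
    index[t].append((h, r, t))

def retrieve_graphs(post):
    """Return one small knowledge graph (triple list) per post word."""
    return {w: index[w] for w in post.lower().split() if w in index}

print(retrieve_graphs("my dog is cute"))
# {'dog': [('dog', 'IsA', 'pet'), ('dog', 'CapableOf', 'bark')]}
```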
Conclusion
  • Conclusion and Future Work: In this paper, the authors present a commonsense knowledge aware conversational model (CCM) to demonstrate how commonsense knowledge can facilitate language understanding and generation in open-domain conversational systems.
  • Automatic and manual evaluation show that CCM can generate more appropriate and informative responses than state-of-the-art baselines.
Tables
  • Table1: Statistics of the dataset and the knowledge base
  • Table2: Automatic evaluation with perplexity (ppx.) and entity score (ent.); a sketch of the entity score follows this list
  • Table3: Manual evaluation with appropriateness (app.) and informativeness (inf.). The score is the percentage of comparisons that CCM wins against its competitor after removing “Tie” pairs. CCM is significantly better (sign test, p-value < 0.005) than all the baselines on all the test sets
  • Table4: Sample responses generated by all the models
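For Table 2's entity score, a plausible computation consistent with the caption is the average number of knowledge-base entities appearing per generated response; a minimal sketch, where `entity_vocab` is a hypothetical stand-in for the set of ConceptNet entities:

```python
# Sketch of the entity score: average number of knowledge-base
# entities per response. `entity_vocab` is a hypothetical stand-in
# for the set of ConceptNet entities used by the models.
entity_vocab = {"dog", "pet", "paris", "france", "bark"}

def entity_score(responses):
    counts = [sum(w in entity_vocab for w in r.lower().split())
              for r in responses]
    return sum(counts) / len(counts)

print(entity_score(["a dog is a pet", "i agree"]))  # (2 + 0) / 2 = 1.0
```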
Related work
  • Open-domain Conversational Models: Recently, sequence-to-sequence models [Sutskever et al, 2014; Bahdanau et al, 2014] have been successfully applied to large-scale conversation generation, including the neural responding machine [Shang et al, 2015], hierarchical recurrent models [Serban et al, 2015], and many others [Sordoni et al, 2015]. These models developed various techniques to improve the content quality of generated responses, including diversity promotion [Li et al, 2016; Shao et al, 2017], considering additional information [Xing et al, 2017; Mou et al, 2016], and handling unknown words [Gu et al, 2016]. However, generic or meaningless responses are still common in these models because they lack a good understanding of the user input and other context.
  • Unstructured Texts Enhanced Conversational Models: Several studies incorporated unstructured texts as external knowledge into conversation generation [Ghazvininejad et al, 2017; Long et al, 2017]. [Ghazvininejad et al, 2017] used a memory network that stores unstructured texts to improve conversation generation. [Long et al, 2017] applied a convolutional neural network to extract knowledge from unstructured texts to generate multi-turn conversations. However, these models largely depend on the quality of the unstructured texts, which may introduce noise into conversation generation if the texts are irrelevant.
  • Structured Knowledge Enhanced Conversational Models: Some models introduced high-quality structured knowledge for conversation generation [Han et al, 2015; Zhu et al, 2017; Xu et al, 2017]. [Xu et al, 2017] incorporated a structured domain-specific knowledge base into conversation generation with a recall-gate mechanism. [Zhu et al, 2017] presented an end-to-end knowledge-grounded conversational model using a copy network [Gu et al, 2016]. However, these studies are limited by their small domain-specific knowledge bases, making them inapplicable to open-domain, open-topic conversation generation. By contrast, our model applies a large-scale commonsense knowledge base to facilitate both the understanding of a post and the generation of a response, with novel graph attention mechanisms.
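To make the contrast with triple-by-triple models concrete, the following NumPy sketch shows one plausible form of static graph attention over a single retrieved graph: each triple (h_n, r_n, t_n) receives an attention weight, and the graph is encoded as the weighted sum of its concatenated head/tail embeddings. The shapes, weight matrices, and scoring function here are illustrative assumptions, not the released CCM implementation.

```python
# Illustrative static graph attention over one retrieved graph.
import numpy as np

rng = np.random.default_rng(0)
d, n_triples = 8, 3                        # embedding size, triples in graph

H = rng.normal(size=(n_triples, d))        # head entity embeddings
R = rng.normal(size=(n_triples, d))        # relation embeddings
T = rng.normal(size=(n_triples, d))        # tail entity embeddings
Wh, Wr, Wt = (rng.normal(size=(d, d)) for _ in range(3))

# Score each triple, then softmax over the triples of the graph:
# beta_n = (Wr r_n)^T tanh(Wh h_n + Wt t_n)
beta = np.einsum("nd,nd->n", R @ Wr.T, np.tanh(H @ Wh.T + T @ Wt.T))
alpha = np.exp(beta - beta.max())
alpha /= alpha.sum()

# Graph vector: attention-weighted sum of concatenated [head; tail].
g = (alpha[:, None] * np.concatenate([H, T], axis=1)).sum(axis=0)
print(g.shape)  # (16,) -- a single vector summarizing the graph
```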
Funding
  • This work was partly supported by the National Science Foundation of China under grant No. 61272227/61332007 and the National Basic Research Program (973 Program) under grant No. 2013CB329403
Study subjects and analysis
pairs: 10000
The statistics can be seen in Table 1. We randomly sampled 10,000 pairs for validation. To test how commonsense knowledge can help understand common or rare concepts in a post, we constructed four test sets: high-frequency pairs in which each post has all top 25% frequent words, medium-frequency pairs where each post contains at least one word whose frequency is within the range of 25%-75%, low-frequency pairs within the range of 75%-100%, and OOV pairs where each post contains out-of-vocabulary words

pairs: 5000
To test how commonsense knowledge can help understand common or rare concepts in a post, we constructed four test sets: high-frequency pairs in which each post has all top 25% frequent words, medium-frequency pairs where each post contains at least one word whose frequency is within the range of 25%-75%, low-frequency pairs within the range of 75%-100%, and OOV pairs where each post contains out-of-vocabulary words. Each test set has 5,000 pairs randomly sampled from the dataset. Implementation Details: Our model was implemented with TensorFlow
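A minimal sketch of how these four frequency-based test sets could be constructed, assuming a vocabulary ordered from most to least frequent; the helper name `build_test_sets` and the precedence between overlapping buckets are assumptions, since the excerpt does not pin them down.

```python
# Hypothetical reconstruction of the four test-set buckets; not the
# authors' preprocessing scripts.
def build_test_sets(pairs, vocab):
    """pairs: (post, response) tuples; vocab: words ordered by
    corpus frequency, most frequent first."""
    rank = {w: i / len(vocab) for i, w in enumerate(vocab)}
    buckets = {"high": [], "medium": [], "low": [], "oov": []}
    for post, resp in pairs:
        ranks = [rank.get(w) for w in post.split()]
        if any(r is None for r in ranks):
            buckets["oov"].append((post, resp))    # has an OOV word
        elif all(r < 0.25 for r in ranks):
            buckets["high"].append((post, resp))   # all top-25% words
        elif any(r >= 0.75 for r in ranks):
            buckets["low"].append((post, resp))    # a word in 75%-100%
        else:
            buckets["medium"].append((post, resp)) # a word in 25%-75%
    return buckets
```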

posts: 400
Manual Evaluation: We resorted to a crowdsourcing service, Amazon Mechanical Turk, for manual annotation. 400 posts were randomly sampled for annotation. We conducted pair-wise comparisons between the response generated by CCM and the one generated by

pairs: 1200
a baseline for the same post. In total, there are 1,200 pairs since we have three baselines. For each response pair, seven judges were hired to give a preference between the two responses in terms of two metrics: appropriateness and informativeness
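A minimal sketch of how such pair-wise annotations might be aggregated and checked with the sign test reported in Table 3; the vote layout, the majority rule, and the numbers below are toy assumptions about the annotation format, not the paper's data.

```python
# Toy sketch: majority vote over 7 judges per response pair, win rate
# after tie removal, and a two-sided sign test.
from math import comb

def sign_test_p(wins, losses):
    """Two-sided sign test p-value under a fair-coin null."""
    n, k = wins + losses, max(wins, losses)
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# votes[i] = how many of the 7 judges preferred CCM on pair i.
votes = [6, 5, 7, 2, 4, 4, 6, 3, 5, 7]   # toy data for 10 pairs
wins = sum(v > 3 for v in votes)         # majority prefers CCM
losses = len(votes) - wins
print(wins / (wins + losses), sign_test_p(wins, losses))
# 0.8 0.109375
```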

References
  • [Bahdanau et al., 2014] Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. Neural machine translation by jointly learning to align and translate. CoRR, abs/1409.0473, 2014.
  • [Bordes et al., 2013] Antoine Bordes, Nicolas Usunier, Alberto Garcia-Duran, Jason Weston, and Oksana Yakhnenko. Translating embeddings for modeling multi-relational data. In NIPS, pages 2787–2795, 2013.
  • [Cho et al., 2014] Kyunghyun Cho, Bart van Merrienboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. Learning phrase representations using RNN encoder-decoder for statistical machine translation. In EMNLP, pages 1724–1734, 2014.
  • [Eggins and Slade, 2005] Suzanne Eggins and Diana Slade. Analysing casual conversation. Equinox Publishing Ltd., 2005.
  • [Ghazvininejad et al., 2017] Marjan Ghazvininejad, Chris Brockett, Ming-Wei Chang, Bill Dolan, Jianfeng Gao, Wen-tau Yih, and Michel Galley. A knowledge-grounded neural conversation model. CoRR, abs/1702.01932, 2017.
  • [Gu et al., 2016] Jiatao Gu, Zhengdong Lu, Hang Li, and Victor O.K. Li. Incorporating copying mechanism in sequence-to-sequence learning. In ACL, pages 1631–1640, 2016.
  • [Han et al., 2015] Sangdo Han, Jeesoo Bang, Seonghan Ryu, and Gary Geunbae Lee. Exploiting knowledge base to generate responses for natural language dialog listening agents. In SIGDIAL, pages 129–133, 2015.
  • [Li et al., 2016] Jiwei Li, Michel Galley, Chris Brockett, Jianfeng Gao, and Bill Dolan. A diversity-promoting objective function for neural conversation models. In NAACL, pages 110–119, 2016.
  • [Lin et al., 2017] Hongyu Lin, Le Sun, and Xianpei Han. Reasoning with heterogeneous knowledge for commonsense machine comprehension. In EMNLP, pages 2032–2043, 2017.
  • [Long et al., 2017] Yinong Long, Jianan Wang, Zhen Xu, Zongsheng Wang, Baoxun Wang, and Zhuoran Wang. A knowledge enhanced generative conversational service agent. In DSTC6 Workshop, 2017.
  • [Markova et al., 2007] Ivana Markova, Per Linell, Michele Grossen, and Anne Salazar Orvig. Dialogue in focus groups: Exploring socially shared knowledge. Equinox Publishing, 2007.
  • [Minsky, 1991] Marvin Minsky. Society of mind: a response to four reviews. Artificial Intelligence, 48(3):371–396, 1991.
  • [Mou et al., 2016] Lili Mou, Yiping Song, Rui Yan, Ge Li, Lu Zhang, and Zhi Jin. Sequence to backward and forward sequences: A content-introducing approach to generative short-text conversation. In COLING, pages 3349–3358, 2016.
  • [Ritter et al., 2011] Alan Ritter, Colin Cherry, and William B. Dolan. Data-driven response generation in social media. In EMNLP, pages 583–593, 2011.
  • [Serban et al., 2015] Iulian Vlad Serban, Alessandro Sordoni, Yoshua Bengio, Aaron C. Courville, and Joelle Pineau. Hierarchical neural network generative models for movie dialogues. CoRR, abs/1507.04808, 2015.
  • [Shang et al., 2015] Lifeng Shang, Zhengdong Lu, and Hang Li. Neural responding machine for short-text conversation. In ACL, pages 1577–1586, 2015.
  • [Shao et al., 2017] Louis Shao, Stephan Gouws, Denny Britz, Anna Goldie, Brian Strope, and Ray Kurzweil. Generating long and diverse responses with neural conversation models. CoRR, abs/1701.03185, 2017.
  • [Sordoni et al., 2015] Alessandro Sordoni, Michel Galley, Michael Auli, Chris Brockett, Yangfeng Ji, Margaret Mitchell, Jian-Yun Nie, Jianfeng Gao, and Bill Dolan. A neural network approach to context-sensitive generation of conversational responses. In NAACL, pages 196–205, 2015.
  • [Souto, 2015] Patrícia Cristina Nascimento Souto. Creating knowledge with and from the differences: the required dialogicality and dialogical competences. RAI Revista de Administração e Inovação, 12(2):60–89, 2015.
  • [Speer and Havasi, 2012] Robert Speer and Catherine Havasi. Representing general relational knowledge in ConceptNet 5. In LREC, pages 3679–3686, 2012.
  • [Sutskever et al., 2014] Ilya Sutskever, Oriol Vinyals, and Quoc V. Le. Sequence to sequence learning with neural networks. In NIPS, pages 3104–3112, 2014.
  • [Velickovic et al., 2017] Petar Velickovic, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Lio, and Yoshua Bengio. Graph attention networks. CoRR, abs/1710.10903, 2017.
  • [Wang et al., 2017] Bingning Wang, Kang Liu, and Jun Zhao. Conditional generative adversarial networks for commonsense machine comprehension. In IJCAI, pages 4123–4129, 2017.
  • [Xing et al., 2017] Chen Xing, Wei Wu, Yu Wu, Jie Liu, Yalou Huang, Ming Zhou, and Wei-Ying Ma. Topic aware neural response generation. In AAAI, pages 3351–3357, 2017.
  • [Xu et al., 2017] Zhen Xu, Bingquan Liu, Baoxun Wang, Chengjie Sun, and Xiaolong Wang. Incorporating loose-structured knowledge into conversation modeling via recall-gate LSTM. In IJCNN, pages 3506–3513. IEEE, 2017.
  • [Zhu et al., 2017] Wenya Zhu, Kaixiang Mo, Yu Zhang, Zhangbin Zhu, Xuezheng Peng, and Qiang Yang. Flexible end-to-end dialogue system for knowledge grounded conversation. CoRR, abs/1709.04264, 2017.
Best Paper of IJCAI, 2018