Semantic Graphs for Generating Deep Questions

ACL, pp. 1463-1475, 2020.

Keywords: question answering, question generation, neural question generation, content selection

Abstract:

This paper proposes the problem of Deep Question Generation (DQG), which aims to generate complex questions that require reasoning over multiple pieces of information in the input passage. In order to capture the global structure of the document and facilitate reasoning, we propose a novel framework which first constructs a semantic-level graph of the input document, and then jointly trains content selection and question decoding over the graph-enhanced document representations.
Introduction
  • Question Generation (QG) systems play a vital role in question answering (QA), dialogue systems, and automated tutoring applications – by enriching training QA corpora, helping chatbots start conversations with intriguing questions, and automatically generating assessment questions, respectively.
  • Example (a), Shallow Question Generation: Answer: Photosynthesis.
  • Example (b), Deep Question Generation. Input Paragraph A (Pago Pago International Airport): Pago Pago International Airport, known as Tafuna Airport, is a public airport located 7 miles (11.3 km) southwest of the central business district of Pago Pago, in the village and plains of Tafuna on the island of Tutuila in American Samoa, an unincorporated territory of the United States.
  • Question: Are Pago Pago International Airport and Hoonah Airport both on American territory? Answer: Yes.
Highlights
  • We propose the problem of Deep Question Generation (DQG), which aims to generate questions that require reasoning over multiple pieces of information in the passage
  • To efficiently leverage the semantic graph for Deep Question Generation, we introduce three novel mechanisms: (1) a novel graph encoder, which incorporates an attention mechanism into the Gated Graph Neural Network (GGNN) (Li et al., 2016) to dynamically model the interactions between different semantic relations; (2) a fusion of the word-level passage embeddings and the node-level semantic graph representations into a unified semantic-aware passage representation for question decoding; and (3) an auxiliary content selection task, jointly trained with question decoding, which helps the model select relevant contexts in the semantic graph to form a proper reasoning chain
  • We propose the problem of Deep Question Generation to generate questions that require reasoning over multiple disjoint pieces of information
  • We propose a novel framework which incorporates semantic graphs to enhance the input document representations and generate questions by jointly training with the task of content selection
  • Experiments on the HotpotQA dataset demonstrate that introducing the semantic graph significantly reduces semantic errors, and that content selection benefits the selection of and reasoning over disjoint relevant contents, leading to questions of better quality
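The graph encoder in mechanism (1) above can be illustrated with a minimal sketch: a single propagation step in which neighbor messages are re-weighted by attention before a GRU-style gated state update, as in the original GGNN. All parameter names, shapes, and the bilinear attention form here are assumptions for illustration, not the paper's exact Att-GGNN.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax_masked(scores, mask):
    """Row-wise softmax restricted to positions where mask is True;
    rows with no valid positions get all-zero attention."""
    scores = np.where(mask, scores, -1e9)
    e = np.exp(scores - scores.max(axis=1, keepdims=True)) * mask
    denom = e.sum(axis=1, keepdims=True)
    return np.divide(e, denom, out=np.zeros_like(e), where=denom > 0)

class AttGGNNLayer:
    """One propagation step of a GGNN whose neighbor aggregation is
    re-weighted by attention (an illustrative sketch, not the paper's model)."""

    def __init__(self, d):
        s = 0.1
        self.W_msg = rng.normal(0, s, (d, d))   # message transform
        self.W_att = rng.normal(0, s, (d, d))   # bilinear attention
        self.W_z, self.U_z = rng.normal(0, s, (d, d)), rng.normal(0, s, (d, d))
        self.W_r, self.U_r = rng.normal(0, s, (d, d)), rng.normal(0, s, (d, d))
        self.W_h, self.U_h = rng.normal(0, s, (d, d)), rng.normal(0, s, (d, d))

    def __call__(self, h, adj):
        msg = h @ self.W_msg                     # (n, d) candidate messages
        scores = (h @ self.W_att) @ h.T          # (n, n) pairwise scores
        alpha = softmax_masked(scores, adj > 0)  # attend only over neighbors
        a = alpha @ msg                          # (n, d) aggregated message
        # GRU-style gated update, as in the original GGNN.
        z = sigmoid(a @ self.W_z + h @ self.U_z)
        r = sigmoid(a @ self.W_r + h @ self.U_r)
        h_new = np.tanh(a @ self.W_h + (r * h) @ self.U_h)
        return (1 - z) * h + z * h_new
```

A full model would run several such steps, condition on edge (relation) types, and learn the parameters by training; the random initialization here only demonstrates the data flow.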
Methods
  • As illustrated in the introduction, the semantic relations between entities serve as strong clues for determining what to ask about and which reasoning types are involved.
  • To distill such semantic information in the document, the authors explore both SRL- (Semantic Role Labeling) and DP-based (Dependency Parsing) methods to construct the semantic graph.
  • The authors add inter-tuple edges between nodes from different tuples if they have an inclusive relationship or potentially mention the same entity.
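The inter-tuple linking rule above can be approximated with simple string heuristics. This is a hedged sketch: `should_link`, the containment test standing in for the "inclusive relationship", and the token-overlap threshold are illustrative assumptions, not the paper's actual implementation.

```python
def word_overlap(a, b):
    """Jaccard overlap between two phrases' lowercase token sets."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / max(len(sa | sb), 1)

def should_link(node_a, node_b, threshold=0.5):
    """Heuristic stand-in for inter-tuple linking: link two nodes from
    different tuples if one phrase contains the other (an inclusive
    relationship) or they likely mention the same entity (here
    approximated by high token overlap)."""
    a, b = node_a.lower(), node_b.lower()
    if a in b or b in a:  # inclusive relationship
        return True
    return word_overlap(node_a, node_b) >= threshold

# Example: candidate links between nodes of two SRL tuples.
tuple1 = ["Pago Pago International Airport", "is located",
          "7 miles southwest of Pago Pago"]
tuple2 = ["Tafuna Airport", "known as", "Pago Pago International Airport"]
links = [(x, y) for x in tuple1 for y in tuple2 if should_link(x, y)]
```

With the example tuples, only the repeated airport mention is linked; a real system would typically add coreference resolution on top of such string matching.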
Results
  • On the HotpotQA deep-question-centric dataset, the model greatly improves performance on questions requiring reasoning over multiple facts, leading to state-of-the-art performance.
  • These works are trained and evaluated on SQuAD (Rajpurkar et al., 2016), which the authors argue is insufficient to evaluate deep QG, because more than 80% of its questions are shallow and only involve information confined to a single sentence (Du et al., 2017)
Conclusion
  • The authors propose the problem of DQG to generate questions that require reasoning over multiple disjoint pieces of information.
  • To this end, the authors propose a novel framework which incorporates semantic graphs to enhance the input document representations and generate questions by jointly training with the task of content selection.
  • A graph structure that accurately represents the semantic meaning of the document is crucial to the model.
  • The authors' method can be improved by explicitly modeling the reasoning chains in the generation of deep questions, inspired by related methods (Lin et al., 2018; Jiang and Bansal, 2019) in multi-hop question answering.
Tables
  • Table1: Performance comparison with baselines and the ablation study. The best performance is in bold
  • Table2: Human evaluation results for different methods on inputs with different lengths. Flu., Rel., and Cpx. denote the Fluency, Relevance, and Complexity, respectively. Each metric is rated on a 1–5 scale (5 for the best)
  • Table3: Error analysis on 3 different methods, with respect to 5 major error types (excluding "Correct"). Pred. and G.T. show the example of the predicted question and the ground-truth question, respectively. Semantic Error: the question has a logic or commonsense error; Answer Revealing: the question reveals the answer; Ghost Entity: the question refers to entities that do not occur in the document; Redundant: the question contains unnecessary repetition; Unanswerable: the question has none of the above errors but cannot be answered from the document
Related work
  • Question generation aims to automatically generate questions from textual inputs. Rule-based techniques for QG usually rely on manually designed rules or templates to transform a given piece of text into questions (Heilman, 2011; Chali and Hasan, 2012). These methods are confined by their transformation rules or templates, making them difficult to generalize. Neural-based approaches take advantage of the sequence-to-sequence (Seq2Seq) framework with attention (Bahdanau et al., 2014). These models are trained in an end-to-end manner, requiring far less labor and enabling better language flexibility than rule-based methods. A comprehensive survey of QG can be found in Pan et al. (2019).
Funding
  • This research is supported by the National Research Foundation, Singapore under its International Research Centres in Singapore Funding Initiative
Reference
  • Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2014. Neural machine translation by jointly learning to align and translate. CoRR, abs/1409.0473.
  • Nicola De Cao, Wilker Aziz, and Ivan Titov. 2019. Question answering by reasoning across documents with graph convolutional networks. In Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-HLT), pages 2306–2317.
  • Yixin Cao, Lifu Huang, Heng Ji, Xu Chen, and Juanzi Li. 2017. Bridge text and knowledge by learning multi-prototype entity mention embedding. In Annual Meeting of the Association for Computational Linguistics (ACL), pages 1623–1633.
  • Yllias Chali and Sadid A. Hasan. 2012. Towards automatic topical question generation. In International Conference on Computational Linguistics (COLING), pages 475–492.
  • Kyunghyun Cho, Bart van Merrienboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. 2014. Learning phrase representations using RNN encoder-decoder for statistical machine translation. In Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1724–1734.
  • Timothy Dozat and Christopher D. Manning. 2017. Deep biaffine attention for neural dependency parsing. In International Conference on Learning Representations (ICLR).
  • Xinya Du and Claire Cardie. 2018. Harvesting paragraph-level question-answer pairs from Wikipedia. In Annual Meeting of the Association for Computational Linguistics (ACL), pages 1907–1917.
  • Xinya Du, Junru Shao, and Claire Cardie. 2017. Learning to ask: Neural question generation for reading comprehension. In Annual Meeting of the Association for Computational Linguistics (ACL), pages 1342–1352.
  • Nan Duan, Duyu Tang, Peng Chen, and Ming Zhou. 2017. Question generation for question answering. In Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 866–874.
  • Jiatao Gu, Zhengdong Lu, Hang Li, and Victor O. K. Li. 2016. Incorporating copying mechanism in sequence-to-sequence learning. In Annual Meeting of the Association for Computational Linguistics (ACL).
  • Michael Heilman. 2011. Automatic factual question generation from text. Language Technologies Institute, School of Computer Science, Carnegie Mellon University, 195.
  • Yichen Jiang and Mohit Bansal. 2019. Self-assembling modular networks for interpretable multi-hop reasoning. In Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 4473–4483.
  • Yanghoon Kim, Hwanhee Lee, Joongbo Shin, and Kyomin Jung. 2019. Improving neural question generation using answer separation. In AAAI Conference on Artificial Intelligence (AAAI), pages 6602–6609.
  • Diederik P. Kingma and Jimmy Ba. 2015. Adam: A method for stochastic optimization. In International Conference on Learning Representations (ICLR).
  • Alon Lavie and Abhaya Agarwal. 2007. METEOR: An automatic metric for MT evaluation with high levels of correlation with human judgments. In Proceedings of the Second Workshop on Statistical Machine Translation (WMT@ACL), pages 228–231.
  • Yujia Li, Daniel Tarlow, Marc Brockschmidt, and Richard S. Zemel. 2016. Gated graph sequence neural networks. In International Conference on Learning Representations (ICLR).
  • Chin-Yew Lin. 2004. ROUGE: A package for automatic evaluation of summaries. In Text Summarization Branches Out.
  • Xi Victoria Lin, Richard Socher, and Caiming Xiong. 2018. Multi-hop knowledge graph reasoning with reward shaping. In Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 3243–3253.
  • Bang Liu, Mingjun Zhao, Di Niu, Kunfeng Lai, Yancheng He, Haojie Wei, and Yu Xu. 2019a. Learning to generate questions by learning what not to generate. In International World Wide Web Conference (WWW), pages 1106–1118.
  • Jiangming Liu, Shay B. Cohen, and Mirella Lapata. 2019b. Discourse representation parsing for sentences and documents. In Annual Meeting of the Association for Computational Linguistics (ACL), pages 6248–6262.
  • Lluís Màrquez, Xavier Carreras, Kenneth C. Litkowski, and Suzanne Stevenson. 2008. Semantic role labeling: An introduction to the special issue. Computational Linguistics, 34(2):145–159.
  • Rik van Noord, Lasha Abzianidze, Antonio Toral, and Johan Bos. 2018. Exploring neural methods for parsing discourse representation structures. Transactions of the Association for Computational Linguistics (TACL), 6:619–633.
  • Liangming Pan, Wenqiang Lei, Tat-Seng Chua, and Min-Yen Kan. 2019. Recent advances in neural question generation. CoRR, abs/1905.08949.
  • Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. BLEU: A method for automatic evaluation of machine translation. In Annual Meeting of the Association for Computational Linguistics (ACL), pages 311–318.
  • Jeffrey Pennington, Richard Socher, and Christopher D. Manning. 2014. GloVe: Global vectors for word representation. In Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1532–1543.
  • Nazneen Fatema Rajani, Bryan McCann, Caiming Xiong, and Richard Socher. 2019. Explain yourself! Leveraging language models for commonsense reasoning. In Annual Meeting of the Association for Computational Linguistics (ACL), pages 4932–4942.
  • Pranav Rajpurkar, Jian Zhang, Konstantin Lopyrev, and Percy Liang. 2016. SQuAD: 100,000+ questions for machine comprehension of text. In Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 2383–2392.
  • Xingdi Yuan, Tong Wang, Caglar Gulcehre, Alessandro Sordoni, Philip Bachman, Saizheng Zhang, Sandeep Subramanian, and Adam Trischler. 2017. Machine comprehension by text-to-text neural question generation. In The 2nd Workshop on Representation Learning for NLP (Rep4NLP@ACL), pages 15–25.
  • Yao Zhao, Xiaochuan Ni, Yuanyuan Ding, and Qifa Ke. 2018. Paragraph-level neural question generation with maxout pointer and gated self-attention networks. In Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 3901–3910.
  • Qingyu Zhou, Nan Yang, Furu Wei, Chuanqi Tan, Hangbo Bao, and Ming Zhou. 2017. Neural question generation from text: A preliminary study. In CCF International Conference of Natural Language Processing and Chinese Computing (NLPCC), pages 662–671.
  • Abigail See, Peter J. Liu, and Christopher D. Manning. 2017. Get to the point: Summarization with pointer-generator networks. In Annual Meeting of the Association for Computational Linguistics (ACL), pages 1073–1083.
  • Peng Shi and Jimmy Lin. 2019. Simple BERT models for relation extraction and semantic role labeling. CoRR, abs/1904.05255.
  • Xingwu Sun, Jing Liu, Yajuan Lyu, Wei He, Yanjun Ma, and Shi Wang. 2018. Answer-focused and position-aware neural question generation. In Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 3930–3939.
  • Zhaopeng Tu, Zhengdong Lu, Yang Liu, Xiaohua Liu, and Hang Li. 2016. Modeling coverage for neural machine translation. In Annual Meeting of the Association for Computational Linguistics (ACL).
  • Petar Velickovic, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Lio, and Yoshua Bengio. 2017. Graph attention networks. CoRR, abs/1710.10903.
  • Xu Yang, Kaihua Tang, Hanwang Zhang, and Jianfei Cai. 2019. Auto-encoding scene graphs for image captioning. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 10685–10694.
  • Zhilin Yang, Peng Qi, Saizheng Zhang, Yoshua Bengio, William W. Cohen, Ruslan Salakhutdinov, and Christopher D. Manning. 2018. HotpotQA: A dataset for diverse, explainable multi-hop question answering. In Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 2369–2380.
Semantic graph construction
  • The primary task of semantic role labeling (SRL) is to indicate exactly what semantic relations hold among a predicate and its associated participants and properties (Màrquez et al., 2008). Given a document D with n sentences {s1, …, sn}, Algorithm 1 gives the detailed procedure for constructing the semantic graph based on SRL.
  • We first create an empty graph G = (V, E), where V and E are the node and edge sets, respectively. For each sentence s, we use the state-of-the-art BERT-based model (Shi and Lin, 2019) provided in the AllenNLP toolkit to perform SRL, resulting in a set of SRL tuples S. Each tuple t ∈ S consists of an argument a, a verb v, and (possibly) a modifier m, each of which is a text span of the
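A minimal sketch of the per-sentence construction loop described above, assuming the SRL tuples have already been extracted (e.g., by the AllenNLP SRL model). The function name and the simplified, unlabeled edge scheme are illustrative assumptions, not the paper's Algorithm 1:

```python
def build_srl_graph(srl_tuples):
    """Build a toy semantic graph from SRL tuples (a simplified sketch).

    Each tuple is an (argument, verb, modifier) triple of text spans, with
    modifier possibly None. Each span becomes a node; intra-tuple edges
    connect argument -> verb, and verb -> modifier when a modifier exists.
    The paper additionally labels edges with semantic roles and adds
    inter-tuple edges, which are omitted here."""
    nodes, edges = set(), set()
    for arg, verb, mod in srl_tuples:
        nodes.update({arg, verb})
        edges.add((arg, verb))
        if mod is not None:
            nodes.add(mod)
            edges.add((verb, mod))
    return nodes, edges

# Example: two tuples extracted from the airport paragraph.
nodes, edges = build_srl_graph([
    ("the airport", "located", "in Tafuna"),
    ("the airport", "known as", None),
])
```

Sharing the span string as the node key means repeated mentions ("the airport") collapse into a single node, which is what lets the graph connect information across tuples.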