Answer-focused and Position-aware Neural Question Generation

Xingwu Sun
Yajuan Lyu
Wei He

EMNLP, pp. 3930-3939, 2018.

Keywords: same question word ratio; extensive experiment; answer type-aware neural question generation; average precision; 16+ more

Abstract:

In this paper, we focus on the problem of question generation (QG). Recent neural network-based approaches employ the sequence-to-sequence model which takes an answer and its context as input and generates a relevant question as output. However, we observe two major issues with these approaches: (1) The generated interrogative words (or question words) do not match the answer type. (2) The model copies the context words that are far from and irrelevant to the answer, instead of the words that are close and relevant to the answer. To address these two issues, we propose an answer-focused and position-aware neural question generation model. Extensive experiments on SQuAD and MARCO show that the proposed model significantly improves the baseline and outperforms the state-of-the-art system.

Introduction
  • The task of question generation (QG) aims to generate questions for a given text, and it can benefit several real applications: (1) In the area of education, QG can help generate questions for reading comprehension materials (Du et al., 2017). (2) QG can enable the machine to actively ask questions in a dialogue system. (3) QG can aid in the development of question answering datasets (Duan et al., 2017).
  • The authors focus on the sub-task of surface-form realization of questions by assuming the targets are given.
  • The recent release of large-scale machine reading comprehension datasets, e.g. SQuAD (Rajpurkar et al., 2016) and MARCO (Nguyen et al., 2016), drives the development of neural question generation.
Highlights
  • The task of question generation (QG) aims to generate questions for a given text, and it can benefit several real applications: (1) In the area of education, QG can help generate questions for reading comprehension materials (Du et al., 2017). (2) QG can enable the machine to actively ask questions in a dialogue system. (3) QG can aid in the development of question answering datasets (Duan et al., 2017).
  • The authors observe two major issues with existing neural QG models: (1) the generated question word does not match the answer type; (2) the model copies context words that are far from and irrelevant to the answer, instead of words that are close and relevant to it, since the model is not aware of the positions of the context words.
  • For example, the baseline model copies “leading theory”, which is far away from and unrelated to the answer “homologous recombination”, but neglects the phrase “second theory” that is close and relevant to the answer.
  • To address these two issues, the authors propose an answer-focused and position-aware neural question generation model (a simplified sketch of the answer-focused question-word prediction follows this list).
  • The experimental results show that the combination of the proposed answer-focused model and position-aware model significantly improves the baseline and outperforms the state-of-the-art system.
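The answer-focused component predicts the question word from a small dedicated vocabulary (20 words, per Section 4.1) conditioned on the answer. Below is a minimal, hypothetical sketch of such a predictor; the layer names, sizes, and exact conditioning are illustrative assumptions, not the authors' precise architecture.

```python
# A minimal sketch (not the paper's exact architecture) of answer-focused
# question-word prediction: a softmax over a small question-word vocabulary
# (20 words, matching Section 4.1), conditioned on the decoder state and an
# answer encoding. All sizes and names are illustrative assumptions.
import torch
import torch.nn as nn

class QuestionWordPredictor(nn.Module):
    def __init__(self, hidden_size=512, answer_size=512, num_question_words=20):
        super().__init__()
        # Scores each candidate question word (what/when/why/...) from the
        # concatenation of the decoder state and the answer encoding.
        self.proj = nn.Linear(hidden_size + answer_size, num_question_words)

    def forward(self, decoder_state, answer_encoding):
        # decoder_state: (batch, hidden_size); answer_encoding: (batch, answer_size)
        logits = self.proj(torch.cat([decoder_state, answer_encoding], dim=-1))
        return torch.softmax(logits, dim=-1)  # distribution over question words

predictor = QuestionWordPredictor()
dist = predictor(torch.randn(2, 512), torch.randn(2, 512))
print(dist.shape)  # torch.Size([2, 20])
```

Conditioning on an encoding of the answer span is what lets the predicted question word track the answer type (e.g. a date answer favoring "when").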
Methods
  • 4.1 Experiment Settings

    Dataset: The authors conduct experiments on SQuAD and MARCO.
  • In SQuAD, there are 86,635, 8,965 and 8,964 question-answer pairs in the training, development and test sets, respectively.
  • In MARCO, there are 74,097, 4,539 and 4,539 question-answer pairs in the training, development and test sets, respectively.
  • The authors use Stanford CoreNLP to extract lexical features.
  • The vocabulary contains the most frequent 20,000 words in each training set.
  • The vocabulary of question words contains 20 words.
  • The representations of the answer-position feature and the lexical features are concatenated with the word embeddings at the embedding layer of the encoder (a minimal sketch follows this list).
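A minimal sketch of such a feature-rich embedding layer, assuming BIO answer-position tags and POS/NER lexical features from Stanford CoreNLP; all dimensions and tag-set sizes here are illustrative assumptions rather than the paper's reported settings.

```python
# Sketch of a feature-rich embedding layer: word embeddings concatenated with
# an answer-position (BIO) feature and lexical features (POS and NER tags).
# Vocabulary sizes and dimensions are illustrative, not the paper's settings.
import torch
import torch.nn as nn

class FeatureRichEmbedding(nn.Module):
    def __init__(self, vocab_size=20000, word_dim=300,
                 pos_tags=50, ner_tags=20, feat_dim=16):
        super().__init__()
        self.word = nn.Embedding(vocab_size, word_dim)
        self.answer_pos = nn.Embedding(3, feat_dim)   # B/I/O answer-position tag
        self.pos = nn.Embedding(pos_tags, feat_dim)   # part-of-speech tag
        self.ner = nn.Embedding(ner_tags, feat_dim)   # named-entity tag

    def forward(self, words, answer_bio, pos, ner):
        # Each input: (batch, seq_len) integer ids; concatenate all features.
        return torch.cat([self.word(words), self.answer_pos(answer_bio),
                          self.pos(pos), self.ner(ner)], dim=-1)

emb = FeatureRichEmbedding()
ids = torch.zeros(2, 10, dtype=torch.long)
x = emb(ids, ids, ids, ids)
print(x.shape)  # torch.Size([2, 10, 348])
```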
Results
  • As the authors discuss in Section 1, one major issue with the neural question generation model is that the generated question word does not match the answer type.
  • It is expected that the answer-focused model can reduce such errors.
  • For the case in Table 1, the answer-focused model correctly predicts the question word, though it still copies wrong context words, which are further corrected by the hybrid model; the paper compares the outputs of both models for this case (a sketch of the hybrid model's position-aware attention follows this list).
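The position-aware part of the hybrid model biases copying toward words near the answer. The following is a minimal sketch in the spirit of position-aware attention (cf. Zhang et al., 2017): the signed word distance to the answer span is embedded and fed into the attention scorer. Names and sizes are illustrative assumptions.

```python
# Minimal sketch of position-aware attention: each context word's relative
# distance to the answer is embedded and added to the attention scoring, so
# words close to the answer can receive higher weight. Illustrative only.
import torch
import torch.nn as nn

class PositionAwareAttention(nn.Module):
    def __init__(self, hidden_size=512, max_dist=100, dist_dim=32):
        super().__init__()
        self.dist_emb = nn.Embedding(2 * max_dist + 1, dist_dim)
        self.score = nn.Linear(2 * hidden_size + dist_dim, 1)
        self.max_dist = max_dist

    def forward(self, decoder_state, encoder_states, distances):
        # decoder_state: (batch, hidden); encoder_states: (batch, seq, hidden)
        # distances: (batch, seq) signed word distance to the answer span
        d = self.dist_emb(distances.clamp(-self.max_dist, self.max_dist)
                          + self.max_dist)
        q = decoder_state.unsqueeze(1).expand(-1, encoder_states.size(1), -1)
        scores = self.score(torch.cat([q, encoder_states, d], dim=-1)).squeeze(-1)
        return torch.softmax(scores, dim=-1)  # attention over context words

attn = PositionAwareAttention()
w = attn(torch.randn(2, 512), torch.randn(2, 7, 512),
         torch.randint(-5, 6, (2, 7)))
print(w.shape)  # torch.Size([2, 7])
```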
Conclusion
  • The authors identify two major issues with the existing neural question generation model.
  • To tackle the two issues, they propose an answer-focused and position-aware model.
  • They further conduct extensive experiments on the SQuAD and MARCO datasets.
  • The experimental results show that the combination of the proposed answer-focused model and position-aware model significantly improves the baseline and outperforms the state-of-the-art system (a sketch of how such a combination can mix its output distributions follows below).
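One plausible way the combination can be realized at decoding time is to mix the question-word distribution, the generation distribution, and the copy distribution with a learned gate. The sketch below uses a generic softmax gate over the three sources; this gating scheme and all names are assumptions for illustration, not necessarily the paper's exact formulation.

```python
# Sketch of a hybrid decoder mixing three output sources -- the question-word
# vocabulary, the general vocabulary, and copying from the context -- via a
# learned softmax gate. Generic formulation; sizes are illustrative.
import torch
import torch.nn as nn

class OutputMixer(nn.Module):
    def __init__(self, hidden_size=512, num_sources=3):
        super().__init__()
        # One mixture weight per source (question word / generate / copy).
        self.gate = nn.Linear(hidden_size, num_sources)

    def forward(self, decoder_state, p_qword, p_gen, p_copy):
        # Each p_* is a (batch, vocab) distribution already projected onto a
        # shared extended vocabulary; the gate decides how to mix them.
        w = torch.softmax(self.gate(decoder_state), dim=-1)  # (batch, 3)
        return w[:, 0:1] * p_qword + w[:, 1:2] * p_gen + w[:, 2:3] * p_copy

mixer = OutputMixer()
v = 100  # shared extended vocabulary size (illustrative)
p = mixer(torch.randn(2, 512),
          torch.softmax(torch.randn(2, v), -1),
          torch.softmax(torch.randn(2, v), -1),
          torch.softmax(torch.randn(2, v), -1))
print(p.shape, p.sum(-1))  # torch.Size([2, 100]); each row sums to ~1.0
```

Because the gate weights sum to one and each source is itself a distribution, the mixture remains a valid probability distribution over the extended vocabulary.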
Tables
  • Table 1: A bad case where the generated question word does not match the answer type. A when-question should be triggered for the answer “the end of the Mexican War”, while a why-question is generated by the baseline.
  • Table 2: A bad case where the model copies context words far away from and irrelevant to the answer. The baseline copies “leading theory”, which is far away from and unrelated to the answer “homologous recombination”, but neglects the phrase “second theory” that is close and relevant to the answer.
  • Table 3: The main experimental results of the baselines, the answer-focused model, the position-aware model and a hybrid model on SQuAD and MARCO.
  • Table 4: The answer-focused model has the highest same question word ratio.
  • Table 5: A bad case of the baseline that remains unresolved by the answer-focused model, because the answer type is closely tied to the context word “because” rather than to the answer itself, and “because” is far from the answer; thus the encoding of the answer has little memory of “because”.
  • Table 6: The position-aware model significantly improves the average precision and recall of copied OOV words (a sketch of this metric follows this list).
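A hypothetical reading of the Table 6 metric: for each example, compute the precision and recall of the copied out-of-vocabulary words against the OOV words appearing in the reference question, then average over examples. The paper's exact protocol may differ; this helper is only illustrative.

```python
# Hypothetical sketch of per-example copy precision/recall for OOV words:
# compare the set of OOV words copied into the generated question against
# the OOV words of the reference question. Illustrative, not the paper's
# exact evaluation protocol.

def copy_precision_recall(copied: set, reference_oov: set):
    """Precision/recall of copied out-of-vocabulary words for one example."""
    if not copied or not reference_oov:
        return 0.0, 0.0
    hits = len(copied & reference_oov)
    return hits / len(copied), hits / len(reference_oov)

p, r = copy_precision_recall({"recombination", "theory"},
                             {"recombination", "homologous"})
print(p, r)  # 0.5 0.5
```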
Related work
  • Question Generation: Previous work on QG can be classified into two categories: rule-based and neural network-based. Regardless of the approach taken, QG usually includes two sub-tasks: (1) what to say, i.e. selecting the targets that should be asked about; (2) how to say it, i.e. formulating the structure of the question and producing the surface realization. This is similar to other natural language generation tasks. In this paper, we focus on the second sub-task, i.e. surface-form realization of questions, by assuming the targets are given.

    The rule-based approaches usually include the following steps: (1) Preprocess the given text by applying natural language processing techniques, including syntactic parsing, sentence simplification and semantic role labeling. (2) Identify the targets that should be asked about by using rules or semantic roles. (3) Generate questions using transformation rules or templates. (4) Rank the over-generated questions by well-designed features (Heilman and Smith, 2009, 2010; Chali and Hasan, 2015). The major drawbacks of rule-based approaches include: (1) they rely on rules or templates that are expensive to create manually; (2) the rules or templates lack diversity; (3) the targets they can deal with are limited.
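As a toy illustration of steps (2) and (3) of this pipeline, a template-based generator can map an answer's entity type to a question template and fill it in. Real systems such as Heilman and Smith (2010) use syntactic transformations and feature-based ranking; this simple lookup is only illustrative, and the templates below are invented for the example.

```python
# Toy template-based question generation: pick a template by the answer's
# entity type and fill in the predicate. Illustrative only; real rule-based
# systems apply syntactic transformation rules and rank candidates.

TEMPLATES = {
    "PERSON": "Who {predicate}?",
    "DATE": "When did {predicate}?",
    "LOCATION": "Where did {predicate}?",
}

def generate_question(answer_type: str, predicate: str) -> str:
    template = TEMPLATES.get(answer_type, "What {predicate}?")
    return template.format(predicate=predicate)

print(generate_question("DATE", "the Mexican War end"))
# -> "When did the Mexican War end?"
```

Note how brittle this is: the template must match the answer type exactly, which is the same type-mismatch failure the answer-focused model targets in the neural setting (cf. Table 1).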
Funding
  • This work is supported by the National Basic Research Program of China (973 Program, No. 2014CB340505).
References
  • Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2014. Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473.
  • Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. Bleu: a method for automatic evaluation of machine translation. In Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, pages 311–318. Association for Computational Linguistics.
  • Jeffrey Pennington, Richard Socher, and Christopher D. Manning. 2014. GloVe: Global vectors for word representation. In Empirical Methods in Natural Language Processing (EMNLP), pages 1532–1543.
  • Pranav Rajpurkar, Jian Zhang, Konstantin Lopyrev, and Percy Liang. 2016. SQuAD: 100,000+ questions for machine comprehension of text. arXiv preprint arXiv:1606.05250.
  • Abigail See, Peter J. Liu, and Christopher D. Manning. 2017. Get to the point: Summarization with pointer-generator networks. arXiv preprint arXiv:1704.04368.
  • Iulian Vlad Serban, Alberto García-Durán, Caglar Gulcehre, Sungjin Ahn, Sarath Chandar, Aaron Courville, and Yoshua Bengio. 2016. Generating factoid questions with recurrent neural networks: The 30M factoid question-answer corpus. arXiv preprint arXiv:1603.06807.
  • Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov. 2014. Dropout: A simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research, 15(1):1929–1958.
  • Ilya Sutskever, Oriol Vinyals, and Quoc V. Le. 2014. Sequence to sequence learning with neural networks. In Advances in Neural Information Processing Systems, pages 3104–3112.
  • Duyu Tang, Nan Duan, Tao Qin, and Ming Zhou. 2017. Question answering and question generation as dual tasks. arXiv preprint arXiv:1706.02027.
  • Daojian Zeng, Kang Liu, Siwei Lai, Guangyou Zhou, and Jun Zhao. 2014. Relation classification via convolutional deep neural network. In Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers, pages 2335–2344.
  • Yuhao Zhang, Victor Zhong, Danqi Chen, Gabor Angeli, and Christopher D. Manning. 2017. Position-aware attention and supervised data improve slot filling. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 35–45.
  • Qingyu Zhou, Nan Yang, Furu Wei, Chuanqi Tan, Hangbo Bao, and Ming Zhou. 2017. Neural question generation from text: A preliminary study. In National CCF Conference on Natural Language Processing and Chinese Computing, pages 662–671. Springer.