Variational Reasoning for Question Answering with Knowledge Graph
national conference on artificial intelligence, 2018.
Knowledge graph (KG) is known to be helpful for the task of question answering (QA), since it provides well-structured relational information between entities, and allows one to further infer indirect facts. However, it is challenging to build QA systems which can learn to reason over knowledge graphs based on question-answer pairs alone....
- Question answering (QA) has been a long-standing research problem in Machine Learning and Artificial Intelligence.
- When the answer is not a direct neighbor of the topic entity in question, which requires logic reasoning over the KG, the neural approaches usually perform poorly.
- There are very few explicit annotations of the exact entity present in the question, the type of the questions, and the exact logic reasoning steps along the knowledge graph leading to the answer.
- Thanks to the creation of large-scale knowledge graphs such as DBPedia (Auer et al 2007) and Freebase (Bollacker et al 2008), question answering systems can be armed with well-structured knowledge on specific and open domains.
- Many traditional approaches for knowledge graph-powered question answering are based on semantic parsers (Clarke et al 2010; Liang, Jordan, and Klein 2011; Berant et al 2013; Yih et al 2015), which first map a question to a formal meaning representation and then translate it to a knowledge graph query.
- With the recent success of deep learning, some end-to-end solutions based on neural networks have been proposed and show very promising performance on benchmark datasets.
- Logic reasoning on the knowledge graph is required for multi-hop questions such as “Who have co-authored papers with ...?”
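To make the multi-hop idea concrete, the co-authorship question can be read as a two-step walk over the graph: from a person to their papers, then from those papers back to people. The sketch below is only an illustration of explicit graph traversal under simple assumptions; the triples, the `_inv` suffix for reversed relations, and the function names are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict

# Toy knowledge graph as (head, relation, tail) triples; all names are hypothetical.
TRIPLES = [
    ("alice", "authored", "paper_1"),
    ("bob", "authored", "paper_1"),
    ("bob", "authored", "paper_2"),
    ("carol", "authored", "paper_2"),
]


def build_adjacency(triples):
    """Index triples in both directions so relations can be walked either way."""
    adj = defaultdict(set)
    for head, relation, tail in triples:
        adj[head].add((relation, tail))
        adj[tail].add((relation + "_inv", head))  # assumed naming for the reverse edge
    return adj


def follow_path(adj, start, relations):
    """Return the entities reached from `start` by following `relations` in order."""
    frontier = {start}
    for relation in relations:
        frontier = {t for e in frontier for (r, t) in adj[e] if r == relation}
    return frontier


adj = build_adjacency(TRIPLES)
# "Who has co-authored papers with alice?" is a 2-hop path: authored, then authored_inv.
coauthors = follow_path(adj, "alice", ["authored", "authored_inv"]) - {"alice"}
print(coauthors)  # prints {'bob'}
```

Here the relation path is given by hand. The difficulty the summary describes is that a QA system must infer such a path from question-answer pairs alone.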
- Vanilla: Since all the topic entities are labeled, Vanilla mainly evaluates the ability of logic reasoning
- Note that fine-grained annotation is not present, such as the exact entity present in the question, question type, or the exact logic reasoning steps along the knowledge graph leading to the answer.
- A QA system with KG should be able to handle noisy entities in questions and learn multi-hop reasoning directly from question-answer pairs.
- Suppose the question is embedded using a neural network $f_{qt}(\cdot) : q \to \mathbb{R}^d$, which captures the question type and implies the type of logic reasoning the authors need to perform over the knowledge graph.
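As a rough illustration of how a question embedding in R^d could be used to rank candidate answers, the sketch below averages word vectors to stand in for the question encoder and scores each candidate with a dot product followed by a softmax. Everything here is an assumption made for illustration: the vocabulary, the untrained random vectors, the averaging encoder, and the dot-product scoring are placeholders, not the network described in the paper.

```python
import math
import random

D = 4  # embedding dimension d (arbitrary for this sketch)
rng = random.Random(0)

# Hypothetical vocabulary with randomly initialised, untrained word vectors.
VOCAB = ["who", "acted", "in", "directed", "the", "movie"]
WORD_VECS = {w: [rng.gauss(0.0, 1.0) for _ in range(D)] for w in VOCAB}


def f_qt(question):
    """Embed a question as the mean of its known word vectors: q -> R^d."""
    vecs = [WORD_VECS[w] for w in question.lower().split() if w in WORD_VECS]
    if not vecs:
        return [0.0] * D
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def answer_distribution(question, candidate_vecs):
    """Score each candidate by a dot product with the question embedding."""
    q = f_qt(question)
    names = list(candidate_vecs)
    scores = [sum(a * b for a, b in zip(q, candidate_vecs[n])) for n in names]
    return dict(zip(names, softmax(scores)))


# Hypothetical embeddings of the reasoning paths leading to two candidate answers.
candidates = {
    "answer_a": [rng.gauss(0.0, 1.0) for _ in range(D)],
    "answer_b": [rng.gauss(0.0, 1.0) for _ in range(D)],
}
probs = answer_distribution("who directed the movie", candidates)
print(probs)
```

In a trained system both the question encoder and the candidate embeddings would be learned jointly, so that questions of the same type score highly against the same kind of reasoning path.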
- There is an existing public QA dataset named WikiMovies, which consists of question-answer pairs in the domain of movies and provides a medium-sized knowledge graph (Miller, Fisch, and et. al. 2016).
- However, WikiMovies has several limitations: 1) it is not able to evaluate the ability of reasoning; 2) there is no noise on the topic entity in the question, so it can be located in the knowledge graph; 3) it is generated from a very limited number of text templates, which models can easily exploit and which limits its practical value.
- Baseline methods: 1) Miller, Fisch, and et. al. (2016) proposed Key-Value Memory Networks (KV-MemNN) and reported state-of-the-art results at that time on WikiMovies; 2) Bordes, Chopra, and Weston’s QA system tries to embed the inference subgraph for reasoning (Bordes, Chopra, and Weston 2014), but the representation is an unordered bag of relationships and neighbor entities; 3) the “supervised embedding” is considered as yet another baseline method, a simple approach that often works surprisingly well, as reported in (Dodge et al 2015).
- Vanilla-EU: Without topic entity labels, all reasoning-based methods are getting worse on multi-hop questions.
- Supervised embedding gets better in this case, since it just learns to remember the pair of question and answer entities.
- Since the framework uses a variational method to jointly learn the entity recognizer and the reasoning graph embedding, the authors perform a model ablation to answer two questions, the first being: is the reasoning graph embedding approach necessary for inference?
- Importance of reasoning graph embedding: As the results in Table 1 show, the proposed VRN outperforms all the other baselines, especially in the 3-hop setting.
- Table 1: Test results (% hits@1) on Vanilla and Vanilla-EU datasets. EU stands for entity unlabeled.
- Table 2: Test results (% hits@1) on NTM-EU and Audio-EU datasets. EU stands for entity unlabeled.
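Both tables report % hits@1, which is commonly computed as the percentage of questions whose top-ranked prediction is among the correct answers. A minimal sketch of that computation follows; the function name and the toy predictions are hypothetical, and it assumes each question may have several acceptable gold answers.

```python
def hits_at_1(ranked_predictions, gold_answers):
    """Percentage of questions whose top-ranked prediction is a correct answer."""
    if len(ranked_predictions) != len(gold_answers):
        raise ValueError("predictions and gold answers must align")
    if not ranked_predictions:
        return 0.0
    hits = sum(
        1
        for ranked, gold in zip(ranked_predictions, gold_answers)
        if ranked and ranked[0] in gold  # an empty ranking counts as a miss
    )
    return 100.0 * hits / len(ranked_predictions)


# Hypothetical model outputs: one ranked candidate list per question.
predictions = [["bob", "carol"], ["paper_2", "paper_1"], ["alice"], []]
gold = [{"bob"}, {"paper_1"}, {"alice", "carol"}, {"bob"}]
score = hits_at_1(predictions, gold)
print(f"hits@1 = {score:.1f}%")  # prints hits@1 = 50.0%
```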
- QA with semantic parser: Most traditional approaches for KG-powered QA are based on semantic parsers, which map the question to a certain meaning representation or logical form (Clarke et al 2010; Liang, Jordan, and Klein 2011; Kwiatkowski et al 2013; Berant et al 2013; Yih et al 2015; Marx et al 2014; Höffner et al 2016), or directly map the question to an executable program (Liang et al 2016). These approaches require domain-specific grammars, rules, or fine-grained annotations. Also, they are not designed to handle noisy questions, and do not support end-to-end training since they use separate stages for question parsing and logic reasoning.
- Neural approaches for QA: The family of memory networks achieves state-of-the-art performance in various kinds of QA tasks. Some of them are able to do reasoning within local context (Kumar, Irsoy, and et. al. 2015; Sukhbaatar et al 2015) using an attention mechanism (Yang et al 2015). For QA with KG, Miller, Fisch, and et. al. (2016) achieve state-of-the-art performance, outperforming previous works (Bordes, Chopra, and Weston 2014; Weston, Chopra, and Bordes 2014) on benchmark datasets. Recent work (Neelakantan et al 2016) uses a neural programmer model for QA with a single knowledge table. However, the multi-hop reasoning capability of these approaches depends on recurrent attentions and there is no explicit traversal over the KG.
- Graph embedding: Recently, researchers have built deep architectures to embed structured data, such as trees (Socher et al 2013b; Irsoy and Cardie 2014; Mou et al 2016) or graphs (Duvenaud et al 2015; Dai, Dai, and Song 2016; Atwood and Towsley 2016). Some works (Li et al 2015; Johnson 2017) also extend this to the sequential case, such as multi-step reasoning. However, these approaches only work on small instances like sentences or molecules. Instead, our work embeds the reasoning-graph from the source entity to every target entity in a large-scale knowledge graph.
- Multi-hop reasoning: There are some other works on knowledge graph completion with traversal, which requires path sampling (Guu, Miller, and Liang 2015; Neelakantan, Roth, and McCallum 2015) or dynamic programming (Toutanova et al 2016). Our work can handle QA with natural language or human speech, and the reasoning-graph embeddings can represent complicated reasoning rules.
- This project was supported in part by NSF IIS-1218749, NIH BIGDATA 1R01GM108341, NSF CAREER IIS-1350983, NSF IIS-1639792 EAGER, NSF CNS-1704701, ONR N00014-15-1-2340, Intel ISTC, NVIDIA and Amazon AWS
- Atwood, J., and Towsley, D. 2016. Diffusion-convolutional neural networks. In NIPS.
- Auer, S.; Bizer, C.; Kobilarov, G.; Lehmann, J.; Cyganiak, R.; and Ives, Z. 2007. Dbpedia: A nucleus for a web of open data. The semantic web.
- Berant, J.; Chou, A.; Frostig, R.; and Liang, P. 2013. Semantic parsing on freebase from question-answer pairs. In EMNLP.
- Bollacker, K.; Evans, C.; Paritosh, P.; Sturge, T.; and Taylor, J. 2008. Freebase: a collaboratively created graph database for structuring human knowledge. In SIGMOD.
- Bordes, A.; Chopra, S.; and Weston, J. 2014. Question answering with subgraph embeddings. arXiv preprint arXiv:1406.3676.
- Clarke, J.; Goldwasser, D.; Chang, M.-W.; and Roth, D. 2010. Driving semantic parsing from the world’s response. In Proceedings of the fourteenth conference on computational natural language learning.
- Dai, H.; Dai, B.; and Song, L. 2016. Discriminative embeddings of latent variable models for structured data. In ICML.
- Dodge, J.; Gane, A.; Zhang, X.; Bordes, A.; Chopra, S.; Miller, A.; Szlam, A.; and Weston, J. 2015. Evaluating prerequisite qualities for learning end-to-end dialog systems. arXiv.
- Dong, X.; Gabrilovich, E.; Heitz, G.; Horn, W.; Lao, N.; Murphy, K.; Strohmann, T.; Sun, S.; and Zhang, W. 2014. Knowledge vault: A web-scale approach to probabilistic knowledge fusion. In SIGKDD.
- Duvenaud, D. K.; Maclaurin, D.; Iparraguirre, J.; Bombarell, R.; Hirzel, T.; Aspuru-Guzik, A.; and Adams, R. P. 2015. Convolutional networks on graphs for learning molecular fingerprints. In NIPS.
- Guu, K.; Miller, J.; and Liang, P. 2015. Traversing knowledge graphs in vector space. arXiv preprint arXiv:1506.01094.
- He, D.; Xia, Y.; Qin, T.; Wang, L.; Yu, N.; Liu, T.; and Ma, W.-Y. 2016. Dual learning for machine translation. In NIPS.
- Höffner, K.; Walter, S.; Marx, E.; Usbeck, R.; Lehmann, J.; and Ngonga Ngomo, A.-C. 2016. Survey on challenges of question answering in the semantic web. Semantic Web (Preprint):1–26.
- Irsoy, O., and Cardie, C. 2014. Deep recursive neural networks for compositionality in language. In NIPS.
- Johnson, D. D. 2017. Learning graphical state transitions. In ICLR.
- Kumar, A.; Irsoy, O.; and et. al. 2015. Ask me anything: Dynamic memory networks for natural language processing. arXiv preprint arXiv:1506.07285.
- Kwiatkowski, T.; Choi, E.; Artzi, Y.; and Zettlemoyer, L. 2013. Scaling semantic parsers with on-the-fly ontology matching. In EMNLP.
- Li, Y.; Tarlow, D.; Brockschmidt, M.; and Zemel, R. 2015. Gated graph sequence neural networks. arXiv preprint arXiv:1511.05493.
- Liang, C.; Berant, J.; Le, Q.; Forbus, K. D.; and Lao, N. 2016. Neural symbolic machines: Learning semantic parsers on freebase with weak supervision. arXiv preprint arXiv:1611.00020.
- Liang, P.; Jordan, M. I.; and Klein, D. 2011. Learning dependency-based compositional semantics. In ACL.
- Marx, E.; Usbeck, R.; Ngomo, A.-C. N.; and et. al. 2014. Towards an open question answering architecture. In ICSS.
- Miller, A.; Fisch, A.; and et. al. 2016. Key-value memory networks for directly reading documents. arXiv preprint arXiv:1606.03126.
- Mnih, A., and Gregor, K. 2014. Neural variational inference and learning in belief networks. arXiv preprint arXiv:1402.0030.
- Mou, L.; Li, G.; Zhang, L.; Wang, T.; and Jin, Z. 2016. Convolutional neural networks over tree structures for programming language processing. In AAAI.
- Neelakantan, A.; Le, Q. V.; Abadi, M.; McCallum, A.; and Amodei, D. 2016. Learning a natural language interface with neural programmer. arXiv preprint arXiv:1611.08945.
- Neelakantan, A.; Roth, B.; and McCallum, A. 2015. Compositional vector space models for knowledge base completion. arXiv preprint arXiv:1504.06662.
- Pasupat, P., and Liang, P. 2015. Compositional semantic parsing on semi-structured tables. arXiv preprint arXiv:1508.00305.
- Rao, D.; McNamee, P.; and Dredze, M. 2013. Entity linking: Finding extracted entities in a knowledge base. In Multi-source, multilingual information extraction and summarization. Springer.
- Socher, R.; Chen, D.; Manning, C. D.; and Ng, A. 2013a. Reasoning with neural tensor networks for knowledge base completion. In NIPS.
- Socher, R.; Perelygin, A.; Wu, J. Y.; Chuang, J.; Manning, C. D.; Ng, A. Y.; and Potts, C. 2013b. Recursive deep models for semantic compositionality over a sentiment treebank. In EMNLP.
- Sukhbaatar, S.; Weston, J.; Fergus, R.; et al. 2015. End-to-end memory networks. In NIPS.
- Toutanova, K.; Lin, X. V.; Yih, W.-t.; Poon, H.; and Quirk, C. 2016. Compositional learning of embeddings for relation paths in knowledge bases and text. In ACL.
- Weston, J.; Chopra, S.; and Bordes, A. 2014. Memory networks. arXiv preprint arXiv:1410.3916.
- Williams, R. J. 1992. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning.
- Yang, Y., and Chang, M.-W. 2016. S-mart: Novel tree-based structured learning algorithms applied to tweet entity linking. arXiv preprint arXiv:1609.08075.
- Yang, Z.; He, X.; Gao, J.; Deng, L.; and Smola, A. 2015. Stacked attention networks for image question answering. arXiv preprint arXiv:1511.02274.
- Yih, W.-t.; Chang, M.-W.; He, X.; and Gao, J. 2015. Semantic parsing via staged query graph generation: Question answering with knowledge base. In ACL.