Automated Template Generation for Question Answering over Knowledge Graphs

    WWW, pp. 1191-1200, 2017.

    Keywords:
    integer linear programming, RDF data, question-answer pairs, question answering, structured queries

    Abstract:

    Templates are an important asset for question answering over knowledge graphs, simplifying the semantic parsing of input utterances and generating structured queries for interpretable answers. State-of-the-art methods rely on hand-crafted templates with limited coverage. This paper presents QUINT, a system that automatically learns utterance-query templates from question-answer pairs.

    Introduction
    • Templates play an important role in question answering (QA) over knowledge graphs (KGs), where user utterances are translated to structured queries via semantic parsing [4, 35, 41].
    • Each template i) specifies how to chunk an utterance into phrases, ii) guides how these phrases map to KG primitives by specifying their semantic roles as predicates or entities, and iii) aligns syntactic structure in the utterance to the semantic predicate-argument structure of the query; the sketch below makes these three roles concrete.
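    A minimal sketch of this template anatomy, under stated assumptions: the regex pattern, the lookup tables (rough stand-ins for the learned lexicons of Table 2), and the instantiate function are hypothetical illustrations of the three roles above, not QUINT's actual representation, which is learned automatically rather than hand-written.

```python
import re

# i) Chunking: the surface pattern splits the utterance into a predicate
#    phrase and an entity phrase.
# ii)+iii) Role mapping and alignment: each captured phrase fills the slot
#    with the matching semantic role in a triple pattern query.
TEMPLATE = {
    "pattern": re.compile(r"^who (?P<pred>\w+) the (?P<ent>[\w ]+)\?$",
                          re.IGNORECASE),
    "query": "SELECT ?x WHERE {{ ?x <{pred}> <{ent}> }}",
}

# Toy lookup tables standing in for learned lexicons (cf. LP and LC, Table 2).
PRED_LEXICON = {"invented": "kg:invented"}
ENT_LEXICON = {"internet": "kg:Internet"}

def instantiate(utterance):
    """Chunk the utterance, map its phrases to KG items, build the query."""
    m = TEMPLATE["pattern"].match(utterance)
    if not m:
        return None
    pred = PRED_LEXICON.get(m.group("pred").lower())
    ent = ENT_LEXICON.get(m.group("ent").lower())
    if pred is None or ent is None:
        return None
    return TEMPLATE["query"].format(pred=pred, ent=ent)

print(instantiate("Who invented the internet?"))
# SELECT ?x WHERE { ?x <kg:invented> <kg:Internet> }
```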
    Highlights
    • Templates play an important role in question answering (QA) over knowledge graphs (KGs), where user utterances are translated to structured queries via semantic parsing [4, 35, 41]
    • Recent work has looked at a different setup combining knowledge graphs with additional textual resources for answering questions
    • Kun Xu et al [40] and Savenkov et al [29] use Wikipedia and Web search results combined with community question answering data, respectively
    • We presented a method for automatically generating templates that map a question to a triple pattern query over a knowledge graph; a sketch of the learning idea follows below
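    A hedged sketch of that generation step, under strong simplifications: given one question-answer pair and a toy KG, it finds the triple connecting a question entity to the answer and generalizes both sides into a template. The KG triples, the pre-grounded entity phrase, and learn_template are all hypothetical; per the paper's keywords, the real alignment between phrases and KG items is solved with integer linear programming, which this sketch replaces with exact ID matching.

```python
# Toy KG of (subject, predicate, object) triples -- purely illustrative.
KG = {
    ("kg:Internet", "kg:invented_by", "kg:Vint_Cerf"),
}

def learn_template(question, entity_phrase, entity_id, answer_id):
    """Generalize one QA pair into an (utterance template, query template)."""
    for s, p, o in KG:
        # Look for a triple linking the question entity to the known answer.
        if s == entity_id and o == answer_id:
            # Replace the grounded phrase with a slot; the predicate stays
            # bound to the KG relation that explained the answer.
            utt_template = question.replace(entity_phrase, "<ENT>")
            query_template = f"SELECT ?x WHERE {{ <ENT> <{p}> ?x }}"
            return utt_template, query_template
    return None

print(learn_template("Who invented the internet?",
                     "internet", "kg:Internet", "kg:Vint_Cerf"))
# ('Who invented the <ENT>?', 'SELECT ?x WHERE { <ENT> <kg:invented_by> ?x }')
```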
    Methods
    • Comparison of systems on the WebQuestions (average F1) and Free917 (accuracy) test sets: Cai and Yates [10] (2013); Berant et al [4] (2013); Kwiatkowski et al [19] (2013); Yao and Van Durme [46] (2014); Berant and Liang [5] (2014); Bao et al [2] (2014); Bordes et al [8] (2014); Yao [45] (2015); Dong et al [12] (2015); Bast and Haussmann [3] (2015). Per-system scores are given in Table 4.
    Results
    • Kun Xu et al [40] and Savenkov et al [29] use Wikipedia and Web search results combined with community question answering data, respectively.
    • These systems achieve slightly higher F1 scores than QUINT when these external resources are available (53.3 and 52.2, respectively), but their performance drops without them (47.1 and 49.4, respectively).
    • To guarantee a fair comparison, the authors ran two variants of this system: Bast and Haussmann-basic, the officially published system, and Bast and Haussmann++, where the authors i) manually decompose each complex question into its constituent sub-questions, ii) answer each sub-question using Bast and Haussmann-basic, and iii) run the stitching mechanism on the answer sets of the sub-questions to answer the complete question (a sketch of this pipeline follows below)
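    A sketch of steps i)-iii) under stated assumptions: answer_basic is a canned placeholder for the published Bast and Haussmann-basic system, the decomposition is supplied by hand as in step i), and set intersection is used as the simplest plausible stitching operator for conjunctive questions; the actual stitching mechanism is not specified in this extract.

```python
def answer_basic(sub_question):
    """Placeholder for Bast and Haussmann-basic: sub-question -> answer set."""
    canned = {
        "who acted in The Departed?": {"Matt Damon", "Leonardo DiCaprio"},
        "who was born in Boston?": {"Matt Damon", "Ben Affleck"},
    }
    return canned.get(sub_question, set())

def stitch(answer_sets):
    """Step iii): combine sub-question answer sets; intersection handles
    conjunctive questions (actors in The Departed who were born in Boston)."""
    result = set(answer_sets[0])
    for s in answer_sets[1:]:
        result &= s
    return result

# Steps i)-ii): manual decomposition, then answer each sub-question.
sub_questions = ["who acted in The Departed?", "who was born in Boston?"]
print(stitch([answer_basic(q) for q in sub_questions]))
# {'Matt Damon'}
```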
    Conclusion
    • A detailed analysis of the results on the WebQuestions test set reveals that of the 2032 test questions, 3 could not be matched to any of the templates.
    • Another 33 test questions were matched to a template, but no query candidates were generated.
    • The authors' approach can be used to answer compositional questions despite never having seen such questions at training time
    Tables
    • Table 1: Curated templates by Fader et al [13]. Shared pred and ent annotations indicate an alignment between a phrase in the utterance template and a KG semantic item in the corresponding query template
    • Table 2: Fragment of our lexicons: LP and LC
    • Table 3: Query candidate ranking features
    • Table 4: Results on the WebQuestions and Free917 test sets
    • Table 5: Anecdotal results from WebQuestions for both variants of QUINT: typed and untyped
    • Table 6: Sample ComplexQuestions for both QUINT and Bast and Haussmann-basic
    • Table 7: Results on ComplexQuestions
    Related work
    • With traditional Web search over textual corpora reaching maturity, there has been a shift towards semantic search focusing on entities, and more recently on relationships. This shift was brought on by an explosion of structured [6] and semi-structured data [14].

      Entity search over KGs has gained wide attention [1, 7, 17, 32, 33]. These are keyword queries asking for entities (or other resources), and have been shown to be very common [26].

      More recent efforts have focused on natural language questions as an interface for knowledge graphs. Questions express complex relation-centric information needs more naturally, allowing for better KG utilization. They are also more natural when dealing with new modalities such as voice interaction (e.g., Cortana, Google Home). Multiple benchmarks have been introduced for this problem [4, 10, 34, 37]. These differ in the underlying KGs and supporting resources, size, and question phenomena they evoke, resulting in various solutions from those heavily relying on machine learning to more hybrid approaches using a combination of rules and hand-crafted scoring schemes. We presented experimental results for QUINT on the benchmarks introduced by Berant et al [4] and Cai and Yates [10]. We could not conduct experiments on benchmarks such as QALD [37] and BioASQ [34] due to the small size of their training sets, and because they emphasize aspects not addressed by QUINT (e.g., aggregation, ordering).
    Funding
    • For example, when the utterance “Who invented the Internet?” returns the answer Al Gore (known for funding the Internet), an explanation might tell us that it was answered from the first template of Table 1.
    Reference
    • [1] K. Balog, E. Meij, and M. de Rijke. Entity search: Building bridges between two worlds. In ISS Workshop, 2010.
    • [2] J. Bao, N. Duan, M. Zhou, and T. Zhao. Knowledge-based question answering as machine translation. In ACL, 2014.
    • [3] H. Bast and E. Haussmann. More accurate question answering on freebase. In CIKM, 2015.
    • [4] J. Berant, A. Chou, R. Frostig, and P. Liang. Semantic parsing on freebase from question-answer pairs. In EMNLP, 2013.
    • [5] J. Berant and P. Liang. Semantic parsing via paraphrasing. In ACL, 2014.
    • [6] C. Bizer, T. Heath, and T. Berners-Lee. Linked data - the story so far. INT J SEMANT WEB INF, 5(3), 2009.
    • [7] R. Blanco, P. Mika, and S. Vigna. Effective and efficient entity search in RDF data. In ISWC, 2011.
    • [8] A. Bordes, S. Chopra, and J. Weston. Question answering with subgraph embeddings. In EMNLP, 2014.
    • [9] L. Breiman. Random forests. Machine Learning, 2001.
    • [10] Q. Cai and A. Yates. Large-scale semantic parsing via schema matching and lexicon extension. In ACL, 2013.
    • [11] L. Del Corro and R. Gemulla. Clausie: clause-based open information extraction. In WWW, 2013.
    • [12] L. Dong, F. Wei, M. Zhou, and K. Xu. Question answering over freebase with multi-column convolutional neural networks. In ACL, 2015.
    • [13] A. Fader, L. S. Zettlemoyer, and O. Etzioni. Paraphrase-driven learning for open question answering. In ACL, 2013.
    • [14] R. V. Guha, D. Brickley, and S. Macbeth. Schema.org: evolution of structured data on the web. Commun. ACM, 2016.
    • [15] S. Hakimov, C. Unger, S. Walter, and P. Cimiano. Applying semantic parsing to question answering over linked data: Addressing the lexical gap. In NLDB, 2015.
    • [16] M. A. Hearst. Automatic acquisition of hyponyms from large text corpora. In COLING, 1992.
    • [17] M. Joshi, U. Sawant, and S. Chakrabarti. Knowledge graph and corpus driven segmentation and answer inference for telegraphic entity-seeking queries. In EMNLP, 2014.
    • [18] D. Klein and C. D. Manning. Accurate unlexicalized parsing. In ACL, 2003.
    • [19] T. Kwiatkowski, E. Choi, Y. Artzi, and L. S. Zettlemoyer. Scaling semantic parsers with on-the-fly ontology matching. In EMNLP, 2013.
    • [20] P. Liang, M. I. Jordan, and D. Klein. Learning dependency-based compositional semantics. In ACL, 2011.
    • [21] P. Liang and C. Potts. Bringing machine learning and compositional semantics together. Annual Reviews of Linguistics, 1, 2015.
    • [22] V. López, A. Nikolov, M. Sabou, V. S. Uren, E. Motta, and M. d’Aquin. Scaling up question-answering to linked data. In EKAW, 2010.
    • [23] V. López, P. Tommasi, S. Kotoulas, and J. Wu. Queriodali: Question answering over dynamic and linked knowledge graphs. In ISWC, 2016.
    • [24] M. Mintz, S. Bills, R. Snow, and D. Jurafsky. Distant supervision for relation extraction without labeled data. In ACL, 2009.
    • [25] S. Petrov, D. Das, and R. T. McDonald. A universal part-of-speech tagset. In LREC, 2012.
    • [26] J. Pound, P. Mika, and H. Zaragoza. Ad-hoc object retrieval in the web of data. In WWW, 2010.
    • [27] S. Reddy, M. Lapata, and M. Steedman. Large-scale semantic parsing without question-answer pairs. TACL, 2014.
    • [28] S. Reddy, O. Täckström, M. Collins, T. Kwiatkowski, D. Das, M. Steedman, and M. Lapata. Transforming dependency structures to logical forms for semantic parsing. TACL, 2016.
    • [29] D. Savenkov and E. Agichtein. When a knowledge base is not enough: Question answering over knowledge bases with external text data. In SIGIR, 2016.
    • [30] S. Shekarpour, E. Marx, A. N. Ngomo, and S. Auer. SINA: semantic interpretation of user queries for question answering on interlinked data. J. Web Sem., 2015.
    • [31] M. Steedman. The Syntactic Process. 2000.
    • [32] T. Tran, P. Cimiano, S. Rudolph, and R. Studer. Ontology-based interpretation of keywords for semantic search. In ISWC/ASWC, 2007.
    • [33] T. Tran, H. Wang, S. Rudolph, and P. Cimiano. Top-k exploration of query candidates for efficient keyword search on graph-shaped (RDF) data. In ICDE, 2009.
    • [34] G. Tsatsaronis, M. Schroeder, G. Paliouras, Y. Almirantis, I. Androutsopoulos, E. Gaussier, P. Gallinari, T. Artieres, M. Alvers, M. Zschunke, and A.-C. Ngonga Ngomo. BioASQ: A challenge on large-scale biomedical semantic indexing and question answering. In AAAI, 2012.
    • [35] C. Unger, L. Bühmann, J. Lehmann, A. N. Ngomo, D. Gerber, and P. Cimiano. Template-based question answering over RDF data. In WWW, 2012.
    • [36] C. Unger and P. Cimiano. Pythia: Compositional meaning construction for ontology-based question answering on the semantic web. In NLDB, 2011.
    • [37] C. Unger, C. Forascu, V. López, A. N. Ngomo, E. Cabrio, P. Cimiano, and S. Walter. Question answering over linked data (QALD-5). In CLEF, 2015.
    • [38] R. Usbeck, A. N. Ngomo, L. Bühmann, and C. Unger. HAWK - hybrid question answering using linked data. In ESWC, 2015.
    • [39] A. D. Walker, P. Alexopoulos, A. Starkey, J. Z. Pan, J. M. Gómez-Pérez, and A. Siddharthan. Answer type identification for question answering - supervised learning of dependency graph patterns from natural language questions. In JIST, 2015.
    • [40] K. Xu, S. Reddy, Y. Feng, S. Huang, and D. Zhao. Question answering on freebase via relation extraction and textual evidence. In ACL, 2016.
    • [41] M. Yahya, K. Berberich, S. Elbassuoni, M. Ramanath, V. Tresp, and G. Weikum. Natural language questions for the web of data. In EMNLP-CoNLL, 2012.
    • [42] M. Yahya, K. Berberich, S. Elbassuoni, and G. Weikum. Robust question answering over the web of linked data. In CIKM, 2013.
    • [43] M. Yang, N. Duan, M. Zhou, and H. Rim. Joint relational embeddings for knowledge-based question answering. In EMNLP, 2014.
    • [44] Y. Yang and M. Chang. S-MART: novel tree-based structured learning algorithms applied to tweet entity linking. In ACL, 2015.
    • [45] X. Yao. Lean question answering over freebase from scratch. In NAACL, 2015.
    • [46] X. Yao and B. Van Durme. Information extraction over structured data: Question answering with freebase. In ACL, 2014.
    • [47] W. Yih, M. Chang, X. He, and J. Gao. Semantic parsing via staged query graph generation: Question answering with knowledge base. In ACL, 2015.
    • [48] P. Yin, N. Duan, B. Kao, J. Bao, and M. Zhou. Answering questions with complex semantic constraints on open knowledge bases. In CIKM, 2015.
    • [49] L. S. Zettlemoyer and M. Collins. Learning to map sentences to logical form: Structured classification with probabilistic categorial grammars. In UAI, 2005.
    • [50] W. Zheng, L. Zou, X. Lian, J. X. Yu, S. Song, and D. Zhao. How to build templates for RDF question/answering: An uncertain graph similarity join approach. In SIGMOD, 2015.
    • [51] L. Zou, R. Huang, H. Wang, J. X. Yu, W. He, and D. Zhao. Natural language question answering over RDF: a graph data driven approach. In SIGMOD, 2014.