Open-World Knowledge Graph Completion
AAAI Conference on Artificial Intelligence, 2018.
Keywords:
Mean Rank, KGC task, abstract meaning representations, fully convolutional neural networks, world knowledge
Abstract:
Knowledge Graphs (KGs) have been applied to many tasks including Web search, link prediction, recommendation, natural language processing, and entity linking. However, most KGs are far from complete and are growing at a rapid pace. To address these problems, Knowledge Graph Completion (KGC) has been proposed to improve KGs by filling in its missing connections.
Introduction
- Knowledge Graphs (KGs) are a special type of information network that represents knowledge using RDF-style triples ⟨h, r, t⟩, where h represents some head entity and r represents some relationship that connects h to some tail entity t.
- In this formalism, a statement like “Springfield is the capital of Illinois” can be represented as ⟨Springfield, capitalOf, Illinois⟩.
- DBPedia, which is generated from Wikipedia’s infoboxes, contains 4.6 million entities, but half of these entities have fewer than 5 relationships.
- Based on this observation, researchers aim to improve the accuracy and reliability of KGs by predicting the existence of relationships.
- Continuing the example from above, suppose the relationship capitalOf is missing between Indianapolis and Indiana; the KGC task might predict this missing relationship based on the topological similarity between this part of the KG and the part containing Springfield and Illinois.
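The triple formalism described above can be sketched in a few lines of Python. The facts and the `has_triple` helper are illustrative only, not part of any real KG dump or the paper's code:

```python
# Minimal sketch of the RDF-style triple formalism described above.
# The facts here are illustrative examples, not a real KG dump.
kg = {
    ("Springfield", "capitalOf", "Illinois"),
    ("Indianapolis", "locatedIn", "Indiana"),
}

def has_triple(h, r, t):
    """Closed-world lookup: is the triple <h, r, t> asserted in the KG?"""
    return (h, r, t) in kg

# KGC asks a model to score plausible but absent triples, such as the
# missing <Indianapolis, capitalOf, Indiana> fact mentioned above.
assert has_triple("Springfield", "capitalOf", "Illinois")
assert not has_triple("Indianapolis", "capitalOf", "Indiana")
```

A KGC model replaces the boolean lookup with a learned scoring function over candidate triples, so that true-but-missing facts score highly.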
Highlights
- Knowledge Graphs (KGs) are a special type of information network that represents knowledge using RDF-style triples ⟨h, r, t⟩, where h represents some head entity and r represents some relationship that connects h to some tail entity t.
- We describe an open-world Knowledge Graph Completion model called ConMask that uses relationship-dependent content masking to reduce noise in the given entity description and uses fully convolutional neural networks (FCN) to fuse related text into a relationship-dependent entity embedding
- Due to the limited text content and the redundancy found in the FB15K data set, we introduce two new data sets DBPedia50k and DBPedia500k for both open-world and closed-world Knowledge Graph Completion tasks
- In the present work we introduced a new open-world Knowledge Graph Completion model ConMask that uses relationship-dependent content masking, fully convolutional neural networks, and semantic averaging to extract relationship-dependent embeddings from the textual features of entities and relationships in Knowledge Graphs
- Experiments on both open-world and closed-world Knowledge Graph Completion tasks show that the ConMask model has good performance in both tasks
- Because of problems found in the standard Knowledge Graph Completion data sets, we released two new DBPedia data sets for Knowledge Graph Completion research and development
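The relationship-dependent content masking mentioned in the highlights can be illustrated with a toy keyword version: keep only the description words that match the relation name, plus a small window of context around each match. ConMask's actual masking is a learned, similarity-weighted scheme over word embeddings; the matching rule and window size below are simplifications of mine, not the paper's:

```python
def mask_content(description, relation_name, window=2):
    """Toy relationship-dependent content masking: keep description
    words matching the relation name plus `window` words of context on
    each side, and mask (drop) everything else. ConMask's real masking
    is similarity-based and learned; this is only a sketch."""
    desc = description.lower().split()
    rel_words = set(relation_name.lower().split())
    keep = set()
    for i, word in enumerate(desc):
        if word in rel_words:
            # Retain a small context window around each match.
            for j in range(max(0, i - window), min(len(desc), i + window + 1)):
                keep.add(j)
    return [w for i, w in enumerate(desc) if i in keep]
```

Running it on the description "Springfield is the capital city of the state of Illinois" with relation name "capital of" keeps the context around the matches and masks unrelated words such as "springfield", which mirrors the intuition that only relation-relevant text should feed the entity embedding.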
Methods
- The previous section described the design decisions and modelling assumptions of ConMask.
- Training parameters were set empirically but without fine-tuning.
- The authors set the word embedding size k = 200 and the maximum entity content and name lengths k_c = k_n = 512.
- The content masking window size is k_m = 6 and the number of FCN layers is k_fcn = 3, where each layer has 2 convolutional layers and a batch normalization (BN) layer with a moving average decay of 0.9, followed by dropout with keep probability p = 0.5.
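Collected in one place, the reported hyperparameters look like the following. The dictionary keys are illustrative names of mine; the paper does not prescribe any configuration format:

```python
# ConMask hyperparameters as reported above; key names are illustrative.
conmask_config = {
    "word_embedding_size": 200,    # k
    "max_content_length": 512,     # k_c
    "max_name_length": 512,        # k_n
    "masking_window": 6,           # k_m
    "fcn_layers": 3,               # k_fcn; each has 2 conv layers + BN
    "bn_moving_average_decay": 0.9,
    "dropout_keep_prob": 0.5,
}
```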
Conclusion
- The authors elaborate on some actual prediction results and show examples that highlight the strengths and limitations of the ConMask model.
- Table 4 shows 4 KGC examples.
- Although Stanton did work on Star Trek, DBPedia indicates that her most notable work is The Vampire Diaries, which ranked 4th.
- The reason for this error is that the indicator phrase for The Vampire Diaries was “consulting producer”, which was not highly correlated with the relationship name “notable work” from the model’s perspective.
- The goal for future work is to extend ConMask with the ability to find new or implicit relationships
Tables
- Table 1: Open-world entity prediction results on DBPedia50k and DBPedia500k. For Mean Rank (MR), lower is better. For HITS@10 and Mean Reciprocal Rank (MRR), higher is better.
- Table 2: Data set statistics.
- Table 3: Closed-world KGC results for head and tail prediction. For HITS@10, higher is better. For Mean Rank (MR), lower is better.
- Table 4: Entity prediction results on the DBPedia50k data set. Top-3 predicted tails are shown, with the correct answer in bold.
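The metrics in Tables 1 and 3 are all derived from the rank of each correct entity in the model's sorted candidate list (rank 1 is best). A minimal sketch, with a function name of my choosing:

```python
def ranking_metrics(ranks, k=10):
    """Compute Mean Rank (MR), Mean Reciprocal Rank (MRR), and HITS@k
    from the ranks of the correct entities (1 = best prediction)."""
    n = len(ranks)
    mr = sum(ranks) / n                        # lower is better
    mrr = sum(1.0 / r for r in ranks) / n      # higher is better
    hits = sum(1 for r in ranks if r <= k) / n # higher is better
    return mr, mrr, hits

# Example: four test triples whose correct entities ranked 1, 4, 2, 20.
mr, mrr, hits10 = ranking_metrics([1, 4, 2, 20], k=10)
# MR = 6.75, MRR = (1 + 0.25 + 0.5 + 0.05) / 4 = 0.45, HITS@10 = 0.75
```

MRR rewards top-ranked hits far more than MR does, which is why the two can disagree on which model looks better.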
References
- Bollacker, K.; Evans, C.; Paritosh, P.; Sturge, T.; and Taylor, J. 2008. Freebase: a collaboratively created graph database for structuring human knowledge. In SIGMOD/PODS, 1247–1250. ACM.
- Bordes, A.; Usunier, N.; García-Durán, A.; Weston, J.; and Yakhnenko, O. 2013. Translating Embeddings for Modeling Multirelational Data. In NIPS, 2787–2795.
- Ceylan, I. I.; Darwiche, A.; and Van den Broeck, G. 2016. Open-world probabilistic databases. In Description Logics.
- Chen, H.; Qi, X.; Cheng, J.-Z.; and Heng, P.-A. 2016. Deep contextual networks for neuronal structure segmentation. In AAAI, 1167–1173.
- Chorowski, J. K.; Bahdanau, D.; Serdyuk, D.; Cho, K.; and Bengio, Y. 2015. Attention-based models for speech recognition. In NIPS, 577–585.
- Francis-Landau, M.; Durrett, G.; and Klein, D. 2016. Capturing semantic similarity for entity linking with convolutional neural networks. arXiv preprint arXiv:1604.00734.
- Hachey, B.; Radford, W.; Nothman, J.; Honnibal, M.; and Curran, J. R. 2013. Evaluating entity linking with Wikipedia. Artificial Intelligence 194:130–150.
- Huang, L.; May, J.; Pan, X.; Ji, H.; Ren, X.; Han, J.; Zhao, L.; and Hendler, J. A. 2017. Liberal entity extraction: Rapid construction of fine-grained entity typing systems. Big Data 5(1):19–31.
- Ioffe, S., and Szegedy, C. 2015. Batch normalization: Accelerating deep network training by reducing internal covariate shift. arXiv preprint arXiv:1502.03167.
- Ji, H., and Grishman, R. 2011. Knowledge base population: Successful approaches and challenges. In ACL, 1148–1158.
- Kilicoglu, H.; Shin, D.; Fiszman, M.; Rosemblat, G.; and Rindflesch, T. C. 2012. SemMedDB: a PubMed-scale repository of biomedical semantic predications. Bioinformatics 28(23):3158–3160.
- Lehmann, J.; Isele, R.; Jakob, M.; Jentzsch, A.; Kontokostas, D.; Mendes, P. N.; Hellmann, S.; Morsey, M.; Van Kleef, P.; Auer, S.; et al. 2015. DBpedia – a large-scale, multilingual knowledge base extracted from Wikipedia. Semantic Web 6(2):167–195.
- Lin, Y.; Liu, Z.; Sun, M.; Liu, Y.; and Zhu, X. 2015. Learning Entity and Relation Embeddings for Knowledge Graph Completion. In AAAI, 2181–2187.
- Lin, Y.; Shen, S.; Liu, Z.; Luan, H.; and Sun, M. 2016. Neural relation extraction with selective attention over instances. In ACL, 2124–2133.
- Lin, Y.; Liu, Z.; and Sun, M. 2015. Modeling Relation Paths for Representation Learning of Knowledge Bases. In EMNLP, 705–714.
- Liu, B., and Lane, I. 2016. Attention-based recurrent neural network models for joint intent detection and slot filling. arXiv preprint arXiv:1609.01454.
- Lukovnikov, D.; Fischer, A.; Lehmann, J.; and Auer, S. 2017. Neural network-based question answering over knowledge graphs on word and character level. In WWW, 1211–1220.
- Mintz, M.; Bills, S.; Snow, R.; and Jurafsky, D. 2009. Distant supervision for relation extraction without labeled data. In ACL, 1003–1011.
- Nickel, M.; Murphy, K.; Tresp, V.; and Gabrilovich, E. 2016. A review of relational machine learning for knowledge graphs. Proceedings of the IEEE 104(1):11–33.
- Pennington, J.; Socher, R.; and Manning, C. D. 2014. Glove: Global vectors for word representation. In EMNLP, volume 14, 1532–1543.
- Reiter, R. 1978. On closed world data bases. In Logic and data bases. Springer. 55–76.
- See, A.; Liu, P. J.; and Manning, C. D. 2017. Get to the point: Summarization with pointer-generator networks. ACL.
- Shi, B., and Weninger, T. 2016. Fact checking in heterogeneous information networks. In WWW, 101–102.
- Shi, B., and Weninger, T. 2017. ProjE: Embedding projection for knowledge graph completion. In AAAI.
- Socher, R.; Chen, D.; Manning, C. D.; and Ng, A. Y. 2013. Reasoning With Neural Tensor Networks for Knowledge Base Completion. In NIPS, 926–934.
- Speer, R.; Chin, J.; and Havasi, C. 2017. ConceptNet 5.5 - An Open Multilingual Graph of General Knowledge. AAAI.
- Toutanova, K., and Chen, D. 2015. Observed versus latent features for knowledge base and text inference. In 3rd Workshop on Continuous Vector Space Models and Their Compositionality. ACL.
- Wang, Z.; Zhang, J.; Feng, J.; and Chen, Z. 2014. Knowledge Graph Embedding by Translating on Hyperplanes. In AAAI, 1112–1119.
- Xie, R.; Liu, Z.; Jia, J.; Luan, H.; and Sun, M. 2016. Representation learning of knowledge graphs with entity descriptions. In AAAI, 2659–2665.
- Xu, J.; Chen, K.; Qiu, X.; and Huang, X. 2016. Knowledge graph representation with jointly structural and textual encoding. arXiv preprint arXiv:1611.08661.
- Zhang, W. 2017. Knowledge graph embedding with diversity of structures. In WWW, 747–753.