# A Review of Relational Machine Learning for Knowledge Graphs

Proceedings of the IEEE, Volume 104, Issue 1, 2016, Pages 11-33.


Keywords:

Graph-based models; knowledge extraction; knowledge graphs; latent feature models; statistical relational learning

Abstract:

Relational machine learning studies methods for the statistical analysis of relational, or graph-structured, data. In this paper, we provide a review of how such statistical models can be ‘‘trained’’ on large knowledge graphs, and then used to predict new facts about the world (which is equivalent to predicting new edges in the graph). […]


Introduction

- ‘‘I am convinced that the crux of the problem of learning is recognizing relationships and being able to use them’’ (Christopher Strachey in a letter to Alan Turing, 1954).
- These graphs contain millions of nodes and billions of edges, which causes the authors to focus on scalable SRL techniques that take time linear in the size of the graph.

Highlights

- YAGO [4], DBpedia [5], NELL [6], Freebase [7], and the Google Knowledge Graph [8]
- The Path Ranking Algorithm (PRA) has been shown to outperform the inductive logic programming method FOIL [106] for link prediction in NELL [116]
- PRA has been shown to have performance comparable to ER-MLP (multilayer perceptron) models on link prediction in the Knowledge Vault: PRA obtained an area under the ROC curve of 0.884, compared to 0.882 for ER-MLP [28]
- We provided a review of state-of-the-art statistical relational learning (SRL) methods applied to very large knowledge graphs
- We demonstrated how statistical relational learning can be used in conjunction with machine reading and information extraction methods to automatically build such knowledge repositories
- While these knowledge graphs are impressive in their size, they still fall short of representing many kinds of knowledge that humans possess
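The Path Ranking Algorithm mentioned in the highlights scores a candidate triple using the probabilities of random walks that follow typed relation paths between the two entities. A minimal sketch on a toy graph (the entity and relation names here are hypothetical, not taken from the paper):

```python
import numpy as np

# Toy knowledge graph: 0 = Anna, 1 = StateU, 2 = CS_Dept.
# One boolean adjacency matrix per relation type.
n = 3
adj = {
    "memberOf": np.zeros((n, n)),
    "partOf": np.zeros((n, n)),
}
adj["memberOf"][0, 2] = 1.0   # Anna memberOf CS_Dept
adj["partOf"][2, 1] = 1.0     # CS_Dept partOf StateU

def row_normalize(A):
    """Turn an adjacency matrix into random-walk transition probabilities."""
    s = A.sum(axis=1, keepdims=True)
    return np.divide(A, s, out=np.zeros_like(A), where=s > 0)

def pra_feature(path, source, target):
    """Probability of reaching `target` from `source` by a random walk
    that follows the given sequence of relation types."""
    p = np.zeros(n)
    p[source] = 1.0
    for rel in path:
        p = p @ row_normalize(adj[rel])
    return p[target]

# Feature for the candidate fact attended(Anna, StateU) via the
# two-step relation path memberOf -> partOf:
print(pra_feature(["memberOf", "partOf"], 0, 1))  # 1.0
```

In PRA, features like this, one per relation path type, are combined in a per-relation logistic regression classifier to rank candidate links.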

Results

- RESCAL has been shown to achieve state-of-the-art results on a number of relational learning tasks.
- [63] showed that RESCAL provides comparable or better relationship prediction results on a number of small benchmark data sets compared to Markov logic networks [70], the infinite relational model [71], [72], and Bayesian clustered tensor factorization [73].
- Latent feature models are well-suited for modeling global relational patterns via newly introduced latent variables
- They are computationally efficient if triples can be explained with a small number of latent variables

Conclusion

Knowledge graphs (KGs) have found important applications in question answering, structured search, exploratory search, and digital assistants.

- The authors provided a review of state-of-the-art statistical relational learning (SRL) methods applied to very large knowledge graphs.
- The authors showed how to create a truly massive, machine-interpretable ‘‘semantic memory’’ of facts, which is already empowering numerous practical applications.
- While these KGs are impressive in their size, they still fall short of representing many kinds of knowledge that humans possess.
- Representing, learning, and reasoning with these kinds of knowledge remains the frontier for AI and machine learning.

Summary

- YAGO [4], DBpedia [5], NELL [6], Freebase [7], and the Google Knowledge Graph [8]

- We model each possible triple x_ijk = (e_i, r_k, e_j) over this set of entities and relations as a binary random variable y_ijk ∈ {0, 1} that indicates its existence.
- We discuss how statistical relational learning can be applied to knowledge graphs.
- A relational model for large-scale knowledge graphs should scale at most linearly with the data size, i.e., linearly in the number of entities
- Once the parameters have been estimated, the computational complexity to predict the score of a triple depends only on the number of latent features and is independent of the size of the graph.
- [63] showed that RESCAL provides comparable or better relationship prediction results on a number of small benchmark data sets compared to Markov logic networks [70], the infinite relational model [71], [72], and Bayesian clustered tensor factorization [73].
- Jenatton et al. [81] proposed a tensor factorization model for knowledge graphs with a very large number of different relations.
- RESCAL represents pairs of entities (e_i, e_j) via the tensor product of their latent feature representations (5) and predicts the existence of the triple x_ijk from this representation via a relation-specific weight vector w_k (4).
- Similarity measures for unirelational data: Observable graph feature models are widely used for link prediction in graphs that consist of only a single relation, e.g., in social network analysis, biology, and Web mining.
- Rule mining and inductive logic programming: Another class of models that works on the observed variables of a knowledge graph extracts rules via mining methods and uses these extracted rules to infer new links.
- A good example is the marriedTo relation: one marriage corresponds to a single strongly connected component, so data with a large number of marriages would be difficult to model with RLFMs (relational latent feature models). Predicting marriedTo links via graph-based models, however, is easy: the existence of the triple (John, marriedTo, Mary) can be predicted from the existence of (Mary, marriedTo, John), by exploiting the symmetry of the relation.
- An alternative approach to generate negative examples is to exploit known constraints on the structure of a knowledge graph: Type constraints for predicates, valid value ranges for attributes, or functional constraints such as mutual exclusion can all be used for this purpose.
- A common approach is to use Markov logic [126], a template language based on logical formulae: given a set of formulae F = {F_i}, i = 1, ..., L, we create an edge between nodes in the dependency graph if the corresponding facts occur in at least one grounded formula.
- Representing, learning, and reasoning with these kinds of knowledge remains the frontier for AI and machine learning.
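Two of the points in the summary above, exploiting a relation's symmetry as an observable graph feature and using type constraints to generate negative examples, can be sketched with toy data (all names below are hypothetical illustrations, not the paper's data):

```python
# Observed triples and a type (domain) constraint for marriedTo.
observed = {("John", "marriedTo", "Mary")}
adults = {"John", "Mary", "Bob", "Sue"}   # marriedTo holds only between adults

def symmetric_score(s, p, o):
    """Predict a triple from its reverse, exploiting relation symmetry."""
    return 1.0 if (s, p, o) in observed or (o, p, s) in observed else 0.0

print(symmetric_score("Mary", "marriedTo", "John"))  # 1.0

def typed_negatives(p, domain, candidate_pairs):
    """Keep only type-consistent, unobserved pairs as negative examples."""
    return [(s, p, o) for s, o in candidate_pairs
            if s in domain and o in domain and (s, p, o) not in observed]

# ("John", "Anna") violates the type constraint; ("John", "Mary") is observed.
negs = typed_negatives("marriedTo", adults,
                       [("John", "Anna"), ("Bob", "Sue"), ("John", "Mary")])
print(negs)  # [('Bob', 'marriedTo', 'Sue')]
```

Type-constrained sampling avoids trivially false negatives (e.g., a person married to a city) and yields harder, more informative training examples.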

- Table1: Knowledge Base Construction Projects
- Table2: Size of Some Schema-Based Knowledge Bases
- Table3: Summary of the Notation
- Table4: Semantic Embeddings of KV-MLP on Freebase
- Table5: Summary of the Latent Feature Models. ha, hb, and hc Are Hidden Layers of the Neural Network; See Text for Details
- Table6: Examples of Paths Learned by PRA on Freebase to Predict Which College a Person Attended

Funding

- Nickel was supported by the Center for Brains, Minds and Machines (CBMM) under NSF STC award CCF-1231216
- Tresp was supported by the German Federal Ministry for Economic Affairs and Energy under the ‘‘Smart Data’’ technology program (Grant 01MT14001)

Study subjects and analysis

male: 10306

In Section VII, we will furthermore discuss aspects of how to train these models on knowledge graphs. As an example, there are currently 10306 male and 7586 female American actors listed in Wikipedia, while there are only 1268 male and 1354 female Indian, and 77 male and no female Nigerian actors. India and Nigeria, however, are the largest and second largest film industries in the world.

adults: 3

To explain this further, consider a KG involving two types of entities, adults and children, and two types of relations, parentOf and marriedTo. Fig. 6(a) depicts a sample KG with three adults and one child. Obviously, these relations (edges) are correlated, since people who share a common child are often married, while people rarely marry their own children

Reference

- L. Getoor and B. Taskar, Eds., Introduction to Statistical Relational Learning. Cambride, MA, USA: MIT Press, 2007.
- S. Dzeroski and N. Lavrac, Relational Data Mining. New York, NY, USA: Springer-Verlag, 2001.
- L. De Raedt, Logical and Relational Learning. New York, NY, USA: Springer-Verlag, 2008.
- F. M. Suchanek, G. Kasneci, and G. Weikum, ‘‘Yago: A core of semantic knowledge,’’ in Proc. 16th Int. Conf. World Wide Web, 2007, pp. 697–706.
- S. Auer et al., ‘‘DBpedia: A nucleus for a web of open data,’’ in The Semantic Web (Lecture Notes in Computer Science, vol. 4825). Berlin, Germany: Springer-Verlag, 2007, pp. 722–735.
- A. Carlson et al., ‘‘Toward an architecture for never-ending language learning,’’ in Proc. 24th Conf. Artif. Intell., 2010, pp. 1306–1313.
- K. Bollacker, C. Evans, P. Paritosh, T. Sturge, and J. Taylor, ‘‘Freebase: A collaboratively created graph database for structuring human knowledge,’’ in Proc. ACM SIGMOD Int. Conf. Manage. Data, 2008, pp. 1247–1250.
- A. Singhal, ‘‘Introducing the knowledge graph: Things, not strings,’’ May 2012. [Online]. Available: http://googleblog.blogspot.com/2012/05/introducingknowledge-graph-things-not.html.
- G. Weikum and M. Theobald, ‘‘From information to knowledge: Harvesting entities and relationships from web sources,’’ in Proc. 29th ACM SIGMOD-SIGACT-SIGART Symp. Principles Database Syst., 2010, pp. 65–76.
- J. Fan et al., ‘‘AKBC-WEKEX 2012: The knowledge extraction workshop at NAACL-HLT,’’ 2012. [Online]. Available: https://akbcwekex2012.wordpress.com/.
- R. Davis, H. Shrobe, and P. Szolovits, ‘‘What is a knowledge representation?’’ AI Mag., vol. 14, no. 1, pp. 17–33, 1993.
- J. F. Sowa, ‘‘Semantic networks,’’ Encyclopedia Cogn. Sci., 2006.
- M. Minsky, ‘‘A framework for representing knowledge,’’ MIT-AI Lab. Memo 306, 1974.
- T. Berners-Lee, J. Hendler, and O. Lassila, ‘‘The semantic web,’’ 2001. [Online]. Available: http://www.scientificamerican.com/article/the-semantic-web/.
- T. Berners-Lee, ‘‘Linked data - Design issues,’’ Jul. 2006. [Online]. Available: http://www.w3.org/DesignIssues/LinkedData.html.
- C. Bizer, T. Heath, and T. Berners-Lee, ‘‘Linked data - The story so far,’’ Int. J. Semantic Web Inf. Syst., vol. 5, no. 3, pp. 1–22, 2009.
- G. Klyne and J. J. Carroll, ‘‘Resource description framework (RDF): Concepts and abstract syntax,’’ Feb. 2004. [Online]. Available: http://www.w3.org/TR/2004/REC-rdf-concepts-20040210/.
- R. Cyganiak, D. Wood, and M. Lanthaler, ‘‘RDF 1.1 concepts and abstract syntax,’’ Feb. 2014. [Online]. Available: http://www.w3.org/TR/2014/REC-rdf11-concepts-20140225/.
- R. Brachman and H. Levesque, Knowledge Representation and Reasoning. San Francisco, CA, USA: Morgan Kaufmann, 2004.
- J. F. Sowa, Knowledge Representation: Logical, Philosophical and Computational Foundations. Pacific Grove, CA, USA: Brooks/Cole, 2000.
- Y. Sun and J. Han, ‘‘Mining heterogeneous information networks: Principles and methodologies,’’ Synthesis Lectures Data Mining Knowl. Disc., vol. 3, no. 2, pp. 1–159, 2012.
- R. West et al., ‘‘Knowledge base completion via search-based question answering,’’ in Proc. 23rd Int. Conf. World Wide Web, 2014, pp. 515–526.
- D. B. Lenat, ‘‘CYC: A large-scale investment in knowledge infrastructure,’’ Commun. ACM, vol. 38, no. 11, pp. 33–38, Nov. 1995.
- G. A. Miller, ‘‘WordNet: A lexical database for English,’’ Commun. ACM, vol. 38, no. 11, pp. 39–41, Nov. 1995.
- O. Bodenreider, ‘‘The Unified Medical Language System (UMLS): Integrating biomedical terminology,’’ Nucleic Acids Res., vol. 32, no. Database issue, pp. D267–270, Jan. 2004.
- D. Vrandečić and M. Krötzsch, ‘‘Wikidata: A free collaborative knowledgebase,’’ Commun. ACM, vol. 57, no. 10, pp. 78–85, 2014.
- J. Hoffart, F. M. Suchanek, K. Berberich, and G. Weikum, ‘‘YAGO2: A spatially and temporally enhanced knowledge base from Wikipedia,’’ Artif. Intell., vol. 194, pp. 28–61, 2013.
- X. Dong et al., ‘‘Knowledge vault: A webscale approach to probabilistic knowledge fusion,’’ in Proc. 20th ACM SIGKDD Int. Conf. Knowl. Disc. Data Mining, 2014, pp. 601–610.
- N. Nakashole, G. Weikum, and F. Suchanek, ‘‘PATTY: A taxonomy of relational patterns with semantic types,’’ in Proc. Joint Conf. Empirical Meth. Natural Language Process. Computat. Natural Language Learning, 2012, pp. 1135–1145.
- N. Nakashole, M. Theobald, and G. Weikum, ‘‘Scalable knowledge harvesting with high precision and high recall,’’ in Proc. 4th ACM Int. Conf. Web Search Data Mining, 2011, pp. 227–236.
- F. Niu, C. Zhang, C. Re, and J. Shavlik, ‘‘Elementary: Large-scale knowledge-base construction via machine learning and statistical inference,’’ Int. J. Semantic Web Inf. Syst. (IJSWIS), vol. 8, no. 3, pp. 42–73, 2012.
- A. Fader, S. Soderland, and O. Etzioni, ‘‘Identifying relations for open information extraction,’’ in Proc. Conf. Empir. Meth. Natural Language Process., Stroudsburg, PA, USA, 2011, pp. 1535–1545.
- M. Schmitz, R. Bart, S. Soderland, and O. Etzioni, ‘‘Open language learning for information extraction,’’ in Proc. Joint Conf. Empirical Meth. Natural Language Process. Computat. Natural Language Learning, 2012, pp. 523–534.
- J. Fan, D. Ferrucci, D. Gondek, and A. Kalyanpur, ‘‘Prismatic: Inducing knowledge from a large scale lexicalized relation resource,’’ in Proc. NAACL HLT 1st Int. Workshop Formalisms Method. Learning Reading, 2010, pp. 122–127.
- B. Suh, G. Convertino, E. H. Chi, and P. Pirolli, ‘‘The singularity is not near: Slowing growth of Wikipedia,’’ in Proc. ACM 5th Int. Symp. Wikis Open Collab., 2009, pp. 8:1–8:10.
- J. Biega, E. Kuzey, and F. M. Suchanek, ‘‘Inside YAGO2s: A transparent information extraction architecture,’’ in Proc. 22nd Int. Conf. World Wide Web, Republic and Canton of Geneva, Switzerland, 2013, pp. 325–328.
- O. Etzioni, A. Fader, J. Christensen, S. Soderland, and M. Mausam, ‘‘Open information extraction: The second generation,’’ in Proc. 22nd Int. Joint Conf. Artif. Intell., Barcelona, Catalonia, Spain, 2011, vol. 1, pp. 3–10.
- D. B. Lenat and E. A. Feigenbaum, ‘‘On the thresholds of knowledge,’’ Artif. Intell., vol. 47, no. 1, pp. 185–250, 1991.
- R. Qian, ‘‘Understand your world with Bing,’’ Bing search blog, Mar. 2013. [Online]. Available: http://blogs.bing.com/search/2013/03/21/understand-your-world-with-bing/.
- D. Ferrucci et al., ‘‘Building Watson: An overview of the DeepQA project,’’ AI Mag., vol. 31, no. 3, pp. 59–79, 2010.
- F. Belleau, M.-A. Nolin, N. Tourigny, P. Rigault, and J. Morissette, ‘‘Bio2RDF: Towards a mashup to build bioinformatics knowledge systems,’’ J. Biomed. Inf., vol. 41, no. 5, pp. 706–716, 2008.
- A. Ruttenberg, J. A. Rees, M. Samwald, and M. S. Marshall, ‘‘Life sciences on the semantic web: The neurocommons and beyond,’’ Brief. Bioinf., vol. 10, no. 2, pp. 193–204, Mar. 2009.
- V. Momtchev, D. Peychev, T. Primov, and G. Georgiev, ‘‘Expanding the pathway and interaction knowledge in linked life data,’’ in Proc. Int. Semantic Web Challenge, 2009.
- G. Angeli and C. Manning, ‘‘Philosophers are mortal: Inferring the truth of unseen facts,’’ in Proc. 17th Conf. Comput. Natural Language Learn., Sofia, Bulgaria, Aug. 2013, pp. 133–142.
- B. Taskar, M.-F. Wong, P. Abbeel, and D. Koller, ‘‘Link prediction in relational data,’’ in Adv. Neural Inf. Process. Syst., vol. 16, S. Thrun, L. Saul, and B. Scholkopf, Eds. Cambridge, MA, USA: MIT Press, 2004.
- L. Getoor and C. P. Diehl, ‘‘Link mining: A survey,’’ ACM SIGKDD Explor. Newslett., vol. 7, no. 2, pp. 3–12, 2005.
- H. B. Newcombe, J. M. Kennedy, S. J. Axford, and A. P. James, ‘‘Automatic linkage of vital records computers can be used to extract ‘‘follow-up’’ statistics of families from files of routine records,’’ Science, vol. 130, no. 3381, pp. 954–959, Oct. 1959.
- S. Tejada, C. A. Knoblock, and S. Minton, ‘‘Learning object identification rules for information integration,’’ Inf. Syst., vol. 26, no. 8, pp. 607–633, 2001.
- E. Rahm and P. A. Bernstein, ‘‘A survey of approaches to automatic schema matching,’’ VLDB J., vol. 10, no. 4, pp. 334–350, 2001.
- A. Culotta and A. McCallum, ‘‘Joint deduplication of multiple record types in relational data,’’ in Proc. 14th ACM Int. Conf. Inf. Knowl. Manage., 2005, pp. 257–258.
- P. Singla and P. Domingos, ‘‘Entity resolution with Markov logic,’’ in Proc. 6th Int. Conf. Data Mining, Dec. 2006, pp. 572–582.
- I. Bhattacharya and L. Getoor, ‘‘Collective entity resolution in relational data,’’ ACM Trans. Knowl. Discov. Data, vol. 1, no. 1, Mar. 2007.
- S. E. Whang and H. Garcia-Molina, ‘‘Joint entity resolution,’’ in Proc. IEEE 28th Int. Conf. Data Eng., Washington, DC, USA, 2012, pp. 294–305.
- S. Fortunato, ‘‘Community detection in graphs,’’ Phys. Rep., vol. 486, no. 3, pp. 75–174, 2010.
- M. E. J. Newman, ‘‘The structure of scientific collaboration networks,’’ Proc. Nat. Acad. Sci., vol. 98, no. 2, pp. 404–409, Jan. 2001.
- D. Liben-Nowell and J. Kleinberg, ‘‘The link-prediction problem for social networks,’’ J. Amer. Soc. Inf. Sci. Technol., vol. 58, no. 7, pp. 1019–1031, 2007.
- D. Jensen and J. Neville, ‘‘Linkage and autocorrelation cause feature selection bias in relational learning,’’ in Proc. 19th Int. Conf. Mach. Learn., San Francisco, CA, USA, 2002, pp. 259–266.
- P. W. Holland, K. B. Laskey, and S. Leinhardt, ‘‘Stochastic blockmodels: First steps,’’ Social Netw., vol. 5, no. 2, pp. 109–137, 1983.
- C. J. Anderson, S. Wasserman, and K. Faust, ‘‘Building stochastic blockmodels,’’ Social Netw., vol. 14, Special Issue on Blockmodels, no. 1–2, pp. 137–161, 1992.
- P. Hoff, ‘‘Modeling homophily and stochastic equivalence in symmetric relational data,’’ in Advances in Neural Information Processing Systems 20. Red Hook, NY, USA: Curran, 2008, pp. 657–664.
- J. C. Platt, ‘‘Probabilities for SV machines,’’ in Advances in Large Margin Classifiers. Cambridge, MA, USA: MIT Press, 1999, pp. 61–74.
- P. Orbanz and D. M. Roy, ‘‘Bayesian models of graphs, arrays and other exchangeable random structures,’’ IEEE Trans. Pattern Anal. Machine Intell., 2015.
- M. Nickel, V. Tresp, and H.-P. Kriegel, ‘‘A three-way model for collective learning on multi-relational data,’’ in Proc. 28th Int. Conf. Mach. Learn., 2011, pp. 809–816.
- M. Nickel, ‘‘Factorizing YAGO: Scalable machine learning for linked data,’’ in Proc. 21st Int. Conf. World Wide Web, 2012, pp. 271–280.
- M. Nickel, ‘‘Tensor factorization for relational learning,’’ Ph.D. dissertation, Ludwig-Maximilians-Universität München, Munich, Germany, Aug. 2013.
- Y. Koren, R. Bell, and C. Volinsky, ‘‘Matrix factorization techniques for recommender systems,’’ IEEE Computer, vol. 42, no. 8, pp. 30–37, 2009.
- T. G. Kolda and B. W. Bader, ‘‘Tensor decompositions and applications,’’ SIAM Rev., vol. 51, no. 3, pp. 455–500, 2009.
- M. Nickel and V. Tresp, ‘‘Logistic tensor-factorization for multi-relational data,’’ in Proc. Structured Learn.: Inferring Graphs Structured Unstructured Inputs (SLG 2013) Workshop, 2013.
- K.-W. Chang, W.-T. Yih, B. Yang, and C. Meek, ‘‘Typed tensor decomposition of knowledge bases for relation extraction,’’ in Proc. 2014 Conf. Empir. Meth. Natural Lang. Process., Oct. 2014.
- S. Kok and P. Domingos, ‘‘Statistical predicate invention,’’ in Proc. 24th Int. Conf. Mach. Learn., New York, NY, USA, 2007, pp. 433–440.
- Z. Xu, V. Tresp, K. Yu, and H.-P. Kriegel, ‘‘Infinite hidden relational models,’’ in Proc. 22nd Int. Conf. Uncertainty Artif. Intell., 2006, pp. 544–551.
- C. Kemp, J. B. Tenenbaum, T. L. Griffiths, T. Yamada, and N. Ueda, ‘‘Learning systems of concepts with an infinite relational model,’’ in Proc. 21st Nat. Conf. Artif. Intell., 2006, vol. 3, p. 5.
- I. Sutskever, J. B. Tenenbaum, and R. R. Salakhutdinov, ‘‘Modelling relational data using Bayesian clustered tensor factorization,’’ in Adv. Neural Inf. Process. Syst., vol. 22, 2009, pp. 1821–1828.
- D. Krompaß, M. Nickel, and V. Tresp, ‘‘Large-scale factorization of type-constrained multi-relational data,’’ in Proc. Int. Conf. Data Sci. Adv. Anal., 2014.
- M. Nickel and V. Tresp, ‘‘Learning taxonomies from multi-relational data via hierarchical link-based clustering,’’ in Proc. Learn. Semant. Workshop, Granada, Spain, 2011.
- T. G. Kolda, B. W. Bader, and J. P. Kenny, ‘‘Higher-order web link analysis using multilinear algebra,’’ in Proc. Fifth IEEE Int. Conf. Data Mining, Washington, DC, USA, 2005, pp. 242–249.
- T. Franz, A. Schultz, S. Sizov, and S. Staab, ‘‘Triplerank: Ranking semantic web data by tensor decomposition,’’ Proc. Semant. Web, 2009, pp. 213–228.
- L. Drumond, S. Rendle, and L. Schmidt-Thieme, ‘‘Predicting RDF triples in incomplete knowledge bases with tensor factorization,’’ in Proc. 27th Annu. ACM Symp. Appl. Comput., Riva del Garda, Italy, 2012, pp. 326–331.
- S. Rendle and L. Schmidt-Thieme, ‘‘Pairwise interaction tensor factorization for personalized tag recommendation,’’ in Proc. Third ACM Int. Conf. Web Search Data Mining, 2010, pp. 81–90.
- S. Rendle, ‘‘Scaling factorization machines to relational data,’’ in Proc. 39th Int. Conf. Very Large Data Bases, Trento, Italy, 2013, pp. 337–348.
- R. Jenatton, N. L. Roux, A. Bordes, and G. R. Obozinski, ‘‘A latent factor model for highly multi-relational data,’’ in Advances in Neural Information Processing Systems 25. Red Hook, NY, USA: Curran, 2012, pp. 3167–3175.
- P. Miettinen, ‘‘Boolean tensor factorizations,’’ in Proc. IEEE 11th Int. Conf. Data Mining, Dec. 2011, pp. 447–456.
- D. Erdos and P. Miettinen, ‘‘Discovering facts with Boolean tensor tucker decomposition,’’ in Proc. 22nd ACM Int. Conf. Inf. Knowl. Manage., New York, NY, USA, 2013, pp. 1569–1572.
- X. Jiang, V. Tresp, Y. Huang, and M. Nickel, ‘‘Link prediction in multi-relational graphs using additive models,’’ in Proc. Int. Workshop Semant. Technol. Recomm. Sys. Big Data ISWC, M. de Gemmis, T. D. Noia, P. Lops, T. Lukasiewicz, and G. Semeraro, Eds., 2012, vol. 919, pp. 1–12.
- S. Riedel, L. Yao, B. M. Marlin, and A. McCallum, ‘‘Relation extraction with matrix factorization and universal schemas,’’ Joint Human Language Technol. Conf./ Annu. Meet. North Amer. Chapter Assoc. Comput. Linguistics, Jun. 2013.
- V. Tresp, Y. Huang, M. Bundschus, and A. Rettinger, ‘‘Materializing and querying learned knowledge,’’ Proc. IRMLeS, vol. 2009, 2009.
- Y. Huang, V. Tresp, M. Nickel, A. Rettinger, and H.-P. Kriegel, ‘‘A scalable approach for statistical learning in semantic graphs,’’ Semant. Web J., 2013.
- P. Smolensky, ‘‘Tensor product variable binding and the representation of symbolic structures in connectionist systems,’’ Artif. Intell., vol. 46, no. 1, pp. 159–216, 1990.
- G. S. Halford, W. H. Wilson, and S. Phillips, ‘‘Processing capacity defined by relational complexity: Implications for comparative, developmental, cognitive psychology,’’ Behav. Brain Sci., vol. 21, no. 06, pp. 803–831, 1998.
- T. Plate, ‘‘A common framework for distributed representation schemes for compositional structure,’’ Connect. Syst. Knowl. Represent. Deduct., pp. 15–34, 1997.
- T. Mikolov, K. Chen, G. Corrado, and J. Dean, ‘‘Efficient estimation of word representations in vector space,’’ in Proc. Workshop ICLR, 2013.
- R. Socher, D. Chen, C. D. Manning, and A. Ng, ‘‘Reasoning with neural tensor networks for knowledge base completion,’’ in Advances in Neural Information Processing Systems 26. Red Hook, NY, USA: Curran, 2013, pp. 926–934.
- A. Bordes, J. Weston, R. Collobert, and Y. Bengio, ‘‘Learning structured embeddings of knowledge bases,’’ in Proc. 25th AAAI Conf. Artif. Intell., San Francisco, CA, USA, 2011.
- A. Bordes, N. Usunier, A. Garcia-Duran, J. Weston, and O. Yakhnenko, ‘‘Translating embeddings for modeling multi-relational data,’’ in Advances in Neural Information Processing Systems 26. Red Hook, NY, USA: Curran, 2013, pp. 2787–2795.
- B. Yang, W.-T. Yih, X. He, J. Gao, and L. Deng, ‘‘Embedding entities and relations for learning and inference in knowledge bases,’’ CoRR, vol. abs/1412 6575, 2014.
- P. D. Hoff, A. E. Raftery, and M. S. Handcock, ‘‘Latent space approaches to social network analysis,’’ J. Amer. Stat. Assoc., vol. 97, no. 460, pp. 1090–1098, 2002.
- L. Lü and T. Zhou, ‘‘Link prediction in complex networks: A survey,’’ Physica A, Stat. Mechan. Appl., vol. 390, no. 6, pp. 1150–1170, Mar. 2011.
- L. A. Adamic and E. Adar, ‘‘Friends and neighbors on the web,’’ Social Netw., vol. 25, no. 3, pp. 211–230, 2003.
- A.-L. Barabasi and R. Albert, ‘‘Emergence of scaling in random networks,’’ Science, vol. 286, no. 5439, pp. 509–512, 1999.
- L. Katz, ‘‘A new status index derived from sociometric analysis,’’ Psychometrika, vol. 18, no. 1, pp. 39–43, 1953.
- E. A. Leicht, P. Holme, and M. E. Newman, ‘‘Vertex similarity in networks,’’ Phys. Rev. E, vol. 73, no. 2, p. 026120, 2006.
- S. Brin and L. Page, ‘‘The anatomy of a large-scale hypertextual web search engine,’’ Comput. Netw. ISDN Syst., vol. 30, no. 1, pp. 107–117, 1998.
- W. Liu and L. Lu, ‘‘Link prediction based on local random walk,’’ Europhys. Lett., vol. 89, no. 5, p. 58007, 2010.
- S. Muggleton, ‘‘Inverse entailment and progol,’’ New Gen. Comput., vol. 13, no. 3/4, pp. 245–286, 1995.
- J. R. Quinlan, ‘‘Inductive logic programming,’’ New Gen. Comput., vol. 8, no. 4, pp. 295–318, 1991.
- J. R. Quinlan, ‘‘Learning logical definitions from relations,’’ Mach. Learn., vol. 5, pp. 239–266, 1990.
- L. A. Galarraga, C. Teflioudi, K. Hose, and F. Suchanek, ‘‘AMIE: Association rule mining under incomplete evidence in ontological knowledge bases,’’ in Proc. 22nd Int. Conf. World Wide Web, 2013, pp. 413–422.
- L. Galarraga, C. Teflioudi, K. Hose, and F. Suchanek, ‘‘Fast rule mining in ontological knowledge bases with AMIE+,’’ VLDB J., pp. 1–24, 2015.
- F. A. Lisi, ‘‘Inductive logic programming in databases: From datalog to dl+log,’’ TPLP, vol. 10, no. 3, pp. 331–359, 2010.
- C. d’Amato, N. Fanizzi, and F. Esposito, ‘‘Reasoning by analogy in description logics through instance-based learning,’’ in Proc. Semant. Web Appl. Perspect., 2006.
- J. Lehmann, ‘‘DL-learner: Learning concepts in description logics,’’ J. Mach. Learn. Res., vol. 10, pp. 2639–2642, 2009.
- A. Rettinger, U. Lösch, V. Tresp, C. d’Amato, and N. Fanizzi, ‘‘Mining the semantic web - Statistical learning for next generation knowledge bases,’’ Data Min. Knowl. Discov., vol. 24, no. 3, pp. 613–662, 2012.
- U. Lösch, S. Bloehdorn, and A. Rettinger, ‘‘Graph kernels for RDF data,’’ in Proc. 9th Int. Conf. Semant. Web: Res. Appl., 2012, pp. 134–148.
- P. Minervini, N. Fanizzi, and V. Tresp, ‘‘Learning to propagate knowledge in web ontologies,’’ in Proc. 10th Int. Workshop Uncertain. Reason. Semant. Web, 2014, pp. 13–24.
- N. Lao and W. W. Cohen, ‘‘Relational retrieval using a combination of pathconstrained random walks,’’ Mach. Learn., vol. 81, no. 1, pp. 53–67, 2010.
- N. Lao, T. Mitchell, and W. W. Cohen, ‘‘Random walk inference and learning in a large scale knowledge base,’’ in Proc. Conf. Empir. Meth. Nat. Lang. Process., 2011, pp. 529–539.
- K. Toutanova and D. Chen, ‘‘Observed versus latent features for knowledge base and text inference,’’ Proc. 3rd Workshop Continuous Vector Space Models Compositionality, 2015.
- M. Nickel, X. Jiang, and V. Tresp, ‘‘Reducing the rank in relational factorization models by including observable patterns,’’ in Advances in Neural Information Processing Systems 27. Red Hook, NY, USA: Curran, 2014, pp. 1179–1187.
- Y. Koren, ‘‘Factorization meets the neighborhood: A multifaceted collaborative filtering model,’’ in Proc. 14th ACM SIGKDD Int. Conf. Knowl. Discov. Data Mining, 2008, pp. 426–434.
- S. Rendle, ‘‘Factorization machines with libFM,’’ ACM Trans. Intell. Syst. Technol., vol. 3, no. 3, p. 57, 2012.
- D. H. Wolpert, ‘‘Stacked generalization,’’ Neural Netw., vol. 5, no. 2, pp. 241–259, 1992.
- L. Bottou, ‘‘Large-scale machine learning with stochastic gradient descent,’’ in Proc. COMPSTAT, 2010, pp. 177–186.
- K. P. Murphy, Machine Learning: A Probabilistic Perspective. Cambridge, MA, USA: MIT Press, 2012.
- J. Davis and M. Goadrich, ‘‘The relationship between precision-recall and ROC curves,’’ in Proc. 23rd Int. Conf. Mach. Learn., 2006, pp. 233–240, ACM.
- D. Koller and N. Friedman, Probabilistic Graphical Models: Principles and Techniques. Cambridge, MA, USA: MIT Press, 2009.
- M. Richardson and P. Domingos, ‘‘Markov logic networks,’’ Mach. Learn., vol. 62, no. 1, pp. 107–136, 2006.
- C. Zhang and C. Re, ‘‘Towards high-throughput Gibbs sampling at scale: A study across storage managers,’’ in Proc. ACM SIGMOD Int. Conf. Manage. Data, 2013, pp. 397–408.
- H. Poon and P. Domingos, ‘‘Sound and efficient inference with probabilistic and deterministic dependencies,’’ in Proc. AAAI, 2006.
- A. Globerson and T. S. Jaakkola, ‘‘Fixing max-product: Convergent message passing algorithms for MAP LP-relaxations,’’ in NIPS, 2007.
- S. H. Bach, M. Broecheler, B. Huang, and L. Getoor, ‘‘Hinge-loss Markov random fields and probabilistic soft logic,’’ arXiv:1505.04406, 2015, [cs.LG].
- A. Kimmig, S. H. Bach, M. Broecheler, B. Huang, and L. Getoor, ‘‘A short introduction to probabilistic soft logic,’’ in Proc. NIPS Workshop Probab. Programming: Found. Appl., 2012.
- J. Pujara, H. Miao, L. Getoor, and W. W. Cohen, ‘‘Using semantics and statistics to turn data into knowledge,’’ AI Mag., 2015.
- J. Neville and D. Jensen, ‘‘Relational dependency networks,’’ J. Mach. Learn. Res., vol. 8, pp. 637–652, May 2007.
- D. Krompaß, X. Jiang, M. Nickel, and V. Tresp, ‘‘Probabilistic latent-factor database models,’’ in Proc. 1st Workshop Linked Data Knowl. Discovery European Conf. Mach. Learn. Principles Practice Knowl. Discovery Databases, 2014.
- H. Li et al., ‘‘Improvement of n-ary relation extraction by adding lexical semantics to distant-supervision rule learning,’’ in Proc. 7th Int. Conf. Agents Artif. Intell., Lisbon, Portugal, pp. 317–324, Jan. 10–12, 2015.
- H. Ji, T. Cassidy, Q. Li, and S. Tamang, ‘‘Tackling representation, annotation and classification challenges for temporal knowledge base population,’’ Knowl. Inf. Syst., pp. 1–36, Aug. 2013.
- D. L. McGuinness and F. Van Harmelen, ‘‘OWL web ontology language overview,’’ W3C Recommend., vol. 10, no. 10, p. 2004, 2004.
- A. Hogan, A. Harth, A. Passant, S. Decker, and A. Polleres, ‘‘Weaving the pedantic web,’’ in Proc. 3rd Int. Workshop Linked Data Web (LDOW2010)/19th Int. World Wide Web Conf., Raleigh, NC, USA, 2010.
- H. Halpin, P. Hayes, J. McCusker, D. McGuinness, and H. Thompson, ‘‘When owl:sameAs isn’t the same: An analysis of identity in linked data,’’ Proc. Semant. Web, 2010, pp. 305–320.
- D. Krompaß, M. Nickel, and V. Tresp, ‘‘Querying factorized probabilistic triple databases,’’ Proc. Semant. Web, 2014, pp. 114–129.
- D. Suciu, D. Olteanu, C. Re, and C. Koch, Probabilistic Databases. San Raphael, CA, USA: Morgan and Claypool, 2011.
- D. Z. Wang, E. Michelakis, M. Garofalakis, and J. M. Hellerstein, ‘‘BayesStore: Managing large, uncertain data repositories with probabilistic graphical models,’’ Proc. VLDB Endow., vol. 1, no. 1, pp. 340–351, 2008.
- J. Bleiholder and F. Naumann, ‘‘Data fusion,’’ ACM Comput. Surv., vol. 41, no. 1, pp. 1:1–1:41, Jan. 2009.
- X. Li, X. L. Dong, K. Lyons, W. Meng, and D. Srivastava, ‘‘Truth finding on the deep web: Is the problem solved?’’ Proc. VLDB Endow., vol. 6, no. 2, pp. 97–108, Dec. 2012.
- X. L. Dong et al., ‘‘From data fusion to knowledge fusion,’’ Proc. VLDB Endow., vol. 7, no. 10, pp. 881–892, Jun. 2014.
- X. L. Dong et al., ‘‘Knowledge-based trust: Estimating the trustworthiness of web sources,’’ Proc. VLDB Endow., vol. 8, no. 9, pp. 938–949, May 2015.
- M. Nickel and V. Tresp, ‘‘Tensor factorization for multirelational learning,’’ in Machine Learning and Knowledge Discovery in Databases (Lecture Notes in Computer Science, vol. 8190). Berlin, Germany: Springer-Verlag, 2013, pp. 617–621.

Authors

- Maximilian Nickel received the Ph.D. degree (summa cum laude) from the Ludwig Maximilian University, Munich, Germany, in 2013.
- He is a Postdoctoral Fellow with the Laboratory for Computational and Statistical Learning, Massachusetts Institute of Technology (MIT), Cambridge, MA, USA, and the Istituto Italiano di Tecnologia, Genova, Italy. He is also with the Center for Brains, Minds, and Machines at MIT. From 2010 to 2013, he worked as a Research Assistant at Siemens Corporate Technology, Munich, Germany. His research centers around machine learning from relational knowledge representations and graph-structured data as well as its applications in artificial intelligence and cognitive science.
- Kevin P. Murphy is currently a Research Scientist at Google, Mountain View, CA, USA, where he works on AI, machine learning, computer vision, knowledge base construction, and natural language processing. Before joining Google in 2011, he was an Associate Professor of Computer Science and Statistics at the University of British Columbia (UBC), Vancouver, BC, Canada. Before starting at UBC in 2004, he was a Postdoctoral Researcher at the Massachusetts Institute of Technology (MIT), Cambridge, MA, USA. He has published over 80 papers in refereed conferences and journals, as well as the 1100-page textbook Machine Learning: A Probabilistic Perspective (Cambridge, MA, USA: MIT Press, 2012), which was awarded the 2013 DeGroot Prize for best book in the field of Statistical Science.
- Dr. Murphy is the Co-Editor-in-Chief of the Journal of Machine Learning Research.
- Volker Tresp received the Diploma degree from the University of Goettingen, Germany, in 1984 and the M.Sc. and Ph.D. degrees from Yale University, New Haven, CT, USA, in 1986 and 1989, respectively.
- Since 1989, he has been the head of various research teams in machine learning at Siemens, Research and Technology, Munich, Germany. He filed more than 70 patent applications and was inventor of the year of Siemens in 1996. He has published more than 100 scientific articles and administered over 20 Ph.D. dissertations. The company Panoratio is a spin-off out of his team. His research focus in recent years has been machine learning in information networks for modeling knowledge graphs, medical decision processes, and sensor networks. He is the coordinator of one of the first nationally funded big data projects for the realization of precision medicine. In 2011, he became a Honorary Professor at the Ludwig Maximilian University of Munich, Germany, where he teaches an annual course on machine learning.
- Evgeniy Gabrilovich is a Senior Staff Research Scientist at Google, Mountain View, CA, USA, where he works on knowledge discovery from the web. Prior to joining Google in 2012, he was a Director of Research and Head of the Natural Language Processing and Information Retrieval Group at Yahoo! Research.
- Dr. Gabrilovich is an ACM Distinguished Scientist, and is a recipient of the 2014 IJCAI-JAIR Best Paper Prize. He is also a recipient of the 2010 Karen Spärck Jones Award for his contributions to natural language processing and information retrieval.
