A Degeneracy Framework for Graph Similarity

IJCAI, pp. 2595-2601, 2018.

Keywords: graph kernel, pyramid match graph kernel, graph classification, kernel function, Weisfeiler-Lehman

Abstract:

The problem of accurately measuring the similarity between graphs is at the core of many applications in a variety of disciplines. Most existing methods for graph similarity focus either on local or on global properties of graphs. However, even if graphs seem very similar from a local or a global perspective, they may exhibit different st...

Introduction
  • Graphs are well-studied structures used to model entities and their relationships.
  • In recent years, graph classification has become an important topic in many domains, such as computational biology [Scholkopf et al., 2004], chemistry [Mahé and Vert, 2009] and natural language processing [Nikolentzos et al., 2017a].
  • Kernel functions do not require their inputs to be represented as fixed-length feature vectors, and they can be defined on structured data such as graphs, trees and strings.
  • Kernel methods therefore provide a flexible framework for performing graph classification.
Highlights
  • Graphs are well-studied structures which are utilized to model entities and their relationships
  • We propose a new framework for graph similarity that is based on the concept of k-core, and we show how existing graph kernels can be plugged into the framework to produce more powerful kernels
  • The results show that the hierarchy of nested subgraphs generated by the k-core decomposition allows existing algorithms to compare structure in graphs at multiple different scales
  • The core Weisfeiler-Lehman subtree kernel generally yielded only slightly better accuracies than its base kernel
  • For small values of the parameter h of Weisfeiler-Lehman subtree kernel, the summaries that are generated in a k-core are very similar to those generated in the whole graph and do not provide much additional information
  • We defined a general framework for improving the performance of graph comparison algorithms
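
The k-core idea in these highlights can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes that the core variant of a kernel sums the base kernel over matching k-cores of the two graphs, for k from 0 up to the smaller of the two degeneracies. Graphs are plain adjacency dictionaries, and `degree_histogram_kernel` is a hypothetical toy base kernel chosen only to keep the example self-contained; all function names are illustrative.

```python
from collections import Counter


def core_numbers(adj):
    """Core number of every vertex, found by repeatedly peeling a minimum-degree vertex."""
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        v = min(remaining, key=degree.__getitem__)
        k = max(k, degree[v])  # core numbers never decrease along the peeling order
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                degree[u] -= 1
    return core


def k_core(adj, k, core):
    """Subgraph induced by the vertices whose core number is at least k."""
    keep = {v for v in adj if core[v] >= k}
    return {v: [u for u in adj[v] if u in keep] for v in keep}


def degree_histogram_kernel(adj1, adj2):
    """Toy base kernel (an assumption for this sketch): dot product of degree histograms."""
    h1 = Counter(len(nbrs) for nbrs in adj1.values())
    h2 = Counter(len(nbrs) for nbrs in adj2.values())
    return sum(count * h2[d] for d, count in h1.items())


def core_kernel(adj1, adj2, base_kernel):
    """Sum the base kernel over matching k-cores, up to the smaller degeneracy."""
    c1, c2 = core_numbers(adj1), core_numbers(adj2)
    k_max = min(max(c1.values(), default=0), max(c2.values(), default=0))
    return sum(
        base_kernel(k_core(adj1, k, c1), k_core(adj2, k, c2))
        for k in range(k_max + 1)
    )


# A triangle with one pendant vertex, and a path on three vertices.
triangle_with_tail = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
path = {0: [1], 1: [0, 2], 2: [1]}

if __name__ == "__main__":
    print(core_numbers(triangle_with_tail))
    print(core_kernel(triangle_with_tail, path, degree_histogram_kernel))
```

Because the 0-core is the whole graph, the first term of the sum is the base kernel itself; the remaining terms compare progressively denser nested subgraphs.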
Methods
  • Classification accuracy (± standard deviation) of the base kernels and their core variants:

    Kernel     MUTAG            ENZYMES          NCI1             PTC-MR
    GR         69.97 (± 2.22)   33.08 (± 0.93)   65.47 (± 0.14)   ...
    CORE GR    82.34 (± 1.29)   33.66 (± 0.65)   66.85 (± 0.20)   ...
    SP         84.03 (± 1.49)   40.75 (± 0.81)   72.85 (± 0.24)   ...
    CORE SP    88.29 (± 1.55)   41.20 (± 1.21)   73.46 (± 0.32)   ...
    WL         83.63 (± 1.57)   51.56 (± 2.75)   84.42 (± 0.25)   ...
    CORE WL    87.47 (± 1.08)   47.82 (± 4.62)   85.01 (± 0.19)   ...
    PM         80.66 (± 0.90)   42.17 (± 2.02)   72.27 (± 0.59)   ...
    CORE PM    87.19 (± 1.47)   42.42 (± 1.06)   74.90 (± 0.45)   ...
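
The gain of each core variant over its base kernel follows directly from the accuracies listed above. The short script below is a sketch that copies only the MUTAG, ENZYMES and NCI1 means from this page; the dictionary layout and the helper names `gain` and `improved_pairs` are illustrative, and gains are absolute differences in percentage points.

```python
# Mean accuracies copied from the figures above, as (base kernel, core variant).
ACCURACY = {
    "MUTAG": {"GR": (69.97, 82.34), "SP": (84.03, 88.29),
              "WL": (83.63, 87.47), "PM": (80.66, 87.19)},
    "ENZYMES": {"GR": (33.08, 33.66), "SP": (40.75, 41.20),
                "WL": (51.56, 47.82), "PM": (42.17, 42.42)},
    "NCI1": {"GR": (65.47, 66.85), "SP": (72.85, 73.46),
             "WL": (84.42, 85.01), "PM": (72.27, 74.90)},
}


def gain(dataset, kernel):
    """Accuracy of the core variant minus accuracy of the base kernel, in points."""
    base, core = ACCURACY[dataset][kernel]
    return round(core - base, 2)


def improved_pairs():
    """All (dataset, kernel) pairs where the core variant has the higher mean accuracy."""
    return [(d, k) for d, kernels in ACCURACY.items() for k in kernels if gain(d, k) > 0]


if __name__ == "__main__":
    for dataset, kernels in ACCURACY.items():
        for kernel in kernels:
            print(f"{dataset:8s} {kernel:3s} {gain(dataset, kernel):+.2f}")
```

On these three datasets the only pair where the core variant has the lower mean accuracy is WL on ENZYMES.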
Results
  • The authors begin the experiments by comparing the base kernels with their core variants.
  • The core variants outperformed their base kernels on 37 out of the 40 experiments.
  • The difference in performance between the core variants and their base kernels was larger on the social interaction datasets than on the bioinformatics and chemoinformatics datasets.
  • Core GR improved the accuracy of the GR kernel by more than 10% on 4 datasets.
  • For small values of the parameter h of WL, the summaries that are generated in a k-core are very similar to those generated in the whole graph and do not provide much additional information
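
The parameter h is the number of label-refinement iterations of the Weisfeiler-Lehman subtree kernel. As a rough illustration of what h controls, the sketch below makes several simplifying assumptions: graphs are unlabeled adjacency dictionaries, every vertex starts with the same label, and refined labels are kept as nested tuples rather than the compressed integer labels a practical implementation would use. The function names are illustrative.

```python
from collections import Counter


def wl_features(adj, h, initial_label=0):
    """Count the vertex labels seen in iterations 0..h of Weisfeiler-Lehman refinement."""
    labels = {v: initial_label for v in adj}
    features = Counter(labels.values())
    for _ in range(h):
        # New label = own label plus the sorted multiset of neighbour labels.
        labels = {
            v: (labels[v], tuple(sorted((labels[u] for u in adj[v]), key=repr)))
            for v in adj
        }
        features.update(labels.values())
    return features


def wl_kernel(adj1, adj2, h):
    """Dot product of the label-count vectors of the two graphs."""
    f1, f2 = wl_features(adj1, h), wl_features(adj2, h)
    return sum(count * f2[label] for label, count in f1.items())


triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
path = {0: [1], 1: [0, 2], 2: [1]}

if __name__ == "__main__":
    for h in (0, 1):
        print(h, wl_kernel(triangle, path, h))
```

With h = 0 only the initial labels are compared; each additional iteration adds labels that summarize a one-hop larger neighbourhood, so a small h captures only very local structure.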
Conclusion
  • The authors defined a general framework for improving the performance of graph comparison algorithms.
  • The experiments show that the core variants are more accurate than their base kernels, at the cost of only a slight increase in computation time.
Tables
  • Table 1: Classification accuracy (± standard deviation) of the graphlet kernel (GR), shortest path kernel (SP), Weisfeiler-Lehman subtree kernel (WL), pyramid match kernel (PM) and their core variants on the 10 graph classification datasets. Core variants with statistically significant improvements over the base kernels, as measured by a t-test with a p-value ≤ 0.05, are shown in bold.
  • Table 2: Comparison of running times of base kernels vs. their core variants. The values indicate the relative increase in running time compared to the corresponding base kernel.
Funding
  • Giannis Nikolentzos is supported by the project “ESIGMA” (ANR-17-CE40-0028)
Reference
  • [Alvarez-Hamelin et al., 2006] I. ... Processing Systems, 18:41–50, 2006.
  • [Batagelj and Zaversnik, 2011] V. Batagelj and M. Zaversnik. Fast algorithms for determining (generalized) core groups in social networks. Advances in Data Analysis and Classification, 5(2):129–145, 2011.
  • [Borgwardt and Kriegel, 2005] K.M. Borgwardt and H. Kriegel. Shortest-path kernels on graphs. In Proceedings of the 5th International Conference on Data Mining, pages 74–81, 2005.
  • [Conte et al., 2004] D. Conte, P. Foggia, C. Sansone, and M. Vento. Thirty years of graph matching in pattern recognition. International Journal of Pattern Recognition and Artificial Intelligence, 18(03):265–298, 2004.
  • [Dai et al., 2016] H. Dai, B. Dai, and L. Song. Discriminative Embeddings of Latent Variable Models for Structured Data. In Proceedings of The 33rd International Conference on Machine Learning, pages 2702–2711, 2016.
  • [Erdos and Hajnal, 1966] P. Erdos and A. Hajnal. On chromatic number of graphs and set-systems. Acta Mathematica Hungarica, 17(1-2):61–99, 1966.
  • [Gartner et al., 2003] T. Gartner, P. Flach, and S. Wrobel. On Graph Kernels: Hardness Results and Efficient Alternatives. In Learning Theory and Kernel Machines, pages 129–143. 2003.
  • [Giatsidis et al., 2014] C. Giatsidis, F. Malliaros, D. Thilikos, and M. Vazirgiannis. CORECLUSTER: A Degeneracy Based Graph Clustering Framework. In Proceedings of the 28th AAAI Conference on Artificial Intelligence, pages 44–50, 2014.
  • [Haussler, 1999] D. Haussler. Convolution kernels on discrete structures. Technical Report, 1999.
  • [Horvath et al., 2004] T. Horvath, T. Gartner, and S. Wrobel. Cyclic Pattern Kernels for Predictive Graph Mining. In Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 158–167, 2004.
  • [Johansson and Dubhashi, 2015] F. Johansson and D. Dubhashi. Learning with Similarity Functions on Graphs using Matchings of Geometric Embeddings. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 467–476, 2015.
  • [Johansson et al., 2014] F. Johansson, V. Jethava, D. Dubhashi, and C. Bhattacharyya. Global graph kernels using geometric embeddings. In Proceedings of the 31st International Conference on Machine Learning, pages 694–702, 2014.
  • [Kondor and Pan, 2016] R. Kondor and H. Pan. The Multiscale Laplacian Graph Kernel. In Advances in Neural Information Processing Systems, pages 2982–2990, 2016.
  • [Kriege and Mutzel, 2012] N. Kriege and P. Mutzel. Subgraph Matching Kernels for Attributed Graphs. In Proceedings of the 29th International Conference on Machine Learning, pages 1015–1022, 2012.
  • [Lee et al., 2010] V. Lee, N. Ruan, R. Jin, and C. Aggarwal. A survey of algorithms for dense subgraph discovery. In Managing and Mining Graph Data, pages 303–336. 2010.
  • [Lick and White, 1970] D. Lick and A. White. k-degenerate graphs. Canadian J. of Mathematics, 22:1082–1096, 1970.
  • [Mahé and Vert, 2009] P. Mahé and J. Vert. Graph kernels based on tree patterns for molecules. Machine Learning, 75(1):3–35, 2009.
  • [Matula and Beck, 1983] D. Matula and L. Beck. Smallest-last Ordering and Clustering and Graph Coloring Algorithms. Journal of the ACM, 30(3):417–427, 1983.
  • [Neumann et al., 2016] M. Neumann, R. Garnett, C. Bauckhage, and K. Kersting. Propagation kernels: efficient graph kernels from propagated information. Machine Learning, 102(2):209–245, 2016.
  • [Nikolentzos et al., 2017a] G. Nikolentzos, P. Meladianos, F. Rousseau, Y. Stavrakas, and M. Vazirgiannis. Shortest-Path Graph Kernels for Document Similarity. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 1890–1900, 2017.
  • [Nikolentzos et al., 2017b] G. Nikolentzos, P. Meladianos, and M. Vazirgiannis. Matching Node Embeddings for Graph Similarity. In Proceedings of the 31st AAAI Conference in Artificial Intelligence, pages 2429–2435, 2017.
  • [Scholkopf et al., 2004] B. Scholkopf, K. Tsuda, and J.P. Vert. Kernel Methods in Computational Biology. MIT press, 2004.
  • [Seidman, 1983] S. Seidman. Network Structure and Minimum Degree. Social networks, 5(3):269–287, 1983.
  • [Shervashidze et al., 2009] N. Shervashidze, T. Petri, K. Mehlhorn, K.M. Borgwardt, and S.V.N. Vishwanathan. Efficient Graphlet Kernels for Large Graph Comparison. In Proceedings of the International Conference on Artificial Intelligence and Statistics, pages 488–495, 2009.
  • [Shervashidze et al., 2011] N. Shervashidze, P. Schweitzer, E. J. Van Leeuwen, K. Mehlhorn, and K.M. Borgwardt. Weisfeiler-Lehman Graph Kernels. The Journal of Machine Learning Research, 12:2539–2561, 2011.
  • [Smola and Scholkopf, 1998] A. Smola and B. Scholkopf. Learning with kernels. Forschungszentrum Informationstechnik, 1998.
  • [Sugiyama and Borgwardt, 2015] M. Sugiyama and K.M. Borgwardt. Halting in random walk kernels. In Advances in Neural Information Processing Systems, pages 1639–1647, 2015.
  • [Vishwanathan et al., 2010] S.V.N. ... Graph Kernels. The Journal of Machine Learning Research, 11:1201–1242, 2010.
  • [Wuchty and Almaas, 2005] S. Wuchty and E. Almaas. Peeling the yeast protein network. Proteomics, 5(2):444–449, 2005.
  • [Yanardag and Vishwanathan, 2015] P. Yanardag and S.V.N. Vishwanathan. Deep Graph Kernels. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1365–1374, 2015.
Best Paper
Best Paper of IJCAI, 2018