# A Degeneracy Framework for Graph Similarity

IJCAI, pp. 2595-2601, 2018.

Abstract:

The problem of accurately measuring the similarity between graphs is at the core of many applications in a variety of disciplines. Most existing methods for graph similarity focus either on local or on global properties of graphs. However, even if graphs seem very similar from a local or a global perspective, they may exhibit different st...

Introduction

- Graphs are well-studied structures which are utilized to model entities and their relationships.
- In recent years, graph classification has emerged as an important topic in many domains, such as computational biology [Scholkopf et al., 2004], chemistry [Mahe and Vert, 2009] and natural language processing [Nikolentzos et al., 2017a].
- Kernel functions do not require their inputs to be represented as fixed-length feature vectors, and they can be defined on structured data such as graphs, trees and strings.
- Kernel methods provide a flexible framework for performing graph classification.
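
To make the point about structured inputs concrete, here is a minimal sketch of a graph kernel that compares two graphs of different sizes without first forcing them into a common fixed-length vector. The degree-histogram kernel below is an illustrative choice and is not one of the kernels evaluated in the paper; the function names and the adjacency representation (a dict mapping each vertex to a set of neighbours) are assumptions made for this example.

```python
from collections import Counter


def degree_histogram(adj):
    """Count how many vertices have each degree.

    adj maps every vertex to the set of its neighbours (assumed representation).
    """
    return Counter(len(neighbours) for neighbours in adj.values())


def degree_kernel(adj_a, adj_b):
    """Dot product of the two degree histograms.

    This is a valid but very crude kernel, used here only for illustration.
    """
    hist_a = degree_histogram(adj_a)
    hist_b = degree_histogram(adj_b)
    return sum(count * hist_b[degree] for degree, count in hist_a.items())


# Two toy graphs of different sizes: a triangle and a path on four vertices.
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}

print(degree_kernel(triangle, triangle))  # 9
print(degree_kernel(triangle, path))      # 6
```

The resulting kernel values can be collected into a Gram matrix and passed to any kernel method, for instance a support vector machine.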

Highlights

- Graphs are well-studied structures which are utilized to model entities and their relationships
- We propose a new framework for graph similarity that is based on the concept of k-core, and we show how existing graph kernels can be plugged into the framework to produce more powerful kernels
- The results show that the hierarchy of nested subgraphs generated by the k-core decomposition allows existing algorithms to compare structure in graphs at multiple different scales
- The core Weisfeiler-Lehman subtree kernel generally yielded only slightly better accuracies than its base kernel
- For small values of the parameter h of Weisfeiler-Lehman subtree kernel, the summaries that are generated in a k-core are very similar to those generated in the whole graph and do not provide much additional information
- We defined a general framework for improving the performance of graph comparison algorithms
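
The idea of plugging a base kernel into the k-core hierarchy can be sketched as follows. This is one plausible reading of the framework, not the authors' implementation: it assumes the core variant sums the base kernel over the nested k-cores of both graphs, for every k up to the smaller of the two degeneracies. The names `core_numbers`, `k_core`, `core_kernel` and the toy `edge_count_kernel` are hypothetical.

```python
def core_numbers(adj):
    """Core number of every vertex, found by repeatedly peeling a minimum-degree vertex."""
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    core, k = {}, 0
    while remaining:
        v = min(remaining, key=degree.get)
        k = max(k, degree[v])      # the core level never decreases while peeling
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                degree[u] -= 1
    return core


def k_core(adj, core, k):
    """Subgraph induced by the vertices whose core number is at least k."""
    keep = {v for v, c in core.items() if c >= k}
    return {v: adj[v] & keep for v in keep}


def edge_count_kernel(adj_a, adj_b):
    """Toy base kernel (assumption): product of the two edge counts."""
    edges_a = sum(len(n) for n in adj_a.values()) // 2
    edges_b = sum(len(n) for n in adj_b.values()) // 2
    return edges_a * edges_b


def core_kernel(base_kernel, adj_a, adj_b):
    """Sum the base kernel over matching k-cores, k = 0 .. min degeneracy."""
    core_a, core_b = core_numbers(adj_a), core_numbers(adj_b)
    max_k = min(max(core_a.values(), default=0), max(core_b.values(), default=0))
    return sum(
        base_kernel(k_core(adj_a, core_a, k), k_core(adj_b, core_b, k))
        for k in range(max_k + 1)
    )


# A triangle with one pendant vertex, compared with a plain triangle.
tri_pendant = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}

print(core_numbers(tri_pendant))
print(core_kernel(edge_count_kernel, tri_pendant, triangle))  # 12 + 12 + 9 = 33
```

Because the 0-core is the whole graph, the first term of the sum is the base kernel itself; the remaining terms add comparisons restricted to progressively denser regions.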

Methods

| Dataset | GR | CORE GR | SP | CORE SP | WL | CORE WL | PM | CORE PM |
|---|---|---|---|---|---|---|---|---|
| MUTAG | 69.97 (± 2.22) | 82.34 (± 1.29) | 84.03 (± 1.49) | 88.29 (± 1.55) | 83.63 (± 1.57) | 87.47 (± 1.08) | 80.66 (± 0.90) | 87.19 (± 1.47) |
| ENZYMES | 33.08 (± 0.93) | 33.66 (± 0.65) | 40.75 (± 0.81) | 41.20 (± 1.21) | 51.56 (± 2.75) | 47.82 (± 4.62) | 42.17 (± 2.02) | 42.42 (± 1.06) |
| NCI1 | 65.47 (± 0.14) | 66.85 (± 0.20) | 72.85 (± 0.24) | 73.46 (± 0.32) | 84.42 (± 0.25) | 85.01 (± 0.19) | 72.27 (± 0.59) | 74.90 (± 0.45) |
| PTC-MR | ... | ... | ... | ... | ... | ... | ... | ... |
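
The accuracies listed above for MUTAG, ENZYMES and NCI1 can be turned into per-kernel gains with a few lines of arithmetic. The sketch below assumes the columns alternate between a base kernel and its core variant, in the order GR, SP, WL, PM; the variable and function names are illustrative. It covers only the three datasets whose values appear here.

```python
KERNELS = ["GR", "SP", "WL", "PM"]

# (base accuracy, core-variant accuracy) pairs, copied from the rows above.
ACCURACY = {
    "MUTAG":   [(69.97, 82.34), (84.03, 88.29), (83.63, 87.47), (80.66, 87.19)],
    "ENZYMES": [(33.08, 33.66), (40.75, 41.20), (51.56, 47.82), (42.17, 42.42)],
    "NCI1":    [(65.47, 66.85), (72.85, 73.46), (84.42, 85.01), (72.27, 74.90)],
}


def gains(accuracy):
    """Accuracy of the core variant minus accuracy of the base kernel, in percentage points."""
    return {
        dataset: {
            kernel: round(core - base, 2)
            for kernel, (base, core) in zip(KERNELS, pairs)
        }
        for dataset, pairs in accuracy.items()
    }


table = gains(ACCURACY)
wins = sum(delta > 0 for row in table.values() for delta in row.values())

print(table["MUTAG"]["GR"])    # 12.37
print(table["ENZYMES"]["WL"])  # -3.74
print(wins, "of", sum(len(row) for row in table.values()))  # 11 of 12
```

On these three rows the only loss is WL on ENZYMES, and the largest gain is GR on MUTAG.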

Results

- The authors begin the experiments by comparing the base kernels with their core variants.
- The core variants outperformed their base kernels on 37 out of the 40 experiments.
- The difference in performance between the core variants and their base kernels was larger on the social interaction datasets than on the bioinformatics and chemoinformatics datasets.
- Core GR improved the accuracy attained by the GR kernel by more than 10% on 4 datasets.
- For small values of the parameter h of WL, the summaries that are generated in a k-core are very similar to those generated in the whole graph and do not provide much additional information
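
The role of the parameter h can be illustrated with a minimal sketch of Weisfeiler-Lehman relabelling. It assumes the usual formulation, in which every vertex label is refined h times from the labels of its neighbours and the kernel is the dot product of the resulting label counts. Nested tuples stand in for the compressed labels of a production implementation, and the function names are hypothetical.

```python
from collections import Counter


def wl_features(adj, labels, h):
    """Count the vertex labels seen over h rounds of WL refinement (round 0 included)."""
    current = dict(labels)
    features = Counter(current.values())
    for _ in range(h):
        # New label = own label plus the sorted multiset of neighbour labels.
        current = {
            v: (current[v], tuple(sorted((current[u] for u in adj[v]), key=repr)))
            for v in adj
        }
        features.update(current.values())
    return features


def wl_kernel(adj_a, labels_a, adj_b, labels_b, h):
    """Dot product of the two WL label-count vectors."""
    feat_a = wl_features(adj_a, labels_a, h)
    feat_b = wl_features(adj_b, labels_b, h)
    return sum(count * feat_b[label] for label, count in feat_a.items())


# Unlabelled toy graphs: every vertex starts with the same label "x".
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
path3 = {0: {1}, 1: {0, 2}, 2: {1}}
uniform = {0: "x", 1: "x", 2: "x"}

print(wl_kernel(triangle, uniform, path3, uniform, h=0))  # 9
print(wl_kernel(triangle, uniform, path3, uniform, h=1))  # 12
```

After h rounds a label summarises only the neighbourhood within h hops of its vertex, so for small h the summaries are very local. This is consistent with the observation above that they change little when computed inside a k-core.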

Conclusion

- The authors defined a general framework for improving the performance of graph comparison algorithms.
- The experiments show that the core variants are more accurate than their base kernels, at the cost of only a slight increase in computation time


- Table 1: Classification accuracy (± standard deviation) of the graphlet kernel (GR), shortest path kernel (SP), Weisfeiler-Lehman subtree kernel (WL), pyramid match kernel (PM) and their core variants on the 10 graph classification datasets. Core variants with statistically significant improvements over the base kernels are shown in bold, as measured by a t-test with a p-value ≤ 0.05
- Table 2: Comparison of running times of base kernels vs their core variants. The values indicate the relative increase in running time when compared to the corresponding base kernel
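
A relative running-time figure of the kind described for Table 2 could be measured along the following lines. This sketch assumes the reported value is the ratio of the core variant's running time to that of the base kernel; the caption does not spell out the exact definition. The two workloads are hypothetical stand-ins, not the kernels from the paper.

```python
import time


def best_time(fn, repeats=5):
    """Smallest wall-clock time, in seconds, over several runs of fn()."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def running_time_ratio(base_seconds, core_seconds):
    """Running time of the core variant relative to its base kernel (assumed definition)."""
    return core_seconds / base_seconds


# Hypothetical stand-ins for a base kernel and its more expensive core variant.
def base_workload():
    return sum(range(200_000))


def core_workload():
    return sum(range(200_000)) + sum(range(100_000))


measured = running_time_ratio(best_time(base_workload), best_time(core_workload))
print(f"{measured:.2f}x")
```

Taking the best of several runs reduces the influence of background noise on short measurements.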

Funding

- Giannis Nikolentzos is supported by the project “ESIGMA” (ANR-17-CE40-0028)

Reference

- [Alvarez-Hamelin et al., 2006] I. Alvarez-Hamelin et al. ... Processing Systems, 18:41–50, 2006.
- [Batagelj and Zaversnik, 2011] V. Batagelj and M. Zaversnik. Fast algorithms for determining (generalized) core groups in social networks. Advances in Data Analysis and Classification, 5(2):129–145, 2011.
- [Borgwardt and Kriegel, 2005] K.M. Borgwardt and H. Kriegel. Shortest-path kernels on graphs. In Proceedings of the 5th International Conference on Data Mining, pages 74–81, 2005.
- [Conte et al., 2004] D. Conte, P. Foggia, C. Sansone, and M. Vento. Thirty years of graph matching in pattern recognition. International Journal of Pattern Recognition and Artificial Intelligence, 18(03):265–298, 2004.
- [Dai et al., 2016] H. Dai, B. Dai, and L. Song. Discriminative Embeddings of Latent Variable Models for Structured Data. In Proceedings of The 33rd International Conference on Machine Learning, pages 2702–2711, 2016.
- [Erdos and Hajnal, 1966] P. Erdos and A. Hajnal. On chromatic number of graphs and set-systems. Acta Mathematica Hungarica, 17(1-2):61–99, 1966.
- [Gartner et al., 2003] T. Gartner, P. Flach, and S. Wrobel. On Graph Kernels: Hardness Results and Efficient Alternatives. In Learning Theory and Kernel Machines, pages 129–143. 2003.
- [Giatsidis et al., 2014] C. Giatsidis, F. Malliaros, D. Thilikos, and M. Vazirgiannis. CORECLUSTER: A Degeneracy Based Graph Clustering Framework. In Proceedings of the 28th AAAI Conference on Artificial Intelligence, pages 44–50, 2014.
- [Haussler, 1999] D. Haussler. Convolution kernels on discrete structures. Technical Report, 1999.
- [Horvath et al., 2004] T. Horvath, T. Gartner, and S. Wrobel. Cyclic Pattern Kernels for Predictive Graph Mining. In Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 158–167, 2004.
- [Johansson and Dubhashi, 2015] F. Johansson and D. Dubhashi. Learning with Similarity Functions on Graphs using Matchings of Geometric Embeddings. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 467–476, 2015.
- [Johansson et al., 2014] F. Johansson, V. Jethava, D. Dubhashi, and C. Bhattacharyya. Global graph kernels using geometric embeddings. In Proceedings of the 31st International Conference on Machine Learning, pages 694–702, 2014.
- [Kondor and Pan, 2016] R. Kondor and H. Pan. The Multiscale Laplacian Graph Kernel. In Advances in Neural Information Processing Systems, pages 2982–2990, 2016.
- [Kriege and Mutzel, 2012] N. Kriege and P. Mutzel. Subgraph Matching Kernels for Attributed Graphs. In Proceedings of the 29th International Conference on Machine Learning, pages 1015–1022, 2012.
- [Lee et al., 2010] V. Lee, N. Ruan, R. Jin, and C. Aggarwal. A survey of algorithms for dense subgraph discovery. In Managing and Mining Graph Data, pages 303–336. 2010.
- [Lick and White, 1970] D. Lick and A. White. k-degenerate graphs. Canadian J. of Mathematics, 22:1082–1096, 1970.
- [Mahe and Vert, 2009] P. Mahe and J. Vert. Graph kernels based on tree patterns for molecules. Machine Learning, 75(1):3–35, 2009.
- [Matula and Beck, 1983] D. Matula and L. Beck. Smallest-last Ordering and Clustering and Graph Coloring Algorithms. Journal of the ACM, 30(3):417–427, 1983.
- [Neumann et al., 2016] M. Neumann, R. Garnett, C. Bauckhage, and K. Kersting. Propagation kernels: efficient graph kernels from propagated information. Machine Learning, 102(2):209–245, 2016.
- [Nikolentzos et al., 2017a] G. Nikolentzos, P. Meladianos, F. Rousseau, Y. Stavrakas, and M. Vazirgiannis. Shortest-Path Graph Kernels for Document Similarity. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 1890–1900, 2017.
- [Nikolentzos et al., 2017b] G. Nikolentzos, P. Meladianos, and M. Vazirgiannis. Matching Node Embeddings for Graph Similarity. In Proceedings of the 31st AAAI Conference on Artificial Intelligence, pages 2429–2435, 2017.
- [Scholkopf et al., 2004] B. Scholkopf, K. Tsuda, and J.P. Vert. Kernel Methods in Computational Biology. MIT press, 2004.
- [Seidman, 1983] S. Seidman. Network Structure and Minimum Degree. Social networks, 5(3):269–287, 1983.
- [Shervashidze et al., 2009] N. Shervashidze, T. Petri, K. Mehlhorn, K.M. Borgwardt, and S.V.N. Vishwanathan. Efficient Graphlet Kernels for Large Graph Comparison. In Proceedings of the International Conference on Artificial Intelligence and Statistics, pages 488–495, 2009.
- [Shervashidze et al., 2011] N. Shervashidze, P. Schweitzer, E. J. Van Leeuwen, K. Mehlhorn, and K.M. Borgwardt. Weisfeiler-Lehman Graph Kernels. The Journal of Machine Learning Research, 12:2539–2561, 2011.
- [Smola and Scholkopf, 1998] A. Smola and B. Scholkopf. Learning with kernels. Forschungszentrum Informationstechnik, 1998.
- [Sugiyama and Borgwardt, 2015] M. Sugiyama and K.M. Borgwardt. Halting in random walk kernels. In Advances in Neural Information Processing Systems, pages 1639–1647, 2015.
- [Vishwanathan et al., 2010] S.V.N. Vishwanathan et al. Graph Kernels. The Journal of Machine Learning Research, 11:1201–1242, 2010.
- [Wuchty and Almaas, 2005] S. Wuchty and E. Almaas. Peeling the yeast protein network. Proteomics, 5(2):444–449, 2005.
- [Yanardag and Vishwanathan, 2015] P. Yanardag and S.V.N. Vishwanathan. Deep Graph Kernels. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1365–1374, 2015.

Best Paper

Best Paper of IJCAI, 2018
