# Can graph neural networks count substructures?

NeurIPS 2020.

Keywords:

Weisfeiler-Lehman, mean squared error, graph isomorphism testing, Folklore Weisfeiler-Lehman, Invariant Graph Networks

Abstract:

The ability to detect and count certain substructures in graphs is important for solving many tasks on graph-structured data, especially in the contexts of computational chemistry and biology as well as social network analysis. Inspired by this, we propose to study the expressive power of graph neural networks (GNNs) via their ability to […]

Introduction

- Graph neural networks (GNNs) have achieved empirical success on processing data from various fields such as social networks, quantum chemistry, particle physics, knowledge graphs and combinatorial optimization (Scarselli et al, 2008; Bruna et al, 2013; Duvenaud et al, 2015; Kipf and Welling, 2016; Defferrard et al, 2016; Bronstein et al, 2017; Dai et al, 2017; Nowak et al, 2017; Ying et al, 2018; Zhou et al, 2018; Choma et al, 2018; Zhang and Chen, 2018; You et al, 2018a,b, 2019; Yao et al, 2019; Ding et al, 2019; Stokes et al, 2020).
- From the viewpoint of graph isomorphism testing, existing GNNs are in some sense already not far from being maximally powerful, which could make the pursuit of more powerful GNNs appear unnecessary.

Highlights

- In recent years, graph neural networks (GNNs) have achieved empirical success on processing data from various fields such as social networks, quantum chemistry, particle physics, knowledge graphs and combinatorial optimization (Scarselli et al, 2008; Bruna et al, 2013; Duvenaud et al, 2015; Kipf and Welling, 2016; Defferrard et al, 2016; Bronstein et al, 2017; Dai et al, 2017; Nowak et al, 2017; Ying et al, 2018; Zhou et al, 2018; Choma et al, 2018; Zhang and Chen, 2018; You et al, 2018a,b, 2019; Yao et al, 2019; Ding et al, 2019; Stokes et al, 2020)
- Instead of performing iterative equivariant aggregations of information as is done in Message Passing Neural Networks (MPNNs) and Invariant Graph Networks (IGNs), we propose a type of locally powerful models based on the observation that substructures present themselves in local neighborhoods known as egonets
- GIN, 2-Invariant Graph Networks (2-IGNs) and spectral GNN (sGNN) produce much smaller test error than the variance of the ground truth counts for the 3-star tasks, consistent with their theoretical power to perform containment-count of stars
- We propose a theoretical framework to study the expressive power of classes of GNNs based on their ability to count substructures
- We prove that neither MPNNs nor 2-IGNs can matching-count any connected structure with 3 or more nodes; k-Invariant Graph Networks (k-IGNs) and k-WL can containment-count and matching-count any pattern of size k
- We build the foundation for using substructure counting as an intuitive and relevant measure of the expressive power of GNNs, and our concrete results for existing GNNs motivate the search for more powerful designs of GNNs
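The egonet observation above can be made concrete with a short sketch: a pattern of radius r around a node lies entirely inside that node's depth-r egonet, so a model that processes egonets locally can in principle see the whole pattern. This is an illustrative helper, not the authors' LRP implementation; the function name and edge-list representation are our own.

```python
def egonet(edges, center, depth=1):
    """Return the depth-d egonet of `center`: all nodes within graph
    distance d of the center, plus the edges among them."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    nodes, frontier = {center}, {center}
    for _ in range(depth):  # one BFS layer per unit of depth
        frontier = {w for u in frontier for w in adj.get(u, ())} - nodes
        nodes |= frontier
    return nodes, {(u, v) for (u, v) in edges if u in nodes and v in nodes}
```

For example, on the path 0-1-2-3, the depth-1 egonet of node 1 contains nodes {0, 1, 2} and the two edges among them; any radius-1 pattern involving node 1 is visible there.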

Methods

- The authors verify the theoretical results on two graph-level regression tasks: matching-counting triangles and containment-counting 3-stars, with both patterns unattributed, as illustrated in Figure 3.
- By Theorem 2 and Corollary 1, MPNNs and 2-IGNs can perform matching-count of triangles.
- Note that since a triangle is a clique, its matching-count and containment-count are equal.
- The authors generate the ground-truth counts of triangles in each graph with the counting algorithm proposed by Shervashidze et al (2009).
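The distinction between matching-count (induced subgraphs) and containment-count (not-necessarily-induced subgraphs) can be illustrated with a brute-force sketch. These helpers are for illustration only; they are not the Shervashidze et al (2009) algorithm, which is far more efficient.

```python
from itertools import combinations
from math import comb

def induced_edges(edges, nodes):
    """Edges whose endpoints both lie in the given node subset."""
    s = set(nodes)
    return {(u, v) for (u, v) in edges if u in s and v in s}

def matching_count_triangles(edges, n):
    """Induced-subgraph (matching) count of triangles; equals the
    containment-count here because a triangle is a clique."""
    return sum(1 for trio in combinations(range(n), 3)
               if len(induced_edges(edges, trio)) == 3)

def containment_count_3stars(edges, n):
    """Containment-count of 3-stars: choose a center, then any 3 neighbors."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return sum(comb(d, 3) for d in deg)
```

On the complete graph K4, both functions return 4: every node is the center of one (non-induced) 3-star, while the matching-count of 3-stars would be 0, since the leaves of an induced star must be pairwise non-adjacent.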

Results

- The results on the two tasks are shown in Table 1, measured by the MSE on the test set divided by the variance of the ground truth counts of the pattern computed over all graphs in the dataset.
- The almost-negligible errors of LRP on all the tasks support the theory that depth-1 LRP is powerful enough for counting triangles and 3-stars, both of which are patterns with radius 1.
- GIN, 2-IGN and sGNN produce much smaller test error than the variance of the ground truth counts for the 3-star tasks, consistent with their theoretical power to perform containment-count of stars.
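The evaluation metric described above can be sketched in a few lines; the function name is illustrative, and raw (unnormalized) pattern counts are assumed as regression targets.

```python
import statistics

def normalized_test_mse(y_pred, y_true, all_counts):
    """Test MSE divided by the variance of the ground-truth counts
    computed over all graphs in the dataset (the Table 1 metric)."""
    mse = sum((p - t) ** 2 for p, t in zip(y_pred, y_true)) / len(y_true)
    return mse / statistics.pvariance(all_counts)
```

A value near 1 means the model does no better than always predicting the dataset mean, while a value near 0 indicates the pattern is effectively being counted.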

Conclusion

- The authors propose a theoretical framework to study the expressive power of classes of GNNs based on their ability to count substructures.
- The authors provide an upper bound on the size of “path-shaped” substructures that finite iterations of k-WL can matching-count.
- To establish these results, the authors prove an equivalence between approximating graph functions and discriminating graphs.
- The authors build the foundation for using substructure counting as an intuitive and relevant measure of the expressive power of GNNs, and the concrete results for existing GNNs motivate the search for more powerful designs of GNNs

Summary

## Introduction:

Graph neural networks (GNNs) have achieved empirical success on processing data from various fields such as social networks, quantum chemistry, particle physics, knowledge graphs and combinatorial optimization (Scarselli et al, 2008; Bruna et al, 2013; Duvenaud et al, 2015; Kipf and Welling, 2016; Defferrard et al, 2016; Bronstein et al, 2017; Dai et al, 2017; Nowak et al, 2017; Ying et al, 2018; Zhou et al, 2018; Choma et al, 2018; Zhang and Chen, 2018; You et al, 2018a,b, 2019; Yao et al, 2019; Ding et al, 2019; Stokes et al, 2020).
- From the viewpoint of graph isomorphism testing, existing GNNs are in some sense already not far from being maximally powerful, which could make the pursuit of more powerful GNNs appear unnecessary.
## Objectives:

The general case can be proved in the same way but with more subscripts.
- (In particular, for the counterexamples, (69) can be shown to hold for each of the d0 feature dimensions.) Define a set S = {(1, 2), (2, 1), (1 + m, 2 + m), (2 + m, 1 + m), (1, 2 + m), (2 + m, 1), (1 + m, 2), (2, 1 + m)}, which represents the “special” edges that capture the difference between G[1] and G[2].
## Methods:

The authors verify the theoretical results on two graph-level regression tasks: matching-counting triangles and containment-counting 3-stars, with both patterns unattributed, as illustrated in Figure 3.
- By Theorem 2 and Corollary 1, MPNNs and 2-IGNs can perform matching-count of triangles.
- Note that since a triangle is a clique, its matching-count and containment-count are equal.
- The authors generate the ground-truth counts of triangles in each graph with the counting algorithm proposed by Shervashidze et al (2009).
## Results:

The results on the two tasks are shown in Table 1, measured by the MSE on the test set divided by the variance of the ground truth counts of the pattern computed over all graphs in the dataset.
- The almost-negligible errors of LRP on all the tasks support the theory that depth-1 LRP is powerful enough for counting triangles and 3-stars, both of which are patterns with radius 1.
- GIN, 2-IGN and sGNN produce much smaller test error than the variance of the ground truth counts for the 3-star tasks, consistent with their theoretical power to perform containment-count of stars.
## Conclusion:

The authors propose a theoretical framework to study the expressive power of classes of GNNs based on their ability to count substructures.
- The authors provide an upper bound on the size of “path-shaped” substructures that finite iterations of k-WL can matching-count.
- To establish these results, the authors prove an equivalence between approximating graph functions and discriminating graphs.
- The authors build the foundation for using substructure counting as an intuitive and relevant measure of the expressive power of GNNs, and the concrete results for existing GNNs motivate the search for more powerful designs of GNNs

- Table 1: Performance of different GNNs on matching-counting triangles and containment-counting 3-stars on the two datasets, measured by test MSE divided by variance of the ground truth counts. Shown here are the best and the median performances of each model over five runs. Note that we select the best out of four variants for each of GCN, GIN and sGNN, and the better out of two variants for 2-IGN. Details of the GNN architectures and raw results can be found in Appendices J, K
- Table 2: Test MSE loss for all models with chosen parameters as specified in Appendix J. We run each model five times and pick the best and the median (3rd best) results for Table 1. Note that each of GCN, GIN and sGNN has four variants while 2-IGN has two variants. The reported rows in Table 1 are bolded here

Funding

- This work is partially supported by the Alfred P
- SV is partly supported by NSF DMS 1913134, EOARD FA9550-18-1-7007 and the Simons Algorithms and Geometry (A&G) Think Tank

Reference

- Alon, N., Dao, P., Hajirasouliha, I., Hormozdiari, F., and Sahinalp, S. C. (2008). Biomolecular network motif counting and discovery by color coding. Bioinformatics, 24(13):i241–i249.
- Arvind, V., Fuhlbrück, F., Köbler, J., and Verbitsky, O. (2018). On Weisfeiler-Leman invariance: Subgraph counts and related graph properties. arXiv preprint arXiv:1811.04801.
- Babai, L., Erdős, P., and Selkow, S. M. (1980). Random graph isomorphism. SIAM Journal on Computing, 9(3):628–635.
- Bronstein, M. M., Bruna, J., LeCun, Y., Szlam, A., and Vandergheynst, P. (2017). Geometric deep learning: Going beyond euclidean data. IEEE Signal Processing Magazine, 34(4):18–42.
- Bruna, J., Zaremba, W., Szlam, A., and LeCun, Y. (2013). Spectral networks and locally connected networks on graphs. arXiv preprint arXiv:1312.6203.
- Cai, J.-Y., Fürer, M., and Immerman, N. (1992). An optimal lower bound on the number of variables for graph identification. Combinatorica, 12(4):389–410.
- Chen, Z., Li, L., and Bruna, J. (2019a). Supervised community detection with line graph neural networks. International Conference on Learning Representations.
- Chen, Z., Villar, S., Chen, L., and Bruna, J. (2019b). On the equivalence between graph isomorphism testing and function approximation with gnns. In Advances in Neural Information Processing Systems, pages 15868–15876.
- Choma, N., Monti, F., Gerhardt, L., Palczewski, T., Ronaghi, Z., Prabhat, P., Bhimji, W., Bronstein, M., Klein, S., and Bruna, J. (2018). Graph neural networks for icecube signal classification. In 2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA), pages 386–391. IEEE.
- Dai, H., Khalil, E. B., Zhang, Y., Dilkina, B., and Song, L. (2017). Learning combinatorial optimization algorithms over graphs. arXiv preprint arXiv: 1704.01665.
- Defferrard, M., Bresson, X., and Vandergheynst, P. (2016). Convolutional neural networks on graphs with fast localized spectral filtering. In Advances in neural information processing systems, pages 3844–3852.
- Deshpande, M., Kuramochi, M., and Karypis, G. (2002). Automated approaches for classifying structures. Technical report, Minnesota University Minneapolis Department of Computer Science.
- Ding, M., Zhou, C., Chen, Q., Yang, H., and Tang, J. (2019). Cognitive graph for multi-hop reading comprehension at scale. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 2694–2703, Florence, Italy. Association for Computational Linguistics.
- Duvenaud, D. K., Maclaurin, D., Iparraguirre, J., Bombarell, R., Hirzel, T., Aspuru-Guzik, A., and Adams, R. P. (2015). Convolutional networks on graphs for learning molecular fingerprints. In Advances in neural information processing systems, pages 2224–2232.
- Fürer, M. (2017). On the combinatorial power of the Weisfeiler-Lehman algorithm. arXiv preprint arXiv:1704.01023.
- Garg, V. K., Jegelka, S., and Jaakkola, T. (2020). Generalization and representational limits of graph neural networks.
- Gilmer, J., Schoenholz, S. S., Riley, P. F., Vinyals, O., and Dahl, G. E. (2017). Neural message passing for quantum chemistry. In Proceedings of the 34th International Conference on Machine Learning-Volume 70, pages 1263–1272. JMLR. org.
- Ioffe, S. and Szegedy, C. (2015). Batch normalization: Accelerating deep network training by reducing internal covariate shift. arXiv preprint arXiv:1502.03167.
- Jiang, C., Coenen, F., and Zito, M. (2010). Finding frequent subgraphs in longitudinal social network data using a weighted graph mining approach. In International Conference on Advanced Data Mining and Applications, pages 405–416. Springer.
- Jin, W., Barzilay, R., and Jaakkola, T. (2019). Hierarchical graph-to-graph translation for molecules.
- Jin, W., Barzilay, R., and Jaakkola, T. (2020). Composing molecules with multiple property constraints. arXiv preprint arXiv:2002.03244.
- Jin, W., Barzilay, R., and Jaakkola, T. S. (2018). Junction tree variational autoencoder for molecular graph generation. CoRR, abs/1802.04364.
- Keriven, N. and Peyré, G. (2019). Universal invariant and equivariant graph neural networks. arXiv preprint arXiv:1905.04943.
- Kingma, D. P. and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.
- Kipf, T. N. and Welling, M. (2016). Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907.
- Koyutürk, M., Grama, A., and Szpankowski, W. (2004). An efficient algorithm for detecting frequent subgraphs in biological networks. Bioinformatics, 20(suppl 1):i200–i207.
- Lemke, T. L. (2003). Review of organic functional groups: introduction to medicinal organic chemistry. Lippincott Williams & Wilkins.
- Liu, S., Chandereng, T., and Liang, Y. (2018). N-gram graph: A novel molecule representation. arXiv preprint arXiv:1806.09206.
- Liu, X., Pan, H., He, M., Song, Y., and Jiang, X. (2019). Neural subgraph isomorphism counting.
- Loukas, A. (2019). What graph neural networks cannot learn: depth vs width. arXiv preprint arXiv:1907.03199.
- Maron, H., Ben-Hamu, H., and Lipman, Y. (2019a). Open problems: Approximation power of invariant graph networks.
- Maron, H., Ben-Hamu, H., Serviansky, H., and Lipman, Y. (2019b). Provably powerful graph networks. In Advances in Neural Information Processing Systems, pages 2153–2164.
- Maron, H., Ben-Hamu, H., Shamir, N., and Lipman, Y. (2018). Invariant and equivariant graph networks.
- Maron, H., Fetaya, E., Segol, N., and Lipman, Y. (2019c). On the universality of invariant networks. arXiv preprint arXiv:1901.09342.
- Monti, F., Otness, K., and Bronstein, M. M. (2018). Motifnet: a motif-based graph convolutional network for directed graphs. CoRR, abs/1802.01572.
- Morgan, H. L. (1965). The generation of a unique machine description for chemical structures-a technique developed at chemical abstracts service. Journal of Chemical Documentation, 5(2):107–113.
- Morris, C., Ritzert, M., Fey, M., Hamilton, W. L., Lenssen, J. E., Rattan, G., and Grohe, M. (2019). Weisfeiler and leman go neural: Higher-order graph neural networks. Association for the Advancement of Artificial Intelligence.
- Murphy, R. L., Srinivasan, B., Rao, V., and Ribeiro, B. (2019). Relational pooling for graph representations. arXiv preprint arXiv:1903.02541.
- Murray, C. W. and Rees, D. C. (2009). The rise of fragment-based drug discovery. Nature chemistry, 1(3):187.
- Nowak, A., Villar, S., Bandeira, A. S., and Bruna, J. (2017). A note on learning algorithms for quadratic assignment with graph neural networks. arXiv preprint arXiv:1706.07450.
- O'Boyle, N. M. and Sayle, R. A. (2016). Comparing structural fingerprints using a literature-based similarity benchmark. Journal of cheminformatics, 8(1):1–14.
- Pope, P., Kolouri, S., Rostami, M., Martin, C., and Hoffmann, H. (2018). Discovering molecular functional groups using graph convolutional neural networks. arXiv preprint arXiv:1812.00265.
- Preciado, V. M., Draief, M., and Jadbabaie, A. (2012). Structural analysis of viral spreading processes in social and communication networks using egonets.
- Preciado, V. M. and Jadbabaie, A. (2010). From local measurements to network spectral properties: Beyond degree distributions. In 49th IEEE Conference on Decision and Control (CDC), pages 2686–2691. IEEE.
- Rahman, S. A., Bashton, M., Holliday, G. L., Schrader, R., and Thornton, J. M. (2009). Small molecule subgraph detector (smsd) toolkit. Journal of cheminformatics, 1(1):12.
- Scarselli, F., Gori, M., Tsoi, A. C., Hagenbuchner, M., and Monfardini, G. (2008). The graph neural network model. IEEE Transactions on Neural Networks, 20(1):61–80.
- Shervashidze, N., Vishwanathan, S., Petri, T., Mehlhorn, K., and Borgwardt, K. (2009). Efficient graphlet kernels for large graph comparison. In Artificial Intelligence and Statistics, pages 488–495.
- Steger, A. and Wormald, N. C. (1999). Generating random regular graphs quickly. Combinatorics, Probability and Computing, 8(4):377–396.
- Stokes, J. M., Yang, K., Swanson, K., Jin, W., Cubillos-Ruiz, A., Donghia, N. M., MacNair, C. R., French, S., Carfrae, L. A., Bloom-Ackerman, Z., et al. (2020). A deep learning approach to antibiotic discovery. Cell, 180(4):688–702.
- Ulyanov, D., Vedaldi, A., and Lempitsky, V. (2016). Instance normalization: The missing ingredient for fast stylization. arXiv preprint arXiv:1607.08022.
- Weisfeiler, B. and Leman, A. (1968). The reduction of a graph to canonical form and the algebra which appears therein. Nauchno-Technicheskaya Informatsia, 2(9):12-16.
- Wu, Z., Pan, S., Chen, F., Long, G., Zhang, C., and Yu, P. S. (2019). A comprehensive survey on graph neural networks. arXiv preprint arXiv:1901.00596.
- Xu, K., Hu, W., Leskovec, J., and Jegelka, S. (2018a). How powerful are graph neural networks? arXiv preprint arXiv:1810.00826.
- Xu, K., Li, C., Tian, Y., Sonobe, T., Kawarabayashi, K.-i., and Jegelka, S. (2018b). Representation learning on graphs with jumping knowledge networks. arXiv preprint arXiv:1806.03536.
- Yao, W., Bandeira, A. S., and Villar, S. (2019). Experimental performance of graph neural networks on random instances of max-cut. In Wavelets and Sparsity XVIII, volume 11138, page 111380S. International Society for Optics and Photonics.
- Ying, R., Bourgeois, D., You, J., Zitnik, M., and Leskovec, J. (2019). GNNExplainer: A tool for post-hoc explanation of graph neural networks. arXiv preprint arXiv:1903.03894.
- Ying, R., You, J., Morris, C., Ren, X., Hamilton, W. L., and Leskovec, J. (2018). Hierarchical graph representation learning with differentiable pooling. CoRR, abs/1806.08804.
- You, J., Liu, B., Ying, Z., Pande, V., and Leskovec, J. (2018a). Graph convolutional policy network for goal-directed molecular graph generation. In Advances in neural information processing systems, pages 6410–6421.
- You, J., Wu, H., Barrett, C., Ramanujan, R., and Leskovec, J. (2019). G2SAT: Learning to generate SAT formulas. In Wallach, H., Larochelle, H., Beygelzimer, A., d'Alché-Buc, F., Fox, E., and Garnett, R., editors, Advances in Neural Information Processing Systems 32, pages 10553–10564. Curran Associates, Inc.
- You, J., Ying, R., Ren, X., Hamilton, W. L., and Leskovec, J. (2018b). Graphrnn: A deep generative model for graphs. CoRR, abs/1802.08773.
- Zaheer, M., Kottur, S., Ravanbakhsh, S., Póczos, B., Salakhutdinov, R. R., and Smola, A. J. (2017). Deep sets. In Advances in neural information processing systems, pages 3391–3401.
- Zhang, M. and Chen, Y. (2018). Link prediction based on graph neural networks. In Advances in Neural Information Processing Systems, pages 5165–5175.
- Zhou, J., Cui, G., Zhang, Z., Yang, C., Liu, Z., Wang, L., Li, C., and Sun, M. (2018). Graph neural networks: A review of methods and applications. arXiv preprint arXiv:1812.08434.
Notes

- For a reference, see Maron et al. (2019b).
- On one hand, by construction, 2-WL will not be able to distinguish G[1] from G[2]. This is intuitive if we compare the rooted subtrees in the two graphs, as there exists a bijection from V[1] to V[2] that preserves the rooted subtree structure. A rigorous proof is given at the end of this section. In addition, we note that this is also a consequence of the direct proof of Corollary 4 given in Appendix I, in which we will show that the same pair of graphs cannot be distinguished by 2-IGNs. Since 2-IGNs are no less powerful than 2-WL (Maron et al., 2019b), this implies that 2-WL cannot distinguish them either.
