Graph Adversarial Training: Dynamically Regularizing Based on Graph Structure

arXiv: Learning, pp. 1-1, 2019.

Keywords:
Graph-based learning, adversarial training, input feature, supervised loss, node classification

Abstract:

Recent efforts show that neural networks are vulnerable to small but intentional perturbations on input features in visual classification tasks. Due to the additional consideration of connections between examples (e.g., articles with citation link tend to be in the same class), graph neural networks could be more sensitive to such perturbations.

Introduction
  • Graph-based learning makes predictions by accounting for both input features of examples and the relations between examples.
  • In addition to the supervised loss on labeled examples, graph-based learning optimizes the smoothness of predictions over the graph structure, that is, closely connected examples are encouraged to have similar predictions [8]–[11] (see the sketch after this list).
  • The reasons are twofold: 1) since graph neural networks optimize the supervised loss on labeled data, they face the same vulnerability issue as standard neural networks [14]; and 2) because predictions are smoothed over connections between examples, a perturbation on one node can propagate to its neighbors, potentially making graph neural networks even more sensitive.
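To make the objective concrete, a generic sketch of such a graph-regularized loss (our notation for illustration; the paper's exact formulation may differ) is:

```latex
\mathcal{L} \;=\; \underbrace{\sum_{i \in \mathcal{V}_{L}} \ell\big(f(\mathbf{x}_i),\, y_i\big)}_{\text{supervised loss on labeled nodes}}
\;+\; \lambda \underbrace{\sum_{(i,j) \in \mathcal{E}} \big\lVert f(\mathbf{x}_i) - f(\mathbf{x}_j) \big\rVert^{2}}_{\text{smoothness over connected node pairs}}
```

where V_L is the set of labeled nodes, E is the edge set, and λ balances the two terms.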
Highlights
  • Graph-based learning makes predictions by accounting for both input features of examples and the relations between examples
  • We demonstrate the effectiveness of Graph Adversarial Training on Graph Convolutional Network, conducting experiments on three datasets which show that our method achieves state-of-the-art performance for node classification
  • Inspired by the philosophy of standard Adversarial Training, we develop Graph Adversarial Training, which trains graph neural network modules in the manner of generating adversarial examples and optimizing additional regularization terms over the adversarial examples, so as to prevent the adverse effects of perturbations (see the sketch after this list)
  • Graph Adversarial Training to some extent augments the training data, since the generated adversarial examples have not occurred in the training data
  • Node degree: we study how graph adversarial training performs on nodes with different densities of connections, so as to understand where this regularization technique is suitable
  • We proposed a new learning method, named graph adversarial training, which additionally accounts for the relations between examples as compared to standard adversarial training
  • It beats Graph Convolutional Network trained with virtual adversarial training (VAT), indicating the necessity of performing adversarial training with the graph structure considered
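To illustrate the training manner described above, here is a minimal sketch of one graph-adversarial regularization step in PyTorch. It is not the authors' implementation: the model interface, the KL-based divergence, and the L2-normalized gradient perturbation are our assumptions for illustration (a real GCN would also take the graph as input, omitted here for brevity).

```python
import torch
import torch.nn.functional as F

def graph_adv_regularizer(model, x, edge_index, epsilon=0.1):
    """Sketch of one graph adversarial training step (hypothetical API).

    model:      maps node features [N, D] to class logits [N, C]
    edge_index: pair of LongTensors (src, dst) listing directed edges
    epsilon:    radius of the perturbation ball
    """
    src, dst = edge_index

    # 1) Measure how much connected nodes' predictions diverge.
    x_adv = x.detach().clone().requires_grad_(True)
    log_p = F.log_softmax(model(x_adv), dim=-1)
    p_neighbor = log_p.exp()[dst].detach()
    divergence = F.kl_div(log_p[src], p_neighbor, reduction="batchmean")

    # 2) Adversarial direction: the input gradient that most increases
    #    the divergence between connected nodes, scaled to the ball.
    grad = torch.autograd.grad(divergence, x_adv)[0]
    r_adv = epsilon * F.normalize(grad, p=2, dim=-1)

    # 3) Regularizer: a perturbed node's prediction should stay close
    #    to its (clean, detached) neighbors' predictions.
    log_p_pert = F.log_softmax(model(x + r_adv), dim=-1)
    p_clean = F.softmax(model(x), dim=-1)[dst].detach()
    return F.kl_div(log_p_pert[src], p_clean, reduction="batchmean")

# Usage sketch: total = supervised_loss + beta * graph_adv_regularizer(model, x, edges)
```

The design point is that the perturbation direction is chosen to break the smoothness between connected nodes, so the regularizer directly defends the property that graph-based learning relies on.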
Methods
  • The authors employ the public implementation of GCN with the same settings as the original paper to report its performance on Citeseer and Cora.
  • In most cases, GCN-GATV outperforms the standard GCN, which indicates that graph adversarial training benefits the prediction of nodes with different degrees and is largely insensitive to the density of the graph.
  • For one of the exceptions, the authors speculated that the reason is the under-fitting of standard GCN on such nodes, which the additional regularization performed by graph adversarial training worsens.
  • This suggests jointly considering node features and the graph structure in adversarial training on graph data (contrasted with VAT in the sketch below)
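To make the contrast concrete, virtual adversarial training (VAT) [28] smooths each node's prediction against a perturbed copy of itself, whereas graph adversarial training smooths a perturbed node against its neighbors; in sketch form (our paraphrase, with d a divergence such as KL and N(i) the neighbors of node i):

```latex
\Omega_{\mathrm{VAT}} = \sum_{i} d\big(f(\mathbf{x}_i),\, f(\mathbf{x}_i + \mathbf{r}_i)\big),
\qquad
\Omega_{\mathrm{GAT}} = \sum_{i} \sum_{j \in \mathcal{N}(i)} d\big(f(\mathbf{x}_i + \mathbf{r}_i),\, f(\mathbf{x}_j)\big)
```

where each r_i is the perturbation within an ε-ball that maximizes the respective divergence.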
Results
  • The authors demonstrate the effectiveness of Graph Adversarial Training (GAT) on GCN, conducting experiments on three datasets which show that the method achieves state-of-the-art performance for node classification.
  • On the NELL dataset, the performance of GCN-GAT is 1.58% worse than GCN-VAT, which is reasonable
Conclusion
  • The authors proposed a new learning method, named graph adversarial training, which additionally accounts for the relations between examples as compared to standard adversarial training.
  • By conducting experiments on three benchmark datasets, the authors demonstrated that training GCN with the method is remarkably effective, achieving an average improvement of 4.51%.
  • It beats GCN trained with VAT, indicating the necessity of performing AT with the graph structure considered
Tables
  • Table 1: Statistics of the experiment datasets
  • Table 2: Performance of the compared methods on the three datasets w.r.t. accuracy
  • Table 3: Effect of graph adversarial regularization and virtual adversarial regularization
  • Table 4: Performance comparison of GCN-GAT with different strategies to sample neighbors during adversarial example generation
  • Table 5: Performance of GCN-GAT when tuning all hyperparameters (i.e., β, ε, and k) and when tuning ε with fixed β = 1.0 and k = 1
  • Table 6: Average training time per epoch of GCN, GCN-VAT, and GCN-GAT
  • Table 7: The impact of adding graph adversarial perturbations to GCN and GCN-GAT. The number shows the relative decrease of accuracy
  • Table 8: Average Kullback-Leibler divergence between connected node pairs calculated from predictions of GCN
Related work
  • In this section, we discuss the existing research on graph-based learning and adversarial learning, which are closely related to this work.

    2.1 Graph-based Learning

    Graph, a natural representation of relational data in which nodes and edges represent entities and their relations, is widely used in the analysis of social networks, transaction records, biological interactions, collections of interlinked documents, web pages, multimedia contents, etc. On such graphs, one of the most popular tasks is node classification, which targets predicting the labels of nodes in the graph by accounting for node features and the graph structure. The existing work on node classification mainly falls into two broad categories: graph Laplacian regularization and graph embedding-based methods. Methods in the former category explicitly encode the graph structure as a regularization term to smooth the predictions over the graph, i.e., the regularization incurs a large penalty when similar nodes (e.g., closely connected ones) are predicted with different labels [8], [9], [17]–[19].

    Recently, graph embedding-based methods, which learn node embeddings that encode the graph data, have become a promising solution. Most embedding-based methods fall into two broad categories, skip-gram based methods and convolution based methods, depending on how the graph data are modeled. The skip-gram based methods learn node embeddings by using the embedding of a node to predict node contexts generated by performing random walks on the graph, so that the embeddings of "connected" nodes are associated with each other [2], [5], [6], [12]. Inspired by the idea of convolution in computer vision, which aggregates contextual signals in a local window, convolution based methods iteratively aggregate the representations of neighbor nodes to learn a node embedding [3], [4], [7], [11], [20]–[23] (see the propagation rule sketched below).
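For reference, the layer-wise propagation rule of GCN [7], the representative convolution-based method used as the backbone in this work, aggregates neighbor representations as:

```latex
H^{(l+1)} = \sigma\big(\tilde{D}^{-1/2}\,\tilde{A}\,\tilde{D}^{-1/2}\,H^{(l)}\,W^{(l)}\big),
\qquad \tilde{A} = A + I, \quad \tilde{D}_{ii} = \sum\nolimits_{j} \tilde{A}_{ij}
```

where A is the adjacency matrix, H^(0) is the node feature matrix, W^(l) is a trainable weight matrix, and σ is a nonlinearity.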
Reference
  • [1] D. Wang, P. Cui, and W. Zhu, "Structural deep network embedding," in SIGKDD. ACM, 2016, pp. 1225–1234.
  • [2] A. Grover and J. Leskovec, "node2vec: Scalable feature learning for networks," in SIGKDD. ACM, 2016, pp. 855–864.
  • [3] W. Hamilton, Z. Ying, and J. Leskovec, "Inductive representation learning on large graphs," in Advances in Neural Information Processing Systems, 2017, pp. 1024–1034.
  • [4] R. Ying, J. You, C. Morris, X. Ren, W. L. Hamilton, and J. Leskovec, "Hierarchical graph representation learning with differentiable pooling," arXiv preprint arXiv:1806.08804, 2018.
  • [5] B. Perozzi, R. Al-Rfou, and S. Skiena, "Deepwalk: Online learning of social representations," in SIGKDD. ACM, 2014, pp. 701–710.
  • [6] J. Tang, M. Qu, M. Wang, M. Zhang, J. Yan, and Q. Mei, "Line: Large-scale information network embedding," in WWW, 2015, pp. 1067–1077.
  • [7] T. N. Kipf and M. Welling, "Semi-supervised classification with graph convolutional networks," ICLR, 2017.
  • [8] X. Zhu, Z. Ghahramani, and J. D. Lafferty, "Semi-supervised learning using gaussian fields and harmonic functions," in ICML, 2003, pp. 912–919.
  • [9] D. Zhou, O. Bousquet, T. N. Lal, J. Weston, and B. Scholkopf, "Learning with local and global consistency," in Advances in Neural Information Processing Systems, 2004, pp. 321–328.
  • [10] J. Ni, S. Chang, X. Liu, W. Cheng, H. Chen, D. Xu, and X. Zhang, "Co-regularized deep multi-network embedding," in WWW, 2018, pp. 469–478.
  • [11] P. Velickovic, G. Cucurull, A. Casanova, A. Romero, P. Lio, and Y. Bengio, "Graph attention networks," ICLR, 2018.
  • [12] Z. Yang, W. Cohen, and R. Salakhudinov, "Revisiting semi-supervised learning with graph embeddings," in ICML, 2016, pp. 40–48.
  • [13] D. Zugner, A. Akbarnejad, and S. Gunnemann, "Adversarial attacks on neural networks for graph data," in SIGKDD. ACM, 2018, pp. 2847–2856.
  • [14] I. J. Goodfellow, J. Shlens, and C. Szegedy, "Explaining and harnessing adversarial examples," ICLR, 2015.
  • [15] A. Kurakin, I. Goodfellow, and S. Bengio, "Adversarial machine learning at scale," ICLR, 2017.
  • [16] T. Miyato, A. M. Dai, and I. Goodfellow, "Adversarial training methods for semi-supervised text classification," ICLR, 2017.
  • [17] M. Belkin, P. Niyogi, and V. Sindhwani, "Manifold regularization: A geometric framework for learning from labeled and unlabeled examples," Journal of Machine Learning Research, pp. 2399–2434, 2006.
  • [18] P. P. Talukdar and K. Crammer, "New regularized algorithms for transductive learning," in ECML-PKDD. Springer, 2009, pp. 442–457.
  • [19] F. Feng, X. He, Y. Liu, L. Nie, and T.-S. Chua, "Learning on partial-order hypergraphs," in WWW, 2018, pp. 1523–1532.
  • [20] J. Bruna, W. Zaremba, A. Szlam, and Y. LeCun, "Spectral networks and locally connected networks on graphs," ICLR, 2014.
  • [21] D. K. Duvenaud, D. Maclaurin, J. Iparraguirre, R. Bombarell, T. Hirzel, A. Aspuru-Guzik, and R. P. Adams, "Convolutional networks on graphs for learning molecular fingerprints," in Advances in Neural Information Processing Systems, 2015, pp. 2224–2232.
  • [22] M. Defferrard, X. Bresson, and P. Vandergheynst, "Convolutional neural networks on graphs with fast localized spectral filtering," in Advances in Neural Information Processing Systems, 2016, pp. 3844–3852.
  • [23] R. Ying, R. He, K. Chen, P. Eksombatchai, W. L. Hamilton, and J. Leskovec, "Graph convolutional neural networks for web-scale recommender systems," in SIGKDD. ACM, 2018, pp. 974–983.
  • [24] H. Dai, H. Li, T. Tian, X. Huang, L. Wang, J. Zhu, and L. Song, "Adversarial attack on graph structured data," in ICML, vol. 80. PMLR, 2018, pp. 1115–1124.
  • [25] C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus, "Intriguing properties of neural networks," ICLR, 2014.
  • [26] S.-M. Moosavi-Dezfooli, A. Fawzi, O. Fawzi, and P. Frossard, "Universal adversarial perturbations," in CVPR, 2017.
  • [27] Y. Wu, D. Bamman, and S. Russell, "Adversarial training for relation extraction," in EMNLP, 2017, pp. 1778–1783.
  • [28] T. Miyato, S.-i. Maeda, S. Ishii, and M. Koyama, "Virtual adversarial training: a regularization method for supervised and semi-supervised learning," IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018.
  • [29] F. Liao, M. Liang, Y. Dong, and T. Pang, "Defense against adversarial attacks using high-level representation guided denoiser," CVPR, 2018.
  • [30] F. Tramer, A. Kurakin, N. Papernot, I. Goodfellow, D. Boneh, and P. McDaniel, "Ensemble adversarial training: Attacks and defenses," ICLR, 2018.
  • [31] A. Raghunathan, J. Steinhardt, and P. Liang, "Certified defenses against adversarial examples," ICLR, 2018.
  • [32] H. Wang, J. Wang, J. Wang, M. Zhao, W. Zhang, F. Zhang, X. Xie, and M. Guo, "Graphgan: Graph representation learning with generative adversarial nets," AAAI, 2018.
  • [33] M. Ding, J. Tang, and J. Zhang, "Semi-supervised learning on graphs with generative adversarial nets," in CIKM. ACM, 2018, pp. 913–922.
  • [34] L. Sang, M. Xu, S. Qian, and X. Wu, "Aaane: Attention-based adversarial autoencoder for multi-scale network embedding," AAAI, 2018.
  • [35] W. Yu, C. Zheng, W. Cheng, C. C. Aggarwal, D. Song, B. Zong, H. Chen, and W. Wang, "Learning deep network representations with adversarially regularized autoencoders," in SIGKDD. ACM, 2018, pp. 2663–2671.
  • [36] S. Pan, R. Hu, G. Long, J. Jiang, L. Yao, and C. Zhang, "Adversarially regularized graph autoencoder for graph embedding," in IJCAI, 2018, pp. 2609–2615.
  • [37] Q. Dai, Q. Li, J. Tang, and D. Wang, "Adversarial network embedding," AAAI, 2018.
  • [38] J. M. Joyce, "Kullback-Leibler divergence," Alphascript Publishing, p. 844, 2013.
  • [39] L. Page, S. Brin, R. Motwani, and T. Winograd, "The pagerank citation ranking: Bringing order to the web," Stanford InfoLab, Tech. Rep., 1999.
  • [40] P. Sen, G. Namata, M. Bilgic, L. Getoor, B. Galligher, and T. Eliassi-Rad, "Collective classification in network data," AI Magazine, vol. 29, no. 3, p. 93, 2008.
  • [41] J. Weston, F. Ratle, H. Mobahi, and R. Collobert, "Deep learning via semi-supervised embedding," in Neural Networks: Tricks of the Trade. Springer, 2012, pp. 639–655.
  • [42] D. P. Kingma and J. Ba, "Adam: A method for stochastic optimization," arXiv preprint arXiv:1412.6980, 2014.
  • [43] X. He, Z. He, X. Du, and T.-S. Chua, "Adversarial personalized ranking for recommendation," in SIGIR. ACM, 2018, pp. 355–364.
  • [44] H. Gao, Z. Wang, and S. Ji, "Large-scale learnable graph convolutional networks," in SIGKDD. ACM, 2018, pp. 1416–1424.
  • [45] P. Cui, X. Wang, J. Pei, and W. Zhu, "A survey on network embedding," IEEE Transactions on Knowledge and Data Engineering, 2018.
  • [46] A. Madry, A. Makelov, L. Schmidt, D. Tsipras, and A. Vladu, "Towards deep learning models resistant to adversarial attacks," arXiv preprint arXiv:1706.06083, 2017.
  • [47] S. Park, J.-K. Park, S.-J. Shin, and I.-C. Moon, "Adversarial dropout for supervised and semi-supervised learning," AAAI, 2018.
Authors
  • Fuli Feng is a Ph.D. student in the School of Computing, National University of Singapore. He received the B.E. degree from the School of Computer Science and Engineering, Beihang University, Beijing, in 2015. His research interests include information retrieval, data mining, and multimedia processing. He has over 10 publications in top conferences such as SIGIR, WWW, and MM. His work on Bayesian Personalized Ranking received the Best Poster Award of WWW 2018.
  • Jie Tang is an associate professor with the Department of Computer Science and Technology, Tsinghua University. His main research interests include data mining algorithms and social network theories. He has been a visiting scholar at Cornell University, Chinese University of Hong Kong, Hong Kong University of Science and Technology, and Leuven University. He has published more than 100 research papers in major international journals and conferences, including KDD, IJCAI, AAAI, ICML, WWW, SIGIR, SIGMOD, ACL, Machine Learning Journal, TKDD, and TKDE.
  • Xiangnan He is currently a research fellow with the School of Computing, National University of Singapore (NUS). He received his Ph.D. in Computer Science from NUS. His research interests span recommender systems, information retrieval, natural language processing, and multimedia. His work on recommender systems received the Best Paper Award Honorable Mention at WWW 2018 and SIGIR 2016.