DeepWalk: Online Learning of Social Representations

KDD (2014): 701–710

Abstract

We present DeepWalk, a novel approach for learning latent representations of vertices in a network. These latent representations encode social relations in a continuous vector space, which is easily exploited by statistical models. DeepWalk generalizes recent advancements in language modeling and unsupervised feature learning (or deep learning) from sequences of words to graphs.

Introduction
  • The sparsity of a network representation is both a strength and a weakness. Sparsity enables the design of efficient discrete algorithms, but can make it harder to generalize in statistical learning.
  • Social representations are latent features of the vertices that capture neighborhood similarity and community membership.
  • DeepWalk generalizes neural language models to process a special language composed of a set of randomly-generated walks (see the sketch after this list).
  • These neural language models have been used to capture the semantic and syntactic structure of human language [7], and even logical analogies [29].
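
To make the walks-as-sentences idea concrete, below is a minimal sketch of the DeepWalk pipeline in Python. It assumes the networkx and gensim libraries are available; the function names and hyperparameter defaults (walks per vertex, walk length, embedding dimension, window size) are illustrative choices, not the authors' reference implementation.

    import random
    import networkx as nx
    from gensim.models import Word2Vec

    def random_walk(G, start, walk_length):
        """One truncated random walk starting at `start`."""
        walk = [start]
        for _ in range(walk_length - 1):
            neighbors = list(G.neighbors(walk[-1]))
            if not neighbors:
                break
            walk.append(random.choice(neighbors))
        return [str(v) for v in walk]  # Word2Vec expects string tokens

    def deepwalk_embeddings(G, num_walks=10, walk_length=40, dim=64, window=5):
        """Treat random walks as sentences and learn skip-gram embeddings."""
        corpus, nodes = [], list(G.nodes())
        for _ in range(num_walks):
            random.shuffle(nodes)  # one pass over all vertices per "epoch"
            corpus.extend(random_walk(G, v, walk_length) for v in nodes)
        # sg=1 selects skip-gram; hs=1 selects hierarchical softmax, matching
        # the training objective described in the paper.
        model = Word2Vec(corpus, vector_size=dim, window=window,
                         sg=1, hs=1, min_count=0)
        return {v: model.wv[str(v)] for v in G.nodes()}

    # Example: embed Zachary's karate club graph.
    embeddings = deepwalk_embeddings(nx.karate_club_graph())

Shuffling the vertex order before each pass and truncating walks at a fixed length mirror how the paper describes building the corpus of walks.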
Highlights
  • The sparsity of a network representation is both a strength and a weakness.
  • In this paper we introduce deep learning [3] techniques, which have proven successful in natural language processing, into network analysis for the first time.
  • We introduce deep learning as a tool to analyze graphs, to build robust representations that are suitable for statistical modeling.
  • We present a generalization of language modeling to explore the graph through a stream of short random walks.
  • We propose DeepWalk, a novel approach for learning latent social representations of vertices.
  • DeepWalk's representations are able to outperform all baseline methods while using 60% less training data.
  • Our results show that we can create meaningful representations for graphs which are too large for standard spectral methods.
Methods
  • To validate the performance of the approach, the authors compare it against a number of baselines:

    SpectralClustering [41]: This method generates a representation in R^d from the d smallest eigenvectors of L, the normalized graph Laplacian of G.

    wvRN: Given the neighborhood N_i of vertex v_i, the weighted-vote relational neighbor classifier estimates Pr(y_i | N_i) as the normalized weighted mean of its neighbors' class distributions, i.e. Pr(y_i | N_i) = (1/Z) Σ_{v_j ∈ N_i} w_ij Pr(y_j | N_j). A toy sketch of both baselines follows this list.
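
For intuition, here is a compact sketch of these two baselines, assuming dense NumPy/SciPy adjacency matrices. It is a toy rendering of the cited methods for small graphs, not the evaluation code used by the authors.

    import numpy as np
    from scipy.linalg import eigh
    from scipy.sparse.csgraph import laplacian

    def spectral_embedding(adj, d):
        """SpectralClustering-style representation: the eigenvectors of the
        normalized graph Laplacian with the d smallest eigenvalues."""
        L = laplacian(adj, normed=True)
        _, vecs = eigh(L)  # eigenvalues are returned in ascending order
        return vecs[:, :d]

    def wvrn(adj, probs, labeled_mask, n_iter=10):
        """wvRN: repeatedly set each unlabeled vertex's class distribution to
        the normalized weighted mean of its neighbors' distributions.
        `probs` is (n, k): labeled rows one-hot, unlabeled rows e.g. uniform."""
        probs = probs.copy()
        for _ in range(n_iter):
            agg = adj @ probs                              # sum_j w_ij Pr(y_j | N_j)
            agg /= agg.sum(axis=1, keepdims=True) + 1e-12  # the 1/Z normalization
            probs[~labeled_mask] = agg[~labeled_mask]      # clamp labeled vertices
        return probs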
Results
  • DeepWalk’s representations can provide F1 scores up to 10% higher than competing methods when labeled data is sparse.
  • In some experiments, DeepWalk's representations are able to outperform all baseline methods while using 60% less training data.
  • The results show that meaningful representations can be created for graphs which are too large for standard spectral methods.
  • On such large graphs, the method significantly outperforms other methods designed for sparsity.
Conclusion
  • The authors propose DeepWalk, a novel approach for learning latent social representations of vertices.
  • Using local information from truncated random walks as input, the method learns a representation which encodes structural regularities.
  • The results show that meaningful representations can be created for graphs which are too large for standard spectral methods.
  • On such large graphs, the method significantly outperforms other methods designed to operate for sparsity.
  • The authors show that the approach is parallelizable, allowing workers to update different parts of the model concurrently (a toy illustration of this lock-free pattern follows below).
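
The concurrency claim follows the lock-free asynchronous SGD pattern of Hogwild [36]. Below is a toy Python illustration of that pattern only: the update rule is a deliberately simplified "pull co-occurring vertices together" step standing in for the paper's hierarchical-softmax skip-gram gradients, and all names and constants are illustrative.

    import multiprocessing as mp
    import numpy as np

    def sgd_worker(shared, n, dim, pairs, lr=0.025):
        """Apply updates for this worker's (center, context) pairs directly to
        the shared embedding matrix, with no locking (Hogwild-style)."""
        emb = np.frombuffer(shared).reshape(n, dim)
        for u, v in pairs:
            grad = emb[u] - emb[v]  # toy loss: pull co-occurring vertices together
            emb[u] -= lr * grad
            emb[v] += lr * grad

    if __name__ == "__main__":
        n, dim, workers = 100, 16, 4
        shared = mp.RawArray("d", n * dim)  # shared memory, deliberately unlocked
        emb = np.frombuffer(shared).reshape(n, dim)
        emb[:] = np.random.rand(n, dim)
        # Partition the walk-derived pairs across workers (dummy pairs here).
        chunks = [[(i % n, (i + 1) % n) for i in range(w, 2000, workers)]
                  for w in range(workers)]
        procs = [mp.Process(target=sgd_worker, args=(shared, n, dim, c))
                 for c in chunks]
        for p in procs:
            p.start()
        for p in procs:
            p.join()

Because updates touching the same vertex rarely collide in sparse graphs, unsynchronized writes like these tend not to hurt convergence, which is the observation Hogwild [36] formalizes.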
Tables
  • Table 1: Graphs used in our experiments
  • Table 2: Multi-label classification results in BlogCatalog
  • Table 3: Multi-label classification results in Flickr
  • Table 4: Multi-label classification results in YouTube
Related work
  • In this section we discuss related work in network classification and unsupervised feature learning. The main differences between our proposed method and previous work can be summarized as follows:

    1. We learn our latent social representations, instead of computing statistics related to centrality [13] or partitioning [41].

    2. We do not attempt to extend the classification procedure itself (through collective inference [37] or graph kernels [21]).

    3. We propose a scalable online method which uses only local information. Most methods require global information and are offline [17, 39, 40, 41].

    4. We apply unsupervised representation learning to graphs.
Funding
  • This research was partially supported by NSF Grants DBI-1060572 and IIS-1017181, and a Google Faculty Research Award.
References
  • R. Al-Rfou, B. Perozzi, and S. Skiena. Polyglot: Distributed word representations for multilingual NLP. In Proceedings of the Seventeenth Conference on Computational Natural Language Learning, pages 183–192, Sofia, Bulgaria, August 2013. ACL.
  • R. Andersen, F. Chung, and K. Lang. Local graph partitioning using PageRank vectors. In Foundations of Computer Science, 2006. FOCS '06. 47th Annual IEEE Symposium on, pages 475–486. IEEE, 2006.
  • Y. Bengio, A. Courville, and P. Vincent. Representation learning: A review and new perspectives. 2013.
  • Y. Bengio, R. Ducharme, and P. Vincent. A neural probabilistic language model. Journal of Machine Learning Research, 3:1137–1155, 2003.
  • L. Bottou. Stochastic gradient learning in neural networks. In Proceedings of Neuro-Nîmes 91, Nimes, France, 1991. EC2.
  • V. Chandola, A. Banerjee, and V. Kumar. Anomaly detection: A survey. ACM Computing Surveys (CSUR), 41(3):15, 2009.
  • R. Collobert and J. Weston. A unified architecture for natural language processing: Deep neural networks with multitask learning. In Proceedings of the 25th ICML, ICML '08, pages 160–167. ACM, 2008.
  • G. E. Dahl, D. Yu, L. Deng, and A. Acero. Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition. Audio, Speech, and Language Processing, IEEE Transactions on, 20(1):30–42, 2012.
  • J. Dean, G. Corrado, R. Monga, K. Chen, M. Devin, Q. Le, M. Mao, M. Ranzato, A. Senior, P. Tucker, K. Yang, and A. Ng. Large scale distributed deep networks. In P. Bartlett, F. Pereira, C. Burges, L. Bottou, and K. Weinberger, editors, Advances in Neural Information Processing Systems 25, pages 1232–1240. 2012.
  • D. Erhan, Y. Bengio, A. Courville, P.-A. Manzagol, P. Vincent, and S. Bengio. Why does unsupervised pre-training help deep learning? The Journal of Machine Learning Research, 11:625–660, 2010.
  • R.-E. Fan, K.-W. Chang, C.-J. Hsieh, X.-R. Wang, and C.-J. Lin. LIBLINEAR: A library for large linear classification. Journal of Machine Learning Research, 9:1871–1874, 2008.
  • F. Fouss, A. Pirotte, J.-M. Renders, and M. Saerens. Random-walk computation of similarities between nodes of a graph with application to collaborative recommendation. Knowledge and Data Engineering, IEEE Transactions on, 19(3):355–369, 2007.
  • B. Gallagher and T. Eliassi-Rad. Leveraging label-independent features for classification in sparsely labeled networks: An empirical study. In Advances in Social Network Mining and Analysis, pages 1–19.
  • B. Gallagher, H. Tong, T. Eliassi-Rad, and C. Faloutsos. Using ghost edges for classification in sparsely labeled networks. In Proceedings of the 14th ACM SIGKDD, KDD '08, pages 256–264, New York, NY, USA, 2008. ACM.
  • S. Geman and D. Geman. Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. Pattern Analysis and Machine Intelligence, IEEE Transactions on, (6):721–741, 1984.
  • L. Getoor and B. Taskar. Introduction to Statistical Relational Learning. MIT Press, 2007.
  • K. Henderson, B. Gallagher, L. Li, L. Akoglu, T. Eliassi-Rad, H. Tong, and C. Faloutsos. It's who you know: Graph mining using recursive structural features. In Proceedings of the 17th ACM SIGKDD, KDD '11, pages 663–671, New York, NY, USA, 2011. ACM.
  • G. E. Hinton. Learning distributed representations of concepts. In Proceedings of the Eighth Annual Conference of the Cognitive Science Society, pages 1–12, Amherst, MA, 1986.
  • R. A. Hummel and S. W. Zucker. On the foundations of relaxation labeling processes. Pattern Analysis and Machine Intelligence, IEEE Transactions on, (3):267–287, 1983.
  • U. Kang, H. Tong, and J. Sun. Fast random walk graph kernel. In SDM, pages 828–838, 2012.
  • R. I. Kondor and J. Lafferty. Diffusion kernels on graphs and other discrete input spaces. In ICML, volume 2, pages 315–322, 2002.
  • A. Krizhevsky, I. Sutskever, and G. E. Hinton. ImageNet classification with deep convolutional neural networks. In NIPS, volume 1, page 4, 2012.
  • D. Liben-Nowell and J. Kleinberg. The link-prediction problem for social networks. Journal of the American Society for Information Science and Technology, 58(7):1019–1031, 2007.
  • F. Lin and W. Cohen. Semi-supervised classification of network data using very few labels. In Advances in Social Networks Analysis and Mining (ASONAM), 2010 International Conference on, pages 192–199, Aug 2010.
  • S. A. Macskassy and F. Provost. A simple relational classifier. In Proceedings of the Second Workshop on Multi-Relational Data Mining (MRDM-2003) at KDD-2003, pages 64–76, 2003.
  • S. A. Macskassy and F. Provost. Classification in networked data: A toolkit and a univariate case study. The Journal of Machine Learning Research, 8:935–983, 2007.
  • T. Mikolov, K. Chen, G. Corrado, and J. Dean. Efficient estimation of word representations in vector space. CoRR, abs/1301.3781, 2013.
  • T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean. Distributed representations of words and phrases and their compositionality. In Advances in Neural Information Processing Systems 26, pages 3111–3119. 2013.
  • T. Mikolov, W.-t. Yih, and G. Zweig. Linguistic regularities in continuous space word representations. In Proceedings of NAACL-HLT, pages 746–751, 2013.
  • A. Mnih and G. E. Hinton. A scalable hierarchical distributed language model. Advances in Neural Information Processing Systems, 21:1081–1088, 2009.
  • F. Morin and Y. Bengio. Hierarchical probabilistic neural network language model. In Proceedings of the International Workshop on Artificial Intelligence and Statistics, pages 246–252, 2005.
  • J. Neville and D. Jensen. Iterative classification in relational data. In Proc. AAAI-2000 Workshop on Learning Statistical Models from Relational Data, pages 13–20, 2000.
  • J. Neville and D. Jensen. Leveraging relational autocorrelation with latent group models. In Proceedings of the 4th International Workshop on Multi-relational Mining, MRDM '05, pages 49–55, New York, NY, USA, 2005. ACM.
  • J. Neville and D. Jensen. A bias/variance decomposition for models using collective inference. Machine Learning, 73(1):87–106, 2008.
  • M. E. Newman. Modularity and community structure in networks. Proceedings of the National Academy of Sciences, 103(23):8577–8582, 2006.
  • B. Recht, C. Re, S. Wright, and F. Niu. Hogwild!: A lock-free approach to parallelizing stochastic gradient descent. In Advances in Neural Information Processing Systems 24, pages 693–701. 2011.
  • P. Sen, G. Namata, M. Bilgic, L. Getoor, B. Galligher, and T. Eliassi-Rad. Collective classification in network data. AI Magazine, 29(3):93, 2008.
  • D. A. Spielman and S.-H. Teng. Nearly-linear time algorithms for graph partitioning, graph sparsification, and solving linear systems. In Proceedings of the Thirty-Sixth Annual ACM Symposium on Theory of Computing, pages 81–90. ACM, 2004.
  • L. Tang and H. Liu. Relational learning via latent social dimensions. In Proceedings of the 15th ACM SIGKDD, KDD '09, pages 817–826, New York, NY, USA, 2009. ACM.
  • L. Tang and H. Liu. Scalable learning of collective behavior based on sparse social dimensions. In Proceedings of the 18th ACM Conference on Information and Knowledge Management, pages 1107–1116. ACM, 2009.
  • L. Tang and H. Liu. Leveraging social media networks for classification. Data Mining and Knowledge Discovery, 23(3):447–478, 2011.
  • S. Vishwanathan, N. N. Schraudolph, R. Kondor, and K. M. Borgwardt. Graph kernels. The Journal of Machine Learning Research, 99:1201–1242, 2010.
  • X. Wang and G. Sukthankar. Multi-label relational neighbor classification using social context features. In Proceedings of the 19th ACM SIGKDD, pages 464–472. ACM, 2013.
  • W. Zachary. An information flow model for conflict and fission in small groups. Journal of Anthropological Research, 33(4):452–473, 1977.