# Incomplete Network Alignment: Problem Definitions and Fast Solutions

ACM Transactions on Knowledge Discovery from Data, pp. 1-26, 2020.

Keywords:

Incomplete network alignment, low rank, network completion

Abstract:

Networks are prevalent in many areas and are often collected from multiple sources. However, due to the veracity characteristics, more often than not, networks are incomplete. Network alignment and network completion have become two fundamental cornerstones behind a wealth of high-impact graph mining applications. The state-of-the-art hav...

Introduction

- Networks are prevalent and naturally appear in many areas. More often than not, in the big data era, networks in many high-impact applications are collected from multiple sources (i.e., variety), such as social networks from different social platforms, protein–protein interaction (PPI) networks from multiple tissues, and transaction networks from multiple financial institutes.
- In order to integrate the considerable information associated with multiple networks, network alignment is of key importance to find the node correspondence across networks.
- By aligning the same users in different transaction networks, the transaction patterns of users can be comprehended to enhance the financial fraud detection.
- Network completion has become another key task that, if handled properly, benefits many graph mining applications by providing higher-quality networks.

Highlights

- Networks are prevalent and naturally appear in many areas
- In the big data era, networks in many high-impact applications are collected from multiple sources, such as social networks from different social platforms, protein–protein interaction (PPI) networks from multiple tissues, and transaction networks from multiple financial institutes
- In order to integrate the considerable information associated with multiple networks, network alignment is of key importance to find the node correspondence across networks
- The multi-sourced and incomplete characteristics often co-exist in many real networks, yet the state of the art has largely addressed the network alignment and network completion problems in parallel
- The empirical evaluations demonstrate the effectiveness and efficiency of the proposed iNeAt algorithm. It (1) improves the alignment accuracy by up to 30% over the existing network alignment methods while also leading to a better imputation outcome; and (2) achieves a good quality-speed balance and scales linearly w.r.t. the number of nodes in the networks
- The proposed algorithm is proved to converge to the KKT fixed point with a linear complexity in both time and space

Methods

- The authors evaluate the effectiveness and efficiency of the proposed algorithm by extensive experiments.
- The rest of the article is organized as follows.
- Section 2 defines the incomplete network alignment problem and provides some preliminaries of the article.
- Section 3 presents the proposed optimization formulation of iNeAt and Section 4 gives an effective optimization algorithm, followed by some analyses.
- Section 5 presents the experimental results.
- Related work and the conclusion are given in Sections 6 and 7, respectively.

Results

- The authors present the experimental results of the proposed algorithm, iNeAt, and evaluate it in the following two aspects.
- Effectiveness: How accurate is the algorithm at aligning incomplete networks? How effective is it at recovering missing edges by leveraging the alignment result?
- Efficiency: How fast and scalable is the algorithm?

Conclusion

- In the era of big data, the multi-sourced and incomplete characteristics often co-exist in many real networks.
- The empirical evaluations demonstrate the effectiveness and efficiency of the proposed iNeAt algorithm.
- It (1) improves the alignment accuracy by up to 30% over the existing network alignment methods while also leading to a better imputation outcome; and (2) achieves a good quality-speed balance and scales linearly w.r.t. the number of nodes in the networks.
- Future work includes extending the algorithm to handle attributed networks and exploring other ways to leverage prior knowledge.

- Table 1: Symbols and Notations
- Table 2: Statistics of Datasets

Related work

- Network Alignment. In general, network alignment falls into two categories: local alignment and global alignment. Local network alignment aims to uncover the alignment among small regions across multiple networks, such as motifs and small subgraphs. Some recent works include [4, 26, 30]. Nevertheless, local network alignment might be too restrictive to effectively find the node correspondence. On the other hand, many global network alignment algorithms, which aim to find the node alignment, are based on topology consistency. For example, one early well-known approach, IsoRank, computes the cross-network pairwise topology similarities by propagating the similarities of neighboring node pairs, and it is shown that this can be formulated as a random-walk propagation procedure in the Kronecker product graph [34]. In addition, IsoRankN [20] extends the original IsoRank algorithm by using PageRank-Nibble [1] to align multiple networks. BigAlign [16] and UMA [42] both assume that one network is a noisy permutation of the other network, whereas UMA [42] is generalized to align multiple networks by adding transitivity constraints. NetAlign formulates the network alignment problem as an optimization problem that maximizes the number of aligned neighboring node pairs [3] and solves it with a belief-propagation heuristic.
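
The IsoRank-style propagation mentioned above can be illustrated with a minimal sketch. This is not the implementation from [34] or from iNeAt; the function name `isorank_sketch`, the damping factor `alpha`, the uniform prior `H`, and the toy graphs are all assumptions made for illustration. Each update replaces the similarity of a node pair with the degree-normalized sum of its neighboring pairs' similarities, which in vectorized form is one random-walk step (with restart) on the Kronecker product graph.

```python
import numpy as np


def isorank_sketch(A1, A2, H=None, alpha=0.85, iters=200, tol=1e-12):
    """Hypothetical helper: IsoRank-style similarity propagation.

    A1, A2 : dense adjacency matrices (n1 x n1 and n2 x n2).
    H      : optional prior similarity matrix (n1 x n2); uniform if omitted.
    Returns an n1 x n2 matrix R of cross-network node similarities.
    """
    A1 = np.asarray(A1, dtype=float)
    A2 = np.asarray(A2, dtype=float)
    n1, n2 = A1.shape[0], A2.shape[0]

    # Column-normalize each adjacency matrix: W = A D^{-1}.
    # Isolated nodes (degree 0) keep an all-zero column.
    d1 = A1.sum(axis=0)
    d2 = A2.sum(axis=0)
    W1 = A1 / np.where(d1 > 0, d1, 1.0)
    W2 = A2 / np.where(d2 > 0, d2, 1.0)

    if H is None:
        H = np.full((n1, n2), 1.0 / (n1 * n2))

    R = H.copy()
    for _ in range(iters):
        # Matrix form of vec(R) <- alpha * (W2 kron W1) vec(R) + (1 - alpha) * vec(H),
        # i.e. a damped random-walk step on the Kronecker product graph.
        R_new = alpha * (W1 @ R @ W2.T) + (1.0 - alpha) * H
        converged = np.abs(R_new - R).sum() < tol
        R = R_new
        if converged:
            break
    return R


# Toy example: a 3-node path (center = node 1) and a relabeled copy (center = node 0).
A1 = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
A2 = np.array([[0, 1, 1], [1, 0, 0], [1, 0, 0]])
R = isorank_sketch(A1, A2)
best_pair = np.unravel_index(R.argmax(), R.shape)  # most similar cross-network pair
```

In this toy case the highest score pairs the two center nodes. The sketch uses dense matrices, so it costs O(n1·n2) memory; it is meant only to show the propagation rule, not the linear-complexity approach the article describes.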

Funding

- This work is supported by the National Science Foundation under grants No. 1947135 and 1715385, by the NSF Program on Fairness in AI in collaboration with Amazon under award No. 1939725, by the United States Air Force and DARPA under contract number FA8750-17-C-0153, and by the Department of Homeland Security under Grant Award Number 2017-ST061-QA0001.

Reference

- Reid Andersen, Fan Chung, and Kevin Lang. 2006. Local graph partitioning using PageRank vectors. In Proceedings of the 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS’06). IEEE, 475–486.
- Nicola Barbieri, Francesco Bonchi, and Giuseppe Manco. 2014. Who to follow and why: Link prediction with explanations. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 1266–1275.
- Mohsen Bayati, David F. Gleich, Amin Saberi, and Ying Wang. 2013. Message-passing algorithms for sparse network alignment. ACM Transactions on Knowledge Discovery from Data 7, 1 (2013), 3.
- Johannes Berg and Michael Lässig. 2004. Local graph alignment and motif search in biological networks. Proceedings of the National Academy of Sciences of the United States of America 101, 41 (2004), 14689–14694.
- Nicolas Boumal and Pierre-Antoine Absil. 2011. RTRMC: A Riemannian trust-region method for low-rank matrix completion. In Proceedings of the Advances in Neural Information Processing Systems. 406–414.
- Ulrik Brandes. 2008. On variants of shortest-path betweenness centrality and their generic computation. Social Networks 30, 2 (2008), 136–145.
- Jian-Feng Cai, Emmanuel J. Candès, and Zuowei Shen. 2010. A singular value thresholding algorithm for matrix completion. SIAM Journal on Optimization 20, 4 (2010), 1956–1982.
- Zheng Chen, Xinli Yu, Bo Song, Jianliang Gao, Xiaohua Hu, and Wei-Shih Yang. 2017. Community-based network alignment for large attributed network. In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management. ACM, 587–596.
- Chris Ding, Xiaofeng He, and Horst D. Simon. 2005. On the equivalence of nonnegative matrix factorization and spectral clustering. In Proceedings of the 2005 SIAM International Conference on Data Mining. SIAM, 606–610.
- Boxin Du and Hanghang Tong. 2018. FASTEN: Fast sylvester equation solver for graph mining. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 1339–1347.
- Boxin Du and Hanghang Tong. 2019. MrMine: Multi-resolution Multi-network embedding. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management. 479–488.
- Mohammed El-Kebir, Jaap Heringa, and Gunnar W. Klau. 2015. Natalie 2.0: Sparse global network alignment as a special case of quadratic assignment. Algorithms 8, 4 (2015), 1035–1051.
- Somaye Hashemifar and Jinbo Xu. 2014. Hubalign: An accurate and efficient method for global alignment of protein–protein interaction networks. Bioinformatics 30, 17 (2014), i438–i444.
- Mark Heimann, Haoming Shen, and Danai Koutra. 2018. Node representation learning for multiple networks: The case of graph alignment. arXiv preprint arXiv:1802.06257 (2018).
- Xiangnan Kong, Jiawei Zhang, and Philip S. Yu. 2013. Inferring anchor links across multiple heterogeneous social networks. In Proceedings of the 22nd ACM International Conference on Information & Knowledge Management. ACM, 179–188.
- Danai Koutra, Hanghang Tong, and David Lubensky. 2013. Big-align: Fast bipartite graph alignment. In Proceedings of the 2013 IEEE 13th International Conference on Data Mining. IEEE, 389–398.
- Jure Leskovec, Deepayan Chakrabarti, Jon Kleinberg, Christos Faloutsos, and Zoubin Ghahramani. 2010. Kronecker graphs: An approach to modeling networks. Journal of Machine Learning Research 11, Feb (2010), 985–1042.
- Jure Leskovec, Jon Kleinberg, and Christos Faloutsos. 2007. Graph evolution: Densification and shrinking diameters. ACM Transactions on Knowledge Discovery from Data 1, 1 (2007), 2.
- Jure Leskovec and Julian J. McAuley. 2012. Learning to discover social circles in ego networks. In Proceedings of the Advances in Neural Information Processing Systems. 539–547.
- Chung-Shou Liao, Kanghao Lu, Michael Baym, Rohit Singh, and Bonnie Berger. 2009. IsoRankN: Spectral methods for global alignment of multiple protein networks. Bioinformatics 25, 12 (2009), i253–i258.
- David Liben-Nowell and Jon Kleinberg. 2007. The link-prediction problem for social networks. Journal of the Association for Information Science and Technology 58, 7 (2007), 1019–1031.
- Ji Liu, Przemyslaw Musialski, Peter Wonka, and Jieping Ye. 2013. Tensor completion for estimating missing values in visual data. IEEE Transactions on Pattern Analysis and Machine Intelligence 35, 1 (2013), 208–220.
- Li Liu, William K. Cheung, Xin Li, and Lejian Liao. 2016. Aligning users across social networks using network embedding. In Proceedings of the 25th International Joint Conference on Artificial Intelligence. 1774–1780.
- Yuanyuan Liu, Fanhua Shang, Hong Cheng, James Cheng, and Hanghang Tong. 2014. Factor matrix trace norm minimization for low-rank tensor completion. In Proceedings of the 2014 SIAM International Conference on Data Mining. SIAM, 866–874.
- Noël Malod-Dognin and Nataša Pržulj. 2015. L-GRAAL: Lagrangian graphlet-based network aligner. Bioinformatics 31, 13 (2015), 2182–2189.
- Hazel N. Manners, Ahed Elmsallati, Pietro H. Guzzi, Swarup Roy, and Jugal K. Kalita. 2017. Performing local network alignment by ensembling global aligners. In Proceedings of the 2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM’17). IEEE, 1316–1323.
- Farzan Masrour, Iman Barjesteh, Rana Forsati, Abdol-Hossein Esfahanian, and Hayder Radha. 2015. Network completion with node similarity: A matrix completion approach with provable guarantees. In Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM’15). IEEE, 302–307.
- Aditya Krishna Menon and Charles Elkan. 2011. Link prediction via matrix factorization. In Proceedings of the Joint European Conference on Machine Learning and Knowledge Discovery in Databases. Springer, 437–452.
- Kurt Miller, Michael I. Jordan, and Thomas L. Griffiths. 2009. Nonparametric latent feature models for link prediction. In Proceedings of the Advances in Neural Information Processing Systems. 1276–1284.
- Marco Mina and Pietro Hiram Guzzi. 2014. Improving the robustness of local network alignment: Design and extensive assessment of a Markov clustering-based approach. IEEE/ACM Transactions on Computational Biology and Bioinformatics 11, 3 (2014), 561–572.
- K. B. Petersen, M. S. Pedersen, and others. 2008. The matrix cookbook, vol 7. Technical University of Denmark 15 (2008).
- Benjamin Recht. 2011. A simpler approach to matrix completion. Journal of Machine Learning Research 12, Dec (2011), 3413–3430.
- Jasson D. M. Rennie and Nathan Srebro. 2005. Fast maximum margin matrix factorization for collaborative prediction. In Proceedings of the 22nd International Conference on Machine Learning. ACM, 713–719.
- Rohit Singh, Jinbo Xu, and Bonnie Berger. 2008. Global alignment of multiple protein interaction networks with application to functional orthology detection. Proceedings of the National Academy of Sciences 105, 35 (2008), 12763–12768.
- Sucheta Soundarajan, Tina Eliassi-Rad, Brian Gallagher, and Ali Pinar. 2016. MaxReach: Reducing network incompleteness through node probes. In Proceedings of the 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM’16). IEEE, 152–157.
- Kim-Chuan Toh and Sangwoon Yun. 2010. An accelerated proximal gradient algorithm for nuclear norm regularized linear least squares problems. Pacific Journal of Optimization 6, 615–640 (2010), 15.
- Rianne van den Berg, Thomas N. Kipf, and Max Welling. 2017. Graph convolutional matrix completion. arXiv preprint arXiv:1706.02263.
- Vipin Vijayan, Vikram Saraph, and T. Milenković. 2015. MAGNA++: Maximizing accuracy in global network alignment via both node and edge conservation. Bioinformatics 31, 14 (2015), 2409–2411.
- Jaewon Yang and Jure Leskovec. 2015. Defining and evaluating network communities based on ground-truth. Knowledge and Information Systems 42, 1 (2015), 181–213.
- Reza Zafarani and Huan Liu. 2013. Connecting users across social media sites: A behavioral-modeling approach. In Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 41–49.
- Jiawei Zhang, Jianhui Chen, Shi Zhi, Yi Chang, Philip S. Yu, and Jiawei Han. 2017. Link prediction across aligned networks with sparse and low rank matrix estimation. In Proceedings of the 2017 IEEE 33rd International Conference on Data Engineering (ICDE’17). IEEE, 971–982.
- Jiawei Zhang and Philip S. Yu. 2015. Multiple anonymized social networks alignment. In Proceedings of the 2015 IEEE International Conference on Data Mining (ICDM’15). IEEE, 599–608.
- Si Zhang and Hanghang Tong. 2016. Final: Fast attributed network alignment. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM.
- Si Zhang and Hanghang Tong. 2018. Attributed network alignment: Problem definitions and fast solutions. IEEE Transactions on Knowledge and Data Engineering 31, 9 (2018), 1680–1692.
- Si Zhang, Hanghang Tong, Jiejun Xu, Yifan Hu, and Ross Maciejewski. 2019. Origin: Non-rigid network alignment. In Proceedings of the 2019 IEEE International Conference on Big Data. IEEE.
- Yutao Zhang, Jie Tang, Zhilin Yang, Jian Pei, and Philip S. Yu. 2015. Cosnet: Connecting heterogeneous social networks with local and global consistency. In Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 1485–1494.

Received May 2018; revised November 2019; accepted February 2020
