COSNET: Connecting Heterogeneous Social Networks with Local and Global Consistency
Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1485-1494 (2015)
Abstract
More often than not, people are active in more than one social network. Identifying users from multiple heterogeneous social networks and integrating the different networks is a fundamental issue in many applications. The existing methods tackle this problem by estimating pairwise similarity between users in two networks. However, those m...
Introduction
- The paper opens by describing an era of online with offline (OWO): almost everyone uses online social networks to connect with friends or, more generally, to satisfy social needs at different levels [25].
- Many users participate in more than one social network, such as public networks and private networks, as well as business networks and family networks.
- Preliminary statistics show that the average number of social networks in which a user participates is eight.
- The intentions behind these choices are sophisticated.
- Users generate heterogeneous content and build different ego-networks in different social networks.
- One interesting and important question is: can the different heterogeneous social networks be integrated automatically?
Highlights
- We are facing an era of online with offline (OWO)—almost everyone is using online social networks to connect friends or, more generally, to satisfy social needs at different levels [25]
- We propose COSNET (COnnecting heterogeneous Social NETworks with local and global consistency), a novel energy-based model, to address this problem by considering both local and global consistency among multiple networks
- We demonstrate that applying the integration results produced by our method can improve the accuracy of expert finding, an important task in social networks
- The SNS data collection consists of five popular online social networking sites: Twitter (TW), LiveJournal (LJ), Flickr (FL), Last.fm (LA), and MySpace (MS)
- We study the problem of multiple social network integration
- We develop an efficient model learning algorithm based on dual decomposition which can be parallelized
Methods
- SVM: This method formalizes the matching problem as a classification problem. It trains a classification model on the labeled data and applies that model to decide whether a candidate matching is correct.
- SiGMa [18]: This method was designed to align two knowledge bases by propagating confidence scores in the matching graph.
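The confidence-propagation idea behind the SiGMa baseline can be illustrated with a small greedy matcher. This is a simplified sketch under stated assumptions, not the scoring function or implementation of [18]: the function names, the `alpha` weight, the `threshold`, the toy usernames, and the use of `difflib` string similarity are all hypothetical choices made for illustration. Starting from a trusted seed pair, each accepted match raises the score of neighbouring candidate pairs, and the best-scoring candidate is accepted next.

```python
import heapq
from difflib import SequenceMatcher


def name_sim(a, b):
    """String similarity in [0, 1] between two usernames (illustrative choice)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def greedy_match(names1, names2, adj1, adj2, seeds, alpha=0.5, threshold=0.5):
    """Greedy one-to-one matching with score propagation (hypothetical sketch).

    names1/names2: node -> username; adj1/adj2: node -> set of neighbours
    (assumed symmetric); seeds: trusted (node1, node2) pairs.
    """
    matched = {}   # node in network 1 -> node in network 2
    used2 = set()  # nodes of network 2 that are already matched
    heap = []      # max-heap emulated with negated scores

    def score(u, v):
        # Count neighbours of u whose match is a neighbour of v.
        agree = sum(1 for n in adj1[u] if matched.get(n) in adj2[v])
        denom = max(len(adj1[u]), len(adj2[v]), 1)
        return (1 - alpha) * name_sim(names1[u], names2[v]) + alpha * agree / denom

    def accept(u, v):
        matched[u] = v
        used2.add(v)
        # Propagate: unmatched neighbour pairs become (re-scored) candidates.
        for n1 in adj1[u]:
            for n2 in adj2[v]:
                if n1 not in matched and n2 not in used2:
                    heapq.heappush(heap, (-score(n1, n2), n1, n2))

    for u, v in seeds:
        accept(u, v)
    while heap:
        neg_score, u, v = heapq.heappop(heap)
        if u in matched or v in used2:
            continue  # one side was matched after this entry was pushed
        if -neg_score < threshold:
            break     # every remaining candidate scores lower
        accept(u, v)
    return matched


# Toy data: two small networks with one seed pair.
names1 = {"t1": "alice_w", "t2": "bob.k", "t3": "carol99"}
names2 = {"f1": "alice_w", "f2": "bobk", "f3": "c4rol", "f4": "dave"}
adj1 = {"t1": {"t2"}, "t2": {"t1", "t3"}, "t3": {"t2"}}
adj2 = {"f1": {"f2"}, "f2": {"f1", "f3", "f4"}, "f3": {"f2"}, "f4": {"f2"}}

result = greedy_match(names1, names2, adj1, adj2, seeds=[("t1", "f1")])
print(result)
```

In this toy run the pair (`t3`, `f3`) would not clear the threshold on name similarity alone; it is accepted only because its neighbours `t2` and `f2` were matched first, which is the propagation effect the sketch is meant to show.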
Results
- The authors perform experiments on two data collections: SNS and Academia.
- Each data collection consists of several social networks.
- Within each data collection, both the users and the meanings of the relationships differ substantially across networks.
- Table 3 lists statistics of the two data collections.
- SNS network collection: The SNS data collection consists of five popular online social networking sites: Twitter (TW), LiveJournal (LJ), Flickr (FL), Last.fm (LA), and MySpace (MS).
Conclusion
- The authors study the problem of multiple social network integration.
- The authors precisely define the problem, and propose a novel energy-based framework COSNET to address it.
- The authors develop an efficient model learning algorithm based on dual decomposition which can be parallelized.
- The authors' experimental results on two different genres of data sets validate the effectiveness and efficiency of the proposed framework.
- The authors further validate the effectiveness of the method by applying the integrated results to support expert finding, an important application.
- The authors thank Shlomo Berkovsky, Terence Chen, and Dali Kaafar for sharing the linked accounts data in this research
Tables
- Table 1: Notations.
- Table 2: Performance comparison of different methods for the network integration task. The results are presented jointly and separately for each pair of networks.
- Table 3: Statistics of the two datasets. SNS consists of five networks and Academia consists of three networks.
Related work
- We briefly review related literature from two aspects: connecting users across social networks, and entity linking. Connecting users across social networks: A straightforward way to connect users in different networks is to leverage usernames [38, 27, 39, 22]. Zafarani et al. [38] were the first to address this problem. Their approach used prefix/postfix addition and removal to map usernames from a base community to a target community. Perito et al. [27] estimated the uniqueness of usernames with a Markov chain model. The more recent work by Zafarani et al. [39] investigated the problem in greater depth. They defined sophisticated features, such as knowledge limitation and typing patterns, to model how users behave when selecting usernames. Liu et al. [22] leveraged rare usernames to create training instances for user identification. Liu et al. [23] explored the social identity linkage problem based on user behavior modeling. They proposed a method that incorporates user attributes, user-generated content, and social activities to link user accounts in different social networks.
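The prefix/postfix idea mentioned above can be made concrete with a short sketch. It is only an illustration of generating username variants by adding or removing affixes, not the procedure or feature set of [38]: the affix list, the function names, and the example usernames are all assumptions.

```python
def username_variants(name, affixes=("_", "x", "the", "1", "123")):
    """Generate candidate usernames by adding or removing short affixes.

    The affix list is a hypothetical example, not taken from the cited work.
    """
    base = name.lower()
    variants = {base}
    for affix in affixes:
        variants.add(affix + base)   # prefix addition
        variants.add(base + affix)   # postfix addition
        if base.startswith(affix) and len(base) > len(affix):
            variants.add(base[len(affix):])    # prefix removal
        if base.endswith(affix) and len(base) > len(affix):
            variants.add(base[:-len(affix)])   # postfix removal
    return variants


def candidate_matches(name, target_usernames):
    """Return usernames in the target community that equal some variant of `name`."""
    variants = username_variants(name)
    return sorted(u for u in target_usernames if u.lower() in variants)


# Map a username from a base community to candidates in a target community.
targets = ["jsmith", "the_jsmith", "JSmith123", "kwang"]
candidates = candidate_matches("jsmith123", targets)
print(candidates)
```

Exact matching on variants is deliberately strict: `the_jsmith` is not returned because it differs from every generated variant. This strictness is one reason such rules are usually combined with softer similarity features.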
Funding
- The work is supported by the National High-tech R&D Program (No. 2014AA015103), the National Basic Research Program of China (No. 2014CB340506, No. 2012CB316006), NSFC (No. 61222212), NSFC-ANR (No. 61261130588), the National Social Science Foundation of China (No. 13&ZD190), the Tsinghua University Initiative Scientific Research Program (20121088096), a research fund supported by Huawei Inc., and the Beijing Key Lab of Networked Multimedia.
References
- R. K. Ahuja, T. L. Magnanti, and J. B. Orlin. Network Flows: Theory, Algorithms, and Applications. Prentice Hall, 1993.
- K. M. Anstreicher and L. A. Wolsey. Two "well-known" properties of subgradient optimization. Mathematical Programming, 120(1):213–220, 2009.
- L. Backstrom, C. Dwork, and J. M. Kleinberg. Wherefore art thou r3579x?: anonymized social networks, hidden patterns, and structural steganography. In WWW’07, pages 181–190, 2007.
- X. Bai, F. P. Junqueira, and S. H. Sengamedu. Exploiting user clicks for automatic seed set generation for entity matching. In KDD’13, pages 980–988, 2013.
- K. Bellare, S. Iyengar, A. G. Parameswaran, and V. Rastogi. Active sampling for entity matching. In KDD’12, pages 1131–1139, 2012.
- I. Bhattacharya and L. Getoor. Collective entity resolution in relational data. ACM Transactions on Knowledge Discovery from Data, 1(1):1–36, March 2007.
- C. Buckley and E. M. Voorhees. Retrieval evaluation with incomplete information. In SIGIR’2004, pages 25–32, 2004.
- W. Chen, Z. Liu, X. Sun, and Y. Wang. A game-theoretic framework to identify overlapping communities in social networks. Data Mining and Knowledge Discovery, 21(2):224–240, 2010.
- W. W. Cohen, P. Ravikumar, and S. E. Fienberg. A comparison of string metrics for matching names and records. In Proceedings of the IJCAI-2003 Workshop on Information Integration on the Web, pages 73–78, 2003.
- S. Cucerzan. Large-scale named entity disambiguation based on wikipedia data. In EMNLP-CoNLL’07, volume 6, pages 708–716, 2007.
- Y. Cui, J. Pei, G. Tang, W.-S. Luk, D. Jiang, and M. Hua. Finding email correspondents in online social networks. World Wide Web, 16(2):195–218, 2013.
- R. Herbrich, T. Graepel, and K. Obermayer. Large margin rank boundaries for ordinal regression. MIT Press, Cambridge, MA, 2000.
- S. Kataria, K. S. Kumar, R. Rastogi, P. Sen, and S. H. Sengamedu. Entity disambiguation with hierarchical topic models. In KDD’11, pages 1037–1045, 2011.
- N. Komodakis. Efficient training for pairwise or higher order crfs via dual decomposition. In CVPR’11, pages 1841–1848, 2011.
- N. Komodakis, N. Paragios, and G. Tziritas. Mrf energy minimization and beyond via dual decomposition. IEEE Trans. Pattern Anal. Mach. Intell., 2011.
- X. Kong, J. Zhang, and P. S. Yu. Inferring anchor links across multiple heterogeneous social networks. In CIKM’13, pages 179–188, 2013.
- H. Kwak, C. Lee, H. Park, and S. B. Moon. What is twitter, a social network or a news media? In WWW’10, pages 591–600, 2010.
- S. Lacoste-Julien, K. Palla, A. Davies, G. Kasneci, T. Graepel, and Z. Ghahramani. Sigma: Simple greedy matching for aligning large knowledge bases. In KDD’13, pages 572–580, 2013.
- Y. LeCun, S. Chopra, and R. Hadsell. A tutorial on energy-based learning. 2006 CIAR Summer School: Neural Computation & Adaptive Perception, 2006.
- J. Li, J. Tang, Y. Li, and Q. Luo. Rimom: A dynamic multi-strategy ontology alignment framework. IEEE TKDE, 21(8):1218–1232, 2009.
- Y. Li, C. Wang, F. Han, J. Han, D. Roth, and X. Yan. Mining evidences for named entity disambiguation. In KDD’13, pages 1070–1078, 2013.
- J. Liu, F. Zhang, X. Song, Y.-I. Song, C.-Y. Lin, and H.-W. Hon. What’s in a name?: an unsupervised approach to link users across communities. In WSDM’13, pages 495–504, 2013.
- S. Liu, S. Wang, F. Zhu, J. Zhang, and R. Krishnan. Hydra: Large-scale social identity linkage via heterogeneous behavior modeling. In SIGMOD’14, pages 51–62, 2014.
- H. Ma, H. Yang, M. R. Lyu, and I. King. Sorec: social recommendation using probabilistic matrix factorization. In CIKM’08, pages 931–940, 2008.
- A. Maslow. A theory of human motivation. Psychological Review, 50(4):370–396, 1943.
- A. Narayanan and V. Shmatikov. De-anonymizing social networks. In IEEE Symposium on Security and Privacy’09, pages 173–187, 2009.
- D. Perito, C. Castelluccia, M. A. Kaafar, and P. Manils. How unique and traceable are usernames? In Privacy Enhancing Technologies, pages 1–17, 2011.
- W. Shen, J. Wang, P. Luo, and M. Wang. Linking named entities in tweets with knowledge base via user interest modeling. In KDD’13, pages 68–76, 2013.
- S. Tan, Z. Guan, D. Cai, X. Qin, J. Bu, and C. Chen. Mapping users across networks by manifold alignment on hypergraph. In AAAI’14, pages 159–165, 2014.
- J. Tang, A. Fong, B. Wang, and J. Zhang. A unified probabilistic framework for name disambiguation in digital library. IEEE TKDE, 24(6):975–987, 2012.
- J. Tang, H. Gao, H. Liu, and A. D. Sarma. eTrust: Understanding trust evolution in an online world. In KDD’12, pages 253–261, 2012.
- J. Tang, J. Zhang, L. Yao, J. Li, L. Zhang, and Z. Su. Arnetminer: Extraction and mining of academic social networks. In KDD’08, pages 990–998, 2008.
- W. Tang, J. Tang, T. Lei, C. Tan, B. Gao, and T. Li. On optimization of expertise matching with various constraints. Neurocomputing, 76(1):71–83, 2012.
- B. Taskar, C. Guestrin, and D. Koller. Max-margin markov networks. NIPS’04, 16, 2004.
- H. Whitney. Congruent graphs and the connectivity of graphs. American Journal of Mathematics, 54(1):150–168, 1932.
- S. Wu, J. M. Hofman, W. A. Mason, and D. J. Watts. Who says what to whom on twitter. In WWW’11, pages 705–714, 2011.
- L. Yartseva and M. Grossglauser. On the performance of percolation graph matching. In COSN’13, pages 119–130, 2013.
- R. Zafarani and H. Liu. Connecting corresponding identities across communities. In ICWSM’09, pages 354–357, 2009.
- R. Zafarani and H. Liu. Connecting users across social media sites: A behavioral-modeling approach. In KDD’13, pages 41–49, 2013.
- J. Zhang, J. Tang, and J. Li. Expert finding in a social network. In DASFAA’07, pages 1066–1069, 2007.