COSNET: Connecting Heterogeneous Social Networks with Local and Global Consistency

Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1485-1494 (2015)

Cited: 278 | Views: 654

Abstract

More often than not, people are active in more than one social network. Identifying users from multiple heterogeneous social networks and integrating the different networks is a fundamental issue in many applications. The existing methods tackle this problem by estimating pairwise similarity between users in two networks. However, those m...

Introduction
  • This is an era of online with offline (OWO): almost everyone uses online social networks to connect with friends or, more generally, to satisfy social needs at different levels [25].
  • Many users participate in more than one social network, such as public networks and private networks, as well as business networks and family networks.
  • Preliminary statistics show that the average number of social networks in which a user participates is eight.
  • The intentions behind these choices are sophisticated.
  • Users generate heterogeneous content and build different ego-networks in different social networks.
  • One interesting and important question is: can the different heterogeneous social networks be integrated automatically?
Highlights
  • We are facing an era of online with offline (OWO)—almost everyone is using online social networks to connect friends or, more generally, to satisfy social needs at different levels [25]
  • We propose COSNET (COnnecting heterogeneous Social NETworks with local and global consistency), a novel energy-based model, to address this problem by considering both local and global consistency among multiple networks
  • We demonstrate that applying the integration results produced by our method can improve the accuracy of expert finding, an important task in social networks
  • We study the problem of multiple social network integration
  • We develop an efficient model learning algorithm based on dual decomposition which can be parallelized
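The highlights do not spell out what "global consistency" means. One plausible reading, stated here as an assumption, is transitivity across networks: if a user in network A is matched to a user in B, and that user in B is matched to a user in C, then the direct A-to-C matching should agree. A minimal Python sketch of such a check follows; the function name, the dictionary layout, and the account names are hypothetical and not taken from the paper.

```python
def find_transitivity_violations(ab, bc, ac):
    """Return matchings where A->B->C disagrees with the direct A->C matching.

    ab, bc, ac are hypothetical dicts mapping a user id in the first
    network to the matched user id in the second network.
    """
    violations = []
    for a, b in ab.items():
        c_via_b = bc.get(b)    # where the chain A -> B -> C ends up
        c_direct = ac.get(a)   # where the direct A -> C matching points
        # Only flag cases where both paths exist and disagree.
        if c_via_b is not None and c_direct is not None and c_via_b != c_direct:
            violations.append((a, b, c_via_b, c_direct))
    return violations


# Toy pairwise matchings between three networks (illustrative data only).
ab = {"alice_tw": "alice_lj", "bob_tw": "bobby_lj"}
bc = {"alice_lj": "alice_fl", "bobby_lj": "bob_fl"}
ac = {"alice_tw": "alice_fl", "bob_tw": "robert_fl"}

violations = find_transitivity_violations(ab, bc, ac)
print(violations)
```

A purely pairwise method can produce inconsistent matchings like the one flagged for bob_tw; a model that considers all networks jointly is one way to avoid them.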
Methods
  • SVM: This method formalizes the matching problem as a classification problem. It trains a classification model on the labeled data and applies it to classify whether a candidate matching is correct.
  • SiGMa [18]: This method was designed to align two knowledge bases, by propagating the confidence score in the matching graph.
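The classification formulation used by the SVM baseline can be sketched as follows. This is a minimal illustration, not the authors' setup: a perceptron stands in for the SVM so the example needs no external library, and the two features per candidate pair (a username similarity and a neighbor-overlap score) as well as all the numbers are hypothetical.

```python
def train_perceptron(samples, max_epochs=1000):
    """Train a linear classifier on (features, label) pairs, label in {+1, -1}.

    A perceptron is used as a lightweight stand-in for an SVM; both learn
    a linear decision boundary over candidate-pair features.
    """
    dim = len(samples[0][0])
    w = [0.0] * dim
    b = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in samples:
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * score <= 0:          # misclassified (or on the boundary)
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
                mistakes += 1
        if mistakes == 0:               # every training pair is classified correctly
            break
    return w, b


def is_match(w, b, x):
    """Classify one candidate matching from its feature vector."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b > 0


# Hypothetical labeled candidate pairs:
# (username similarity, neighbor overlap) -> +1 same person, -1 different people.
labeled = [
    ((0.9, 0.8), +1), ((0.8, 0.6), +1), ((1.0, 0.5), +1),
    ((0.2, 0.1), -1), ((0.3, 0.0), -1), ((0.1, 0.3), -1),
]

w, b = train_perceptron(labeled)
print(is_match(w, b, (0.95, 0.7)))  # decision for a new candidate pair
```

Each candidate pair is judged independently here, which is the pairwise view that the abstract attributes to existing methods.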
Results
  • The authors perform experiments on two data collections: SNS and Academia.
  • Each data collection consists of several social networks.
  • Within each data collection, the networks differ substantially in both their users and the meaning of their relationships.
  • Table 3 lists statistics of the two data collections.
  • SNS Network Collection. It consists of five popular online social networking sites: Twitter (TW), LiveJournal (LJ), Flickr (FL), Last.fm (LA), and MySpace (MS).
Conclusion
  • The authors study the problem of multiple social network integration.
  • The authors precisely define the problem, and propose a novel energy-based framework COSNET to address it.
  • The authors develop an efficient model learning algorithm based on dual decomposition which can be parallelized.
  • The authors' experimental results on two different genres of data sets validate the effectiveness and efficiency of the proposed framework.
  • The authors further validate the effectiveness of the method by applying the integrated results to support expert finding, an important application.
  • The authors thank Shlomo Berkovsky, Terence Chen, and Dali Kaafar for sharing the linked accounts data in this research
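The summary names dual decomposition as the basis of the learning algorithm but does not give the objective. As a generic sketch of the technique, not COSNET's specific formulation, an energy that splits into n terms can be optimized by giving each term its own copy of the variables and coordinating the copies through multipliers. The symbols E_i, x^i, \lambda_i, and the step size \alpha_t below are illustrative assumptions.

```latex
% Generic dual decomposition; all symbols are illustrative, not from the paper.
\begin{align}
\min_{x} \sum_{i=1}^{n} E_i(x)
  &= \min_{x,\,\{x^i\}} \sum_{i=1}^{n} E_i(x^i)
     \quad \text{s.t. } x^i = x \ \ \forall i, \\
g(\{\lambda_i\})
  &= \sum_{i=1}^{n} \min_{x^i} \bigl[ E_i(x^i) + \lambda_i^{\top} x^i \bigr],
     \qquad \sum_{i=1}^{n} \lambda_i = 0, \\
\lambda_i
  &\leftarrow \lambda_i + \alpha_t \Bigl( \hat{x}^i - \tfrac{1}{n} \sum_{j=1}^{n} \hat{x}^j \Bigr),
\end{align}
```

where \hat{x}^i is the minimizer of the i-th subproblem at the current multipliers. The n subproblems are independent given the multipliers, which is why a method of this kind can be parallelized.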
Tables
  • Table 1: Notations.
  • Table 2: Performance comparison of different methods for the network integration task. The results are presented jointly and separately for each pair of networks.
  • Table 3: Statistics of the two datasets. SNS consists of five networks and Academia consists of three networks.
Related work
  • We briefly review related literature from two aspects: connecting users across social networks, and entity linking.
  • Connecting Users across Social Networks. An immediate method to connect users in different networks is to leverage usernames [38, 27, 39, 22]. Zafarani et al. [38] were the first to address this problem. Their approach utilized prefix/postfix addition and removal to map usernames from a base community to a target community. Perito et al. [27] estimated the uniqueness of usernames with a Markov chain model. The more recent work by Zafarani et al. [39] investigated the problem in more depth. They defined sophisticated features, such as knowledge limitation and typing patterns, to model how users select usernames. Liu et al. [22] leveraged rare usernames to create training instances for user identification. Liu et al. [23] explored the social identity linkage problem based on user behavior modeling. They proposed a method that incorporates user attributes, user-generated content, and social activities to link user accounts in different social networks.
Funding
  • The work is supported by the National High-tech R&D Program (No. 2014AA015103), the National Basic Research Program of China (No. 2014CB340506, No. 2012CB316006), NSFC (No. 61222212), NSFC-ANR (No. 61261130588), the National Social Science Foundation of China (No. 13&ZD190), the Tsinghua University Initiative Scientific Research Program (20121088096), a research fund supported by Huawei Inc., and the Beijing Key Lab of Networked Multimedia.
Reference
  • [1] R. K. Ahuja, T. L. Magnanti, and J. B. Orlin. Network Flows: Theory, Algorithms, and Applications. Prentice Hall, 1993.
  • [2] K. M. Anstreicher and L. A. Wolsey. Two "well-known" properties of subgradient optimization. Mathematical Programming, 120(1):213–220, 2009.
  • [3] L. Backstrom, C. Dwork, and J. M. Kleinberg. Wherefore art thou r3579x?: anonymized social networks, hidden patterns, and structural steganography. In WWW’07, pages 181–190, 2007.
  • [4] X. Bai, F. P. Junqueira, and S. H. Sengamedu. Exploiting user clicks for automatic seed set generation for entity matching. In KDD’13, pages 980–988, 2013.
  • [5] K. Bellare, S. Iyengar, A. G. Parameswaran, and V. Rastogi. Active sampling for entity matching. In KDD’12, pages 1131–1139, 2012.
  • [6] I. Bhattacharya and L. Getoor. Collective entity resolution in relational data. ACM Transactions on Knowledge Discovery from Data, 1(1):1–36, March 2007.
  • [7] C. Buckley and E. M. Voorhees. Retrieval evaluation with incomplete information. In SIGIR’2004, pages 25–32, 2004.
  • [8] W. Chen, Z. Liu, X. Sun, and Y. Wang. A game-theoretic framework to identify overlapping communities in social networks. Data Mining and Knowledge Discovery, 21(2):224–240, 2010.
  • [9] W. W. Cohen, P. Ravikumar, and S. E. Fienberg. A comparison of string metrics for matching names and records. In Proceedings of the IJCAI-2003 Workshop on Information Integration on the Web, pages 73–78, 2003.
  • [10] S. Cucerzan. Large-scale named entity disambiguation based on Wikipedia data. In EMNLP-CoNLL’07, volume 6, pages 708–716, 2007.
  • [11] Y. Cui, J. Pei, G. Tang, W.-S. Luk, D. Jiang, and M. Hua. Finding email correspondents in online social networks. World Wide Web, 16(2):195–218, 2013.
  • [12] R. Herbrich, T. Graepel, and K. Obermayer. Large margin rank boundaries for ordinal regression. MIT Press, Cambridge, MA, 2000.
  • [13] S. Kataria, K. S. Kumar, R. Rastogi, P. Sen, and S. H. Sengamedu. Entity disambiguation with hierarchical topic models. In KDD’11, pages 1037–1045, 2011.
  • [14] N. Komodakis. Efficient training for pairwise or higher order CRFs via dual decomposition. In CVPR’11, pages 1841–1848, 2011.
  • [15] N. Komodakis, N. Paragios, and G. Tziritas. MRF energy minimization and beyond via dual decomposition. IEEE Trans. Pattern Anal. Mach. Intell., 2011.
  • [16] X. Kong, J. Zhang, and P. S. Yu. Inferring anchor links across multiple heterogeneous social networks. In CIKM’13, pages 179–188, 2013.
  • [17] H. Kwak, C. Lee, H. Park, and S. B. Moon. What is Twitter, a social network or a news media? In WWW’10, pages 591–600, 2010.
  • [18] S. Lacoste-Julien, K. Palla, A. Davies, G. Kasneci, T. Graepel, and Z. Ghahramani. SiGMa: Simple greedy matching for aligning large knowledge bases. In KDD’13, pages 572–580, 2013.
  • [19] Y. LeCun, S. Chopra, and R. Hadsell. A tutorial on energy-based learning. 2006 CIAR Summer School: Neural Computation & Adaptive Perception, 2006.
  • [20] J. Li, J. Tang, Y. Li, and Q. Luo. RiMOM: A dynamic multi-strategy ontology alignment framework. IEEE TKDE, 21(8):1218–1232, 2009.
  • [21] Y. Li, C. Wang, F. Han, J. Han, D. Roth, and X. Yan. Mining evidences for named entity disambiguation. In KDD’13, pages 1070–1078, 2013.
  • [22] J. Liu, F. Zhang, X. Song, Y.-I. Song, C.-Y. Lin, and H.-W. Hon. What’s in a name?: an unsupervised approach to link users across communities. In WSDM’13, pages 495–504, 2013.
  • [23] S. Liu, S. Wang, F. Zhu, J. Zhang, and R. Krishnan. Hydra: Large-scale social identity linkage via heterogeneous behavior modeling. In SIGMOD’14, pages 51–62, 2014.
  • [24] H. Ma, H. Yang, M. R. Lyu, and I. King. SoRec: social recommendation using probabilistic matrix factorization. In CIKM’08, pages 931–940, 2008.
  • [25] A. Maslow. A theory of human motivation. Psychological Review, 50(4):370–396, 1943.
  • [26] A. Narayanan and V. Shmatikov. De-anonymizing social networks. In IEEE Symposium on Security and Privacy’09, pages 173–187, 2009.
  • [27] D. Perito, C. Castelluccia, M. A. Kaafar, and P. Manils. How unique and traceable are usernames? In Privacy Enhancing Technologies, pages 1–17, 2011.
  • [28] W. Shen, J. Wang, P. Luo, and M. Wang. Linking named entities in tweets with knowledge base via user interest modeling. In KDD’13, pages 68–76, 2013.
  • [29] S. Tan, Z. Guan, D. Cai, X. Qin, J. Bu, and C. Chen. Mapping users across networks by manifold alignment on hypergraph. In AAAI’14, pages 159–165, 2014.
  • [30] J. Tang, A. Fong, B. Wang, and J. Zhang. A unified probabilistic framework for name disambiguation in digital library. IEEE TKDE, 24(6):975–987, 2012.
  • [31] J. Tang, H. Gao, H. Liu, and A. D. Sarma. eTrust: Understanding trust evolution in an online world. In KDD’12, pages 253–261, 2012.
  • [32] J. Tang, J. Zhang, L. Yao, J. Li, L. Zhang, and Z. Su. ArnetMiner: Extraction and mining of academic social networks. In KDD’08, pages 990–998, 2008.
  • [33] W. Tang, J. Tang, T. Lei, C. Tan, B. Gao, and T. Li. On optimization of expertise matching with various constraints. Neurocomputing, 76(1):71–83, 2012.
  • [34] B. Taskar, C. Guestrin, and D. Koller. Max-margin Markov networks. NIPS’04, 16, 2004.
  • [35] H. Whitney. Congruent graphs and the connectivity of graphs. American Journal of Mathematics, 54(1):150–168, 1932.
  • [36] S. Wu, J. M. Hofman, W. A. Mason, and D. J. Watts. Who says what to whom on Twitter. In WWW’11, pages 705–714, 2011.
  • [37] L. Yartseva and M. Grossglauser. On the performance of percolation graph matching. In COSN’13, pages 119–130, 2013.
  • [38] R. Zafarani and H. Liu. Connecting corresponding identities across communities. In ICWSM’09, pages 354–357, 2009.
  • [39] R. Zafarani and H. Liu. Connecting users across social media sites: A behavioral-modeling approach. In KDD’13, pages 41–49, 2013.
  • [40] J. Zhang, J. Tang, and J. Li. Expert finding in a social network. In DASFAA’07, pages 1066–1069, 2007.