A tweet-centric approach for topic-specific author ranking in micro-blog

    ADMA, pp. 138-151, 2011.

    Keywords:
    topic-specific high-quality content; tweet-centric approach; ranking accuracy; tweet-centric topic-specific author ranking; author ranking approach
    Wei bo:
    We propose a tweet-centric approach for topic-specific author ranking in micro-blog, which is based on a directed weighted user-tweet graph

    Abstract:

    Most users play two roles in micro-blog: author and reader of tweets. With diverse users and massive amounts of user-generated content in micro-blog, identifying and ranking influential authors who post topic-specific, high-quality content is a challenge. In this paper, we present a way to measure the quality of tweets, which accordingly det...

    Introduction
    • 1.1 Motivation

      Micro-blog, a novel social network service, has spread rapidly over the world and been accepted by hundreds of millions of people.
    • People can follow any other users they are interested in, and share or forward those users' content without their permission.
    • Such content sharing or forwarding activity is called retweeting.
    • Since micro-blog is much more open than some other social network services such as Facebook, information diffuses among micro-blog users remarkably quickly, and interaction between users becomes unprecedentedly easy and unimpeded.
    Highlights
    • The user-tweet graph used in TURank contains three relationships: following relationships between users, parent-child relationships between tweets, and posting-posted relationships between users and tweets. Unlike earlier algorithms, we focus on topic-specific author ranking, and our approach is based on a user-tweet graph built from retweeting relationships from readers to tweets and belonging relationships from tweets to authors, which emphasizes real interaction between users and tweets.
    • We propose a tweet-centric approach for topic-specific author ranking in micro-blog, which is based on a directed weighted user-tweet graph.
    • The influence of an author is determined by the quality of the tweets they have posted, which is in turn determined by the retweeting behavior of their readers.
    • The topic-specific author ranking algorithm is implemented on the MapReduce framework, which makes large-scale data processing and timely author ranking feasible.
    • Applying topic-specific author ranking to personalized tweet and author recommendation and tracking is an interesting topic to explore in future work.
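
The highlights describe score flowing from readers to the tweets they retweet, and from tweets to their authors, over a directed weighted graph. The paper's actual scoring formula and edge-weight definitions are not reproduced on this page, so the following is only a minimal sketch that swaps in a generic PageRank-style propagation. The function name `rank_authors`, the `damping` and `iterations` parameters, and the per-edge retweet weights are all assumptions made for illustration.

```python
from collections import defaultdict


def rank_authors(retweets, tweet_author, damping=0.85, iterations=30):
    """Rank authors on a directed weighted user-tweet graph (illustrative only).

    retweets:     iterable of (reader, tweet, weight) triples, reader -> tweet,
                  with positive weights (hypothetical placeholder values)
    tweet_author: dict mapping tweet -> author, giving tweet -> author edges
    """
    out_edges = defaultdict(list)  # node -> [(neighbor, weight), ...]
    for reader, tweet, weight in retweets:
        out_edges[("user", reader)].append((("tweet", tweet), weight))
    for tweet, author in tweet_author.items():
        out_edges[("tweet", tweet)].append((("user", author), 1.0))

    nodes = set(out_edges)
    for edges in out_edges.values():
        nodes.update(dst for dst, _ in edges)

    n = len(nodes)
    score = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node keeps a small base score; the rest flows along edges.
        nxt = {node: (1.0 - damping) / n for node in nodes}
        for src, edges in out_edges.items():
            total = sum(weight for _, weight in edges)
            for dst, weight in edges:
                nxt[dst] += damping * score[src] * weight / total
        score = nxt

    authors = set(tweet_author.values())
    return sorted(((a, score[("user", a)]) for a in authors),
                  key=lambda pair: pair[1], reverse=True)
```

In this sketch an author whose tweets attract more retweets ends up with a higher score, which matches the direction of influence the highlights describe. Nodes without outgoing edges simply drop their score each round, a simplification that a real implementation would likely handle differently.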
    Methods
    • The authors' evaluation experiments were carried out on a Hadoop cluster with 60 nodes, each with one 2.13 GHz Intel Xeon quad-core processor, 32 GB of memory, and 12 × 146 GB disks.
    • The data set is from Tencent Micro-blog, which has more than 200 million users.
    • The authors collected information on all tweets posted from March 19th, 2011 to June 19th, 2011, a complete dataset of four months.
    • The tweets are stored by date.
    • Each daily data file is approximately 4 GB and is stored on HDFS.
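
Since the experiments run on Hadoop over daily tweet files, the processing is presumably organized as map and reduce phases. The page does not describe the actual jobs, so the sketch below only illustrates the general shape with one plausible preprocessing step: counting on-topic retweets per original tweet. The record fields (`parent_id`, `text`), the keyword-based topic match, and the function names are assumptions, and the in-process `run_job` merely stands in for the shuffle a real cluster would perform.

```python
from collections import defaultdict
from itertools import chain


def map_retweets(record, topic):
    """Map phase: emit (original_tweet_id, 1) for each on-topic retweet."""
    if record["parent_id"] is not None and topic.lower() in record["text"].lower():
        yield record["parent_id"], 1


def reduce_counts(tweet_id, values):
    """Reduce phase: sum the retweet counts for one original tweet."""
    return tweet_id, sum(values)


def run_job(records, topic):
    """Tiny in-process stand-in for the shuffle between map and reduce."""
    grouped = defaultdict(list)
    pairs = chain.from_iterable(map_retweets(r, topic) for r in records)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(reduce_counts(k, v) for k, v in grouped.items())
```

Because each record is mapped independently and each key is reduced independently, the same logic could be spread across the daily files without coordination between them.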
    Results
    • The most unstable topic is “iPhone”, whose pcu is less than 10% throughout the seven days. “DNF” is different from the other three topics.
    Conclusion
    • The authors propose a tweet-centric approach for topic-specific author ranking in micro-blog, which is based on a directed weighted user-tweet graph.
    • The authors plan to combine this approach with event tracking and management in micro-blog, which could make it convenient to review historical events.
    • Applying topic-specific author ranking to personalized tweet and author recommendation and tracking is an interesting topic to explore in future work.
    Tables
    • Table 1: Top 10 influential authors on three hot topics
    • Table 2: Daily author ranking on the topic of “Libya Crisis”
    • Table 3: Effect of Weight(e_{u→t}) and Weight(e_{t→u})
    Related work
    • Several studies on user ranking in micro-blog have appeared in the last few years. TunkRank [7] and TwitterRank [9], both variants of PageRank [4], measure the influence of users in Twitter based on a user graph constructed from following relationships. The difference between them is that TwitterRank introduces topic similarity between a user and his/her followers. IPInfluence [6] is a related algorithm similar to HITS [3] that considers the passivity of users, a measure of how difficult it is for other users to influence them. Another interesting approach leverages probabilistic clustering and Gaussian-based ranking based on an analysis of user features [5].

      TURank [10], which is most similar to our approach, measures the influence of users using ObjectRank [1] on a user-tweet graph. The user-tweet graph used in TURank contains three relationships: following relationships between users, parent-child relationships between tweets, and posting-posted relationships between users and tweets. Unlike these algorithms, we focus on topic-specific author ranking, and our approach is based on a user-tweet graph built from retweeting relationships from readers to tweets and belonging relationships from tweets to authors, which emphasizes real interaction between users and tweets.
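
To make the contrast between the two graph designs concrete, the sketch below builds both edge sets from the same toy records. It is illustrative only: the record fields (`id`, `author`, `parent_id`), the function names, and the edge labels are assumptions, and the edge directions chosen for the TURank-style graph are arbitrary because this page does not specify them. It also assumes a retweet contributes only a reader-to-tweet edge rather than becoming a tweet node of its own.

```python
def tweet_centric_edges(tweets):
    """Edges of the tweet-centric graph: reader -> tweet and tweet -> author."""
    author_of = {t["id"]: t["author"] for t in tweets}
    edges = []
    for t in tweets:
        parent = t["parent_id"]
        if parent is None:
            # An original tweet belongs to its author.
            edges.append((("tweet", t["id"]), ("user", t["author"]), "belongs_to"))
        elif parent in author_of:
            # The user who retweeted acts as a reader of the original tweet.
            edges.append((("user", t["author"]), ("tweet", parent), "retweets"))
    return edges


def turank_style_edges(follows, tweets):
    """The three relationship types attributed to TURank (directions assumed)."""
    edges = [(("user", a), ("user", b), "follows") for a, b in follows]
    for t in tweets:
        edges.append((("user", t["author"]), ("tweet", t["id"]), "posts"))
        if t["parent_id"] is not None:
            edges.append((("tweet", t["id"]), ("tweet", t["parent_id"]), "child_of"))
    return edges
```

The tweet-centric edge set never consults the follow list, so a user with many followers but no retweeted tweets receives no incoming edges in that graph.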
    Funding
    • The work is funded by National Natural Science Foundation of China (60773156, 61073004), Chinese Major State Basic Research Development 973 Program (2011CB302203-2), and Tencent Research Fund
    Reference
    • [1] Balmin, A., Hristidis, V., Papakonstantinou, Y.: ObjectRank: Authority-based keyword search in databases. In: Proceedings of the Thirtieth International Conference on Very Large Data Bases, vol. 30, pp. 564–575. VLDB Endowment (2004)
    • [2] Chen, C., Li, F., Ooi, B.C., Wu, S.: TI: An efficient indexing mechanism for real-time search on tweets. In: Proceedings of the 2011 ACM SIGMOD International Conference on Management of Data. ACM (2011)
    • [3] Kleinberg, J.M.: Authoritative sources in a hyperlinked environment. Journal of the ACM (JACM) 46(5), 604–632 (1999)
    • [4] Page, L., Brin, S., Motwani, R., Winograd, T.: The PageRank citation ranking: Bringing order to the web. Stanford Digital Library (1999)
    • [5] Pal, A., Counts, S.: Identifying topical authorities in microblogs. In: Proceedings of the Fourth ACM International Conference on Web Search and Data Mining, pp. 45–54. ACM (2011)
    • [6] Romero, D.M., Galuba, W., Asur, S., Huberman, B.A.: Influence and passivity in social media. In: Proceedings of the 20th International Conference on World Wide Web, pp. 113–114. ACM (2011)
    • [7] Tunkelang, D.: A Twitter analog to PageRank (2009), http://thenoisychannel.com/2009/01/13/a-twitter-analog-to-pagerank/
    • [8] Welch, M.J., Schonfeld, U., He, D., Cho, J.: Topical semantics of Twitter links. In: Proceedings of the Fourth ACM International Conference on Web Search and Data Mining, pp. 327–336. ACM (2011)
    • [9] Weng, J., Lim, E.P., Jiang, J., He, Q.: TwitterRank: Finding topic-sensitive influential twitterers. In: Proceedings of the Third ACM International Conference on Web Search and Data Mining, pp. 261–270. ACM (2010)
    • [10] Yamaguchi, Y., Takahashi, T., Amagasa, T., Kitagawa, H.: TURank: Twitter User Ranking Based on User-Tweet Graph Analysis. In: Chen, L., Triantafillou, P., Suel, T. (eds.) WISE 2010. LNCS, vol. 6488, pp. 240–253. Springer, Heidelberg (2010)