Extracting skill endorsements from personal communication data
ACM International Conference on Information and Knowledge Management (2016)
Abstract
People are increasingly communicating and collaborating via digital platforms, such as email and messaging applications. Data exchanged on these digital communication platforms can be a treasure trove of information on people who participate in the discussions: who they are collaborating with, what they are working on, what their expertise is, and so on. Yet, personal communication data is very rarely analyzed due to the sensitivity of the information it contains.
In this paper, we mine personal communication data with the goal of generating skill endorsements of the type “person A endorses person B on skill X.” To address privacy concerns, we consider that each person has access only to their own data (i.e., conversations with their peers). By using our method, they can generate endorsements for their peers, which they can inspect and opt to publish.
To identify meaningful skills, we use a knowledge base created from the StackExchange Q&A forum, where we assume that tags represent the skills to be endorsed. We study two different approaches, one based on building a skill graph, and one based on information retrieval techniques. We find that the latter approach outperforms the graph-based algorithms when tested on a dataset of user profiles from StackOverflow. We also conduct a user study on email data from nine volunteers, and we find that the information retrieval-based approach achieves a MAP@10 score of 0.617.
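For readers unfamiliar with the MAP@10 metric cited above, the following is a minimal sketch of how mean average precision at a cutoff k is commonly computed over ranked skill suggestions. It is not the authors' evaluation code: the function names, the example skills, and the normalization by min(k, number of relevant skills) are assumptions made for illustration, and other MAP@k variants normalize differently.

```python
from typing import Hashable, Sequence, Set


def average_precision_at_k(
    ranked: Sequence[Hashable], relevant: Set[Hashable], k: int = 10
) -> float:
    """Average precision over the top-k items of one ranked list.

    Assumes `ranked` contains no duplicate items.
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked[:k], start=1):
        if item in relevant:
            hits += 1
            # Precision measured at the position of each relevant item.
            precision_sum += hits / rank
    # Assumed normalization: best achievable number of hits within the cutoff.
    return precision_sum / min(k, len(relevant))


def mean_average_precision_at_k(
    rankings: Sequence[Sequence[Hashable]],
    relevant_sets: Sequence[Set[Hashable]],
    k: int = 10,
) -> float:
    """Mean of the per-list average precision scores."""
    scores = [
        average_precision_at_k(ranked, relevant, k)
        for ranked, relevant in zip(rankings, relevant_sets)
    ]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical example: two endorsers, each with a ranked list of suggested
# skills and the set of skills they judged to be correct.
rankings = [["python", "java", "sql"], ["latex", "git"]]
judged_relevant = [{"python", "sql"}, {"git"}]
map_at_10 = mean_average_precision_at_k(rankings, judged_relevant, k=10)
```

A score closer to 1 means that correct skills tend to appear near the top of each suggested list.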
Keywords
personal data, email mining, expertise profiling