Supporting Collaboration in the Era of Internet-Scale Data

msra(2009)

Abstract
Reproducibility is an often-cited and valid concern about the research performed by many corporate research programs, such as the one I work with at Facebook. We consistently run experiments on millions of active users using proprietary systems, gather results on data infrastructure at massive scale, and produce reports which distill this process into a few lines of context. It is no wonder that many papers from internet research labs are returned with comments on how the results are interesting but entirely irreproducible. This effect is just one symptom of the growing gap in the instruments available to researchers studying social computing, human-computer interaction, recommender systems, and auction theory, among others. On one side of this divide are academics, depending on shared data sets and infrastructure to enable the collective advancement of science, cooperating with Institutional Review Boards, and beholden to funding agencies. On the other side are industrial researchers, utilizing proprietary data and infrastructure to drive science forward, maintaining privacy and Terms of Service (TOS), and beholden to the goals of the corporation. It is hard to say how wide this gap is, but it is clear that the computational power of the likes of Google and Facebook continues to grow. When asked to produce a position paper on challenges in studying technology-mediated participation, I naturally thought to address the questions I most often get from academic researchers: Can I have some data? Can I crawl the users on my university network? Perhaps run a query on your databases? At the same time, papers are regularly published which violate Facebook's TOS, expose users' privacy, and proceed without any oversight from ethics review boards.
In this paper I hope to describe what Facebook has to offer, expose some of the challenges we currently face in engaging with academia, and propose some possible solutions which allow for direct collaboration while upholding all legal and ethical guidelines.