Open data challenges at Facebook

Data Engineering (2015)

Abstract
At Facebook, our data systems process huge volumes of data, ranging from hundreds of terabytes in memory to hundreds of petabytes on disk. We categorize our systems as “small data” or “big data” based on the type of queries they run. Small data refers to OLTP-like queries that process and retrieve a small amount of data, for example, the thousands of objects necessary to render Facebook's personalized News Feed for each person. These objects are requested by their ids; indexes limit the amount of data accessed during a single query, regardless of the total volume of data. Big data refers to queries that process large amounts of data, usually for analysis: troubleshooting, identifying trends, and making decisions. Big data stores are the workhorses for data analysis at Facebook. They grow by millions of events (inserts) per second and process tens of petabytes and hundreds of thousands of queries per day. In this tutorial, we will describe our data systems and the current challenges we face. We will lead a discussion on these challenges, approaches to solve them, and potential pitfalls. We hope to stimulate interest in solving these problems in the research community.
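The small-data/big-data split the abstract draws can be illustrated with a toy example. The sketch below is not Facebook's actual stack; it uses a hypothetical `objects` table in Python's built-in sqlite3 to contrast an index-bounded point lookup by id (the OLTP-like "small data" pattern) with a full-scan aggregation (the analytical "big data" pattern).

```python
# Minimal sketch: hypothetical schema, not Facebook's real systems.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE objects (id INTEGER PRIMARY KEY, owner INTEGER, kind TEXT, payload TEXT)"
)
conn.executemany(
    "INSERT INTO objects (id, owner, kind, payload) VALUES (?, ?, ?, ?)",
    [(i, i % 100, "post" if i % 2 else "photo", f"data-{i}") for i in range(10_000)],
)

# "Small data": an OLTP-like point lookup. The primary-key index bounds the
# work to the requested ids, regardless of how large the table grows.
feed_ids = [17, 42, 9001]
placeholders = ",".join("?" * len(feed_ids))
rows = conn.execute(
    f"SELECT id, kind, payload FROM objects WHERE id IN ({placeholders})",
    feed_ids,
).fetchall()
print(rows)

# "Big data": an analytical query that must scan the whole table to compute
# an aggregate; its cost scales with the total volume of data.
stats = conn.execute("SELECT kind, COUNT(*) FROM objects GROUP BY kind").fetchall()
print(stats)
```

The point lookup touches only the indexed rows it asks for, while the aggregation reads every row; the abstract's categorization hinges on exactly this difference in how query cost scales with data volume.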
Keywords
big data,data analysis,query processing,social networking (online),facebook,oltp-like queries,personalized news feed,small data,databases,reliability,time series analysis