Mars: Real-time spatio-temporal queries on microblogs

Data Engineering (2014)

Abstract
The Mars demonstration exploits the location information in microblogs to support a wide variety of important spatio-temporal queries. Supported queries include range, nearest-neighbor, and aggregate queries. Mars operates in a challenging environment where microblog streams arrive at high rates. Mars distinguishes itself with three novel contributions: (1) Efficient in-memory digestion/expiration techniques that can handle arrival rates of up to 64,000 microblogs/sec. This also includes highly accurate and efficient hopping-window-based aggregation of incoming microblog keywords. (2) Smart memory optimization and load shedding techniques that adjust in-memory contents based on the expected query load, trading a slight and bounded accuracy loss for significant storage savings. (3) Scalable real-time query processing that exploits the Zipf distribution of microblog data for efficient top-k aggregate query processing. In addition, Mars employs a scalable real-time nearest-neighbor and range query processing module that uses various pruning techniques to serve heavy query workloads in real time. Mars is demonstrated using a stream of real tweets obtained from the Twitter firehose with a production query workload obtained from Bing web search. We show that Mars serves incoming queries with an average latency of less than 4 msec and with 99% answer accuracy while saving up to 70% of storage overhead for different query loads.
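To make the idea of hopping-window keyword aggregation concrete, the following is a minimal Python sketch. It is not the Mars implementation, which the abstract does not detail. The class name `HoppingWindowCounter`, its parameters `window` and `hop`, and the pane-based bookkeeping are all assumptions introduced for illustration. The sketch also assumes that timestamps arrive in non-decreasing order and that the window length is a multiple of the hop length.

```python
from collections import Counter, deque


class HoppingWindowCounter:
    """Hypothetical sketch: keyword counts over the last `window` seconds,
    maintained in panes of `hop` seconds each."""

    def __init__(self, window: int, hop: int):
        if window % hop != 0:
            raise ValueError("window must be a multiple of hop")
        self.hop = hop
        self.panes_per_window = window // hop
        self.panes = deque()     # (pane_id, Counter) pairs, oldest first
        self.totals = Counter()  # running counts over all live panes

    def _expire(self, current_pane: int) -> None:
        # Drop whole panes that fell out of the window and subtract
        # their counts from the running totals.
        oldest_live = current_pane - self.panes_per_window + 1
        while self.panes and self.panes[0][0] < oldest_live:
            _, old = self.panes.popleft()
            for keyword, count in old.items():
                self.totals[keyword] -= count
                if self.totals[keyword] <= 0:
                    del self.totals[keyword]

    def add(self, timestamp: float, keywords) -> None:
        # Assumes timestamps are non-decreasing.
        pane_id = int(timestamp // self.hop)
        self._expire(pane_id)
        if not self.panes or self.panes[-1][0] != pane_id:
            self.panes.append((pane_id, Counter()))
        self.panes[-1][1].update(keywords)
        self.totals.update(keywords)

    def top_k(self, k: int):
        # The k most frequent keywords in the current window.
        return self.totals.most_common(k)


counter = HoppingWindowCounter(window=60, hop=20)
counter.add(0, ["mars", "nasa"])
counter.add(25, ["mars"])
counter.add(45, ["rover", "mars"])
print(counter.top_k(1))  # [('mars', 3)]
```

Expiring whole panes rather than individual microblogs keeps the expiration cost proportional to the number of distinct keywords in a pane, at the price of a window boundary that moves in steps of `hop` seconds.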
Keywords
Internet, query processing, social networking (online), Bing Web search, Mars, Twitter firehose, Zipf distributed microblogs data, aggregate queries, heavy query workloads, hopping-window based aggregation, in-memory digestion-expiration techniques, load shedding techniques, microblogs keywords, microblogs location information, nearest-neighbor queries, production query workload, pruning techniques, range queries, real-time query processing, real-time spatio-temporal queries, smart memory optimization, top-k aggregate query processing