Using Variability as a Guiding Principle to Reduce Latency in Web Applications via OS Profiling

WWW '19: The Web Conference on The World Wide Web Conference WWW 2019(2019)

引用 11|浏览43
暂无评分
摘要
Request latency is a critical metric in determining the usability of web services. The latency of a request includes service time - the time when the request is being actively serviced - and waiting time - the time when the request is waiting to be served. Most existing works aim to reduce request latency by focusing on reducing the mean service time (that is, shortening the critical path). In this paper, we explore an alternative approach to reducing latency - using variability as a guiding principle when designing web services. By tracking the service time variability of the request as it traverses across software layers within the user and kernel space of the web server, we identify the most critical stages of request processing. We then determine control knobs in the OS and application, such as thread scheduling and request batching, that regulate the variability in these stages, and demonstrate that tuning these specific knobs can significantly improve end-to-end request latency. Our experimental results with Memcached and Apache web server under different request rates, including real-world traces, show that this alternative approach can reduce mean and tail latency by 30-50%.
更多
查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要