Soft timers: efficient microsecond software timer support for network processing

ACM Transactions on Computer Systems (TOCS), no. 3 (2000): 197-228

Abstract

This paper proposes and evaluates soft timers, a new operating system facility that allows the efficient scheduling of software events at a granularity down to tens of microseconds. Soft timers can be used to avoid interrupts and reduce context switches associated with network processing without sacrificing low communication delays.

Introduction
  • The authors propose and evaluate soft timers, an operating system facility that allows efficient scheduling of software events at microsecond granularity.

    The key idea behind soft timers is to take advantage of certain states in the execution of a system where an event handler can be invoked at low cost.
  • Such states include the entry points of the various OS kernel handlers, which are executed in response to system calls, exceptions (TLB miss, page fault, arithmetic) and hardware interrupts.
  • In these “trigger states”, the cost of saving and restoring CPU state and the shift in memory access locality associated with the switch to kernel mode have already been incurred, so an event handler can be invoked at low cost.
  • Disk interrupts, conventional timer interrupts used for time-slicing, and the associated context switches typically occur at intervals on the order of tens of milliseconds.
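The trigger-state idea above can be illustrated with a small simulation. This is a sketch under stated assumptions, not the authors' kernel code: the names `SoftTimerSim`, `schedule`, and `on_trigger_state` are hypothetical, time is an integer count of microseconds supplied by the caller, and a real implementation would place the check inside the kernel's system call, exception, and interrupt entry paths.

```python
import heapq
from typing import Callable, List, Tuple


class SoftTimerSim:
    """Toy model: pending events are checked only when a trigger state is reached."""

    def __init__(self) -> None:
        # Min-heap of (due_time_us, sequence_number, handler).
        self._pending: List[Tuple[int, int, Callable[[int], None]]] = []
        self._seq = 0  # tie-breaker so handlers are never compared

    def schedule(self, due_us: int, handler: Callable[[int], None]) -> None:
        heapq.heappush(self._pending, (due_us, self._seq, handler))
        self._seq += 1

    def on_trigger_state(self, now_us: int) -> int:
        """Called at a simulated kernel entry point; runs every event that is due."""
        fired = 0
        while self._pending and self._pending[0][0] <= now_us:
            _, _, handler = heapq.heappop(self._pending)
            handler(now_us)
            fired += 1
        return fired


fired_at: List[int] = []
timers = SoftTimerSim()
timers.schedule(50, fired_at.append)   # due at t = 50 us
timers.schedule(120, fired_at.append)  # due at t = 120 us

# Hypothetical trigger states (system calls, exceptions, interrupts) at these times.
for t in (10, 35, 62, 90, 131):
    timers.on_trigger_state(t)

print(fired_at)  # each event runs at the first trigger state at or after its due time
```

In this model an event is never early, and its lateness depends only on how soon the next trigger state arrives.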
Highlights
  • We propose and evaluate soft timers, an operating system facility that allows efficient scheduling of software events at microsecond granularity.

    The key idea behind soft timers is to take advantage of certain states in the execution of a system where an event handler can be invoked at low cost
  • Results presented in Section 5 show that a 300 MHz Pentium II system running a variety of workloads reaches trigger states frequently enough to allow the scheduling of soft timer events at a granularity of tens of μsecs
  • In some architectures (e.g., Pentium), TLB misses are handled in hardware; in these machines, TLB faults cannot be used as trigger states
  • Our experiments show that a Web server that employs rate-based clocking using soft timers can achieve up to 89% lower response time than a server with a conventional TCP over networks with high bandwidth-delay product
  • The saving and restoring of CPU state normally required upon a hardware timer interrupt is not necessary, and the cache/TLB pollution caused by the event handler is likely to have low impact on the system performance
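Rate-based clocking, mentioned in the highlights above, sends packets at a target interval instead of waiting for acknowledgments to clock them out. The following is a minimal sketch of pacing on top of irregular trigger states. It assumes one packet per trigger state and advances the due time from the previous due time so that lateness does not accumulate; the function name `paced_send_times` and the trigger times are hypothetical, and this is not claimed to be the paper's exact algorithm.

```python
from typing import List


def paced_send_times(trigger_times_us: List[int], interval_us: int, packets: int) -> List[int]:
    """Return the trigger-state times at which packets are sent.

    A packet is sent at the first trigger state at or after its due time.
    The next due time advances from the previous *due* time, not from the
    actual send time, so a late send does not delay all later packets.
    """
    sends: List[int] = []
    next_due = 0
    for now in trigger_times_us:
        if len(sends) == packets:
            break
        if now >= next_due:
            sends.append(now)
            next_due += interval_us
    return sends


# Hypothetical trigger-state times in microseconds (irregular on purpose).
triggers = [0, 7, 31, 44, 52, 85, 90, 121, 128, 163]
sends = paced_send_times(triggers, interval_us=40, packets=4)
print(sends)
```

Individual gaps between sends vary with the trigger-state arrivals, but the average gap stays close to the 40 μsec target because each due time is derived from the ideal schedule.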
Methods
  • Design of the soft timers facility

    The authors present the design of soft-timers, a mechanism for scheduling fine-grained events in an operating system with low overhead.

    Conventional timer facilities schedule events by invoking a designated handler periodically in the context of an external hardware interrupt.
  • An Intel 8253 programmable interval timer chip is usually supplied with a Pentium-based CPU.
  • The 8253 can be programmed to interrupt the processor at a given frequency.
  • The data and instructions touched by the interrupt handler are unrelated to the interrupted entity, which can adversely affect cache and TLB locality
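To make the cost of a conventional hardware timer concrete, the sketch below computes the reload value for an 8253-style timer and the share of CPU time consumed by interrupts at a given rate. It assumes the nominal PC timer input clock of 1,193,182 Hz and a 16-bit divisor, and it ignores mode-specific restrictions of the chip. The function names and the per-interrupt cost are hypothetical and are not figures from the paper.

```python
PIT_INPUT_HZ = 1_193_182  # nominal input clock of the PC's 8253/8254 timer


def pit_divisor(target_hz: float) -> int:
    """16-bit reload value giving the interrupt rate closest to target_hz."""
    divisor = round(PIT_INPUT_HZ / target_hz)
    if not 1 <= divisor <= 65_535:
        raise ValueError("target frequency not reachable with a 16-bit divisor")
    return divisor


def interrupt_overhead_fraction(rate_hz: float, cost_us: float) -> float:
    """Fraction of CPU time spent handling timer interrupts at the given rate."""
    return rate_hz * cost_us / 1_000_000


print(pit_divisor(100))     # reload value for a 100 Hz tick
print(pit_divisor(10_000))  # reload value for a 10 kHz tick
# The 4.0 us cost per interrupt is a hypothetical figure, not a measurement.
print(interrupt_overhead_fraction(100_000, cost_us=4.0))
```

The overhead grows linearly with the interrupt rate, which is why driving microsecond-granularity events directly from hardware interrupts becomes expensive.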
Results
  • Results presented in Section 5 show that a 300 MHz Pentium II system running a variety of workloads reaches trigger states frequently enough to allow the scheduling of soft timer events at a granularity of tens of μsecs.

    In some architectures (e.g., Pentium), TLB misses are handled in hardware; in these machines, TLB faults cannot be used as trigger states.

    A modified form of timing wheels [24] is used to maintain scheduled soft timer events.

    [Figure: timelines for the minimum and maximum event time, showing when an event is scheduled and when it fires relative to the measuring clock ticks and the interrupt clock tick.]

    The soft-timer facility provides the following operations.

    measure_resolution().
  • measure_time() returns a 64-bit value representing the current real time in ticks of a clock whose resolution is given by measure_resolution().
  • Since this operation is intended to measure time intervals, the time need not be synchronized with any standard time base.
  • The authors quantify the overhead of the proposed soft timer facility and compare it to the alternative approach of scheduling events using hardware timer interrupts.
  • The authors present measurements that show the distribution of delays in soft timer event handling, given a variety of system workloads.
  • The number of simultaneous requests to the Web server was set such that the server machine was saturated.
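The timing-wheel structure and the time-measurement operations named above can be sketched as follows. This is an illustration under stated assumptions: the wheel is a basic hashed timing wheel, not the modified form the authors use; `TimingWheel`, `schedule`, and `advance_to` are hypothetical names; `measure_time()` is backed here by Python's `time.perf_counter_ns`; and returning the resolution as ticks per second is an assumption of this sketch.

```python
import time
from typing import Callable, List, Tuple


def measure_resolution() -> int:
    """Ticks per second of the measurement clock (assumed: nanosecond ticks)."""
    return 1_000_000_000


def measure_time() -> int:
    """Current time in ticks; monotonic and not tied to any standard time base."""
    return time.perf_counter_ns()


class TimingWheel:
    """Basic hashed timing wheel: slot index = expiry tick mod number of slots."""

    def __init__(self, slots: int) -> None:
        self._slots: List[List[Tuple[int, Callable[[], None]]]] = [[] for _ in range(slots)]
        self._now = 0  # last tick that has been processed

    def schedule(self, delay_ticks: int, handler: Callable[[], None]) -> None:
        expiry = self._now + max(1, delay_ticks)
        self._slots[expiry % len(self._slots)].append((expiry, handler))

    def advance_to(self, tick: int) -> int:
        """Process every tick up to `tick`; return how many handlers ran."""
        fired = 0
        while self._now < tick:
            self._now += 1
            slot = self._slots[self._now % len(self._slots)]
            due = [entry for entry in slot if entry[0] <= self._now]
            # Entries from later rotations of the wheel stay in the slot.
            slot[:] = [entry for entry in slot if entry[0] > self._now]
            for _, handler in due:
                handler()
                fired += 1
        return fired


log: List[str] = []
wheel = TimingWheel(slots=8)
wheel.schedule(3, lambda: log.append("a"))   # expires at tick 3
wheel.schedule(11, lambda: log.append("b"))  # same slot as "a", one rotation later
wheel.schedule(5, lambda: log.append("c"))   # expires at tick 5

first = wheel.advance_to(6)    # runs "a" and "c"
second = wheel.advance_to(12)  # runs "b"
print(first, second, log)

start = measure_time()
elapsed_us = (measure_time() - start) * 1_000_000 // measure_resolution()
```

Inserting an event costs one append, and each tick inspects only one slot, which is the property that makes a wheel attractive when events are scheduled and fired very frequently.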
Conclusion
  • Soft timers allow the efficient scheduling of events at a granularity below that which can be provided by a conventional interval timer with acceptable overhead.
  • Soft timers should be used for events that require a granularity up to the trigger state interval, provided these events can tolerate probabilistic delays up to the granularity of the conventional interval timer. This paper proposes a novel operating system timer facility that allows the system to efficiently schedule events at a granularity down to tens of microseconds.
  • The saving and restoring of CPU state normally required upon a hardware timer interrupt is not necessary, and the cache/TLB pollution caused by the event handler is likely to have low impact on the system performance
Tables
  • Table 1: Trigger state interval distribution
  • Table 2: Trigger state sources
  • Table 3: Overhead of rate-based clocking. The results indicate that the effect of cache pollution with hardware timers is at least 4% and 8% worse than with soft timers for the Apache and the Flash server, respectively. The fact that Flash appears to be more affected by the cache pollution can be explained as follows. Apache is a multi-process server whose frequent context switching leads to relatively poor memory access locality. Flash, on the other hand, is a small, single-process event-driven server with presumably relatively good cache locality. It is intuitive, therefore, that the Flash server’s performance is more significantly affected by the cache pollution resulting from the timer interrupts.
  • Table 4: Rate-based clocking (target transmission interval = 40 μsecs)
  • Table 5: Rate-based clocking (target transmission interval = 60 μsecs)
  • Table 6: Rate-based clocking network performance (Bandwidth = 50 Mbps, RTT = 100 msecs)
  • Table 7: Rate-based clocking network performance (Bandwidth = 100 Mbps, RTT = 100 msecs)
  • Table 8: Network polling: throughput on 6KB HTTP requests
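The network settings in Tables 6 and 7 can be put in perspective with a short calculation of the bandwidth-delay product and the spacing between back-to-back packets. This is a worked example under assumptions: the 1500-byte packet size is a hypothetical value (a common Ethernet MTU), not taken from the tables, and the function names are illustrative.

```python
def bandwidth_delay_product_bytes(bandwidth_bps: int, rtt_ms: int) -> float:
    """Bytes that must be in flight to keep a path of this bandwidth and RTT full."""
    return bandwidth_bps * rtt_ms / 8 / 1000


def packet_interval_us(bandwidth_bps: int, packet_bytes: int) -> float:
    """Spacing between packet starts when sending at exactly the given rate."""
    return packet_bytes * 8 * 1_000_000 / bandwidth_bps


for mbps in (50, 100):
    bps = mbps * 1_000_000
    bdp = bandwidth_delay_product_bytes(bps, rtt_ms=100)
    gap = packet_interval_us(bps, packet_bytes=1500)  # 1500 bytes is an assumption
    print(f"{mbps} Mbps: BDP = {bdp:.0f} bytes, packet spacing = {gap:.0f} us")
```

Under these assumptions, pacing at link rate needs events spaced a few hundred microseconds apart or less, far below the tens of milliseconds at which conventional timer interrupts typically occur.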
Related work
  • The implementation of soft timers is based on the idea of polling, which goes back to the earliest days of computing. In polling, a main-line program periodically checks for asynchronous events, and invokes handler code for the event if needed.

    The novel idea in soft timers is to implement an efficient timer facility by making the operating system “poll” for pending soft timer events in certain strategic states. These “trigger states” are known to be reached very frequently during execution. Furthermore, these states are associated with a shift in memory access locality, thus allowing the interposition of handler code with little impact on system performance. The resulting facility can then be used to schedule events at a granularity that could not be efficiently achieved with a conventional hardware timer facility.
Funding
  • This work was supported in part by NSF Grant CCR-9803673, by Texas TATP Grant 003604, by an IBM Partnership Award, and by equipment donations from Compaq Western Research Lab and from HP Labs
Reference
  • M. Allman, C. Hayes, H. Kruse, and S. Ostermann. TCP Performance over Satellite Links. In Proceedings of 5th International Conference on Telecommunication Systems, pages 456–469, Nashville, TN, Mar. 1997.
  • M. Allman and V. Paxson. On estimating end-to-end network path properties. In Proceedings of the SIGCOMM ’99 Conference, pages 263–274, Cambridge, MA, Sept. 1999.
  • Apache. http://www.apache.org/.
  • M. F. Arlitt and C. L. Williamson. Web Server Workload Characterization: The Search for Invariants. In Proceedings of the ACM SIGMETRICS ’96 Conference, pages 126–137, Philadelphia, PA, Apr. 1996.
  • H. Balakrishnan, V. N. Padmanabhan, and R. H. Katz. The Effects of Asymmetry on TCP Performance. In Proceedings of 3rd ACM Conference on Mobile Computing and Networking, pages 77–89, Budapest, Hungary, Sept. 1997.
  • H. Balakrishnan, V. N. Padmanabhan, S. Seshan, M. Stemm, and R. H. Katz. TCP behavior of a busy internet server: Analysis and improvements. In Proceedings of IEEE INFOCOM ’98, pages 252–262, San Francisco, CA, Apr. 1998.
  • T. Berners-Lee, R. Fielding, and H. Frystyk. RFC 1945: Hypertext transfer protocol – HTTP/1.0, May 1996. ftp://ftp.merit.edu/documents/rfc/rfc1945.txt.
  • L. Brakmo and L. Peterson. Performance Problems in 4.4BSD TCP. ACM Computer Communication Review, 25(5):69–86, Oct. 1995.
  • L. Brakmo and L. Peterson. TCP Vegas: End to End Congestion Avoidance on a Global Internet. IEEE Journal on Selected Areas in Communications, 13(8):1465–1480, Oct. 1995.
  • W.-c. Feng, D. D. Kandlur, D. Saha, and K. G. Shin. Understanding and improving TCP performance over networks with minimum rate guarantees. IEEE/ACM Transactions on Networking, 7(2):173–187, Apr. 1999.
  • K. Fall and S. Floyd. Simulation-based Comparisons of Tahoe, Reno, and SACK TCP. Computer Communication Review, 26(3):5–21, July 1996.
  • R. Fielding, J. Gettys, J. Mogul, H. Nielsen, and T. Berners-Lee. RFC 2068: Hypertext transfer protocol – HTTP/1.1, Jan. 1997.
  • J. C. Hoe. Improving the Start-up Behaviour of a Congestion Control Scheme for TCP. In Proceedings of the ACM SIGCOMM ’96 Symposium, pages 270–280, Stanford, CA, Sept. 1996.
  • S. Keshav. A Control-Theoretic Approach to Flow Control. In Proceedings of the ACM SIGCOMM ’91 Symposium, pages 3–15, Zurich, Switzerland, Sept. 1991.
  • J. C. Mogul. Observing TCP Dynamics in Real Networks. In Proceedings of the ACM SIGCOMM ’92 Symposium, pages 281–292, Baltimore, MD, Aug. 1992.
  • J. C. Mogul. The Case for Persistent-Connection HTTP. In Proceedings of the ACM SIGCOMM ’95 Symposium, pages 299–313, Cambridge, MA, Sept. 1995.
  • J. C. Mogul and K. K. Ramakrishnan. Eliminating receive livelock in an interrupt-driven kernel. ACM Transactions on Computer Systems, 15(3):217–252, Aug. 1997.
  • V. N. Padmanabhan and R. H. Katz. TCP Fast Start: A Technique For Speeding Up Web Transfers. In Proceedings of the IEEE GLOBECOM ’98 Conference, pages 41–46, Sydney, Australia, Nov. 1998.
  • V. N. Padmanabhan and J. C. Mogul. Improving HTTP Latency. In Proceedings of the Second International WWW Conference, pages 995–1005, Chicago, IL, Oct. 1994.
  • V. S. Pai, P. Druschel, and W. Zwaenepoel. Flash: An efficient and portable Web server. In Proceedings of the Usenix 1999 Annual Technical Conference, pages 199–212, Monterey, CA, June 1999.
  • V. Paxson. End-to-End Internet Packet Dynamics. In Proceedings of the ACM SIGCOMM ’97 Symposium, pages 139–152, Cannes, France, Sept. 1997.
  • RealPlayer. http://www.realplayer.com/.
  • J. M. Smith and C. B. S. Traw. Giving applications access to Gb/s networking. IEEE Network, 7(4):44–52, July 1993.
  • G. Varghese and A. Lauck. Hashed and hierarchical timing wheels: Data structures for the efficient implementation of a timer facility. In Proceedings of the Eleventh ACM Symposium on Operating Systems Principles, pages 171–180, Austin, TX, Nov. 1987.
  • V. Visweswaraiah and J. Heidemann. Improving restart of idle TCP connections. Technical Report 97-661, University of Southern California, November 1997.
  • J. Wroclawski. RFC 2211: Specification of controlled-load network element service, Sept. 1997. ftp://ftp.merit.edu/documents/rfc/rfc2211.txt.
  • L. Zhang, S. Shenker, and D. D. Clark. Observations on the Dynamics of a Congestion Control Algorithm: The Effects of Two-Way Traffic. In Proceedings of the ACM SIGCOMM ’91 Symposium, pages 133–148, Zurich, Switzerland, 1991.