Speculative execution in a distributed file system
Special Interest Group on Operating Systems, no. 5 (2006): 191-205
- Distributed file systems often perform substantially worse than local file systems because they perform synchronous I/O operations for cache coherence and data safety.
- File systems such as AFS and NFS present users with the abstraction of a single, coherent namespace shared across multiple clients.
- As the results show, even these weaker semantics are time-consuming
- Whereas local file systems typically guarantee that a process that reads data from a file will see all modifications previously completed by other processes, distributed file systems such as AFS and NFS provide no such guarantee
- We have created a version of the Blue File System that uses Speculator to provide single-copy semantics, in which the file consistency seen by two processes sharing a file and running on two different file clients is identical to the consistency that they would see if they were running on the same client
- We have shown that Speculator substantially improves the performance of existing distributed file systems
- We have shown how speculation enables the development of new file systems that are safe, consistent, and fast, even over high-latency links
- While our investigation to date has focused on distributed file systems, we believe that generic OS support for speculative execution and causal dependency tracking will prove useful in many other domains
- The authors use two Dell Precision 370 desktops as the client and file server.
- The authors run RedHat Enterprise Linux release 3 with kernel version 2.4.21.
- The authors run the nonspeculative version of NFS with both UDP and TCP.
- While BlueFS can cache data on local disk and portable storage, it uses only the Linux file cache in these experiments—this provides a fair comparison with NFS, which uses only the file cache.
- The client /tmp directory is a RAMFS memory-only file system for all tests
- Results from PostMark and Andrew-style benchmarks show that Speculator improves the performance of NFS by more than a factor of 2 over local-area networks; over networks with 30 ms of round-trip latency, speculation makes NFS more than 14 times faster.
- The authors' version of BlueFS provides synchronous I/O in which all file modifications are safe on the server’s disk before an operation is observed to complete.
- Despite providing these strong guarantees, BlueFS is 66% faster than non-speculative NFS over a LAN and more than 11 times faster with a 30 ms delay
- Other file systems: while the authors have modified only NFS and BlueFS to use speculation, it is useful to consider how Speculator could benefit other distributed file systems.
- Since speculation improves performance by eliminating synchronous communication, the performance improvement seen by a particular file system will depend on how often it performs synchronous operations
- Both NFS and BlueFS implement cache coherence by polling the file server to verify that cached files are up-to-date.
- Echo uses leases to provide single-copy consistency; a client granted an exclusive lease on an object can read or modify that object without contacting the server.
- Speculator supports multi-process speculative execution within a commodity OS kernel.
- The authors' future plans include investigating what other applications can benefit from Speculator
- To the best of our knowledge, Speculator is the first support for multi-process speculative execution in a commodity operating system and the first use of speculative execution to improve cache coherence and write throughput in distributed file systems.
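The polling-based coherence check mentioned above can be illustrated with a toy model. This is a minimal sketch, not Speculator's kernel implementation: the class `SpeculativeProcess`, the function `read_cached`, and the dictionary-based `cache` and `server` are all hypothetical names introduced here. It assumes that a process takes a checkpoint, continues with the cached data while the server check is outstanding, holds back its output until the check resolves, and is rolled back if the cached version turns out to be stale.

```python
import copy


class SpeculativeProcess:
    """Toy process; `state` stands in for everything a checkpoint must capture."""

    def __init__(self):
        self.state = {}
        self.visible_output = []    # output an outside observer can see
        self._pending_output = []   # output held back while speculating
        self._checkpoint = None     # saved state, or None when not speculating

    def begin_speculation(self):
        # Assumption of this sketch: only one speculation is open at a time.
        self._checkpoint = copy.deepcopy(self.state)

    def emit(self, text):
        # Speculative output is buffered so that it can be discarded on rollback.
        if self._checkpoint is not None:
            self._pending_output.append(text)
        else:
            self.visible_output.append(text)

    def commit(self):
        self.visible_output.extend(self._pending_output)
        self._pending_output.clear()
        self._checkpoint = None

    def rollback(self):
        self.state = self._checkpoint
        self._pending_output.clear()
        self._checkpoint = None


def read_cached(proc, cache, server, name):
    """Return cached data at once, plus a callback that resolves the check.

    `cache` and `server` map a file name to a (version, data) pair.
    """
    cached_version, cached_data = cache[name]
    proc.begin_speculation()

    def resolve():
        # Stands in for the arrival of the server's reply to the poll.
        server_version, server_data = server[name]
        if server_version == cached_version:
            proc.commit()
            return True
        cache[name] = (server_version, server_data)  # refresh the stale entry
        proc.rollback()
        return False

    return cached_data, resolve


server = {"f": (2, "new")}
cache = {"f": (1, "old")}          # stale entry: the speculation will fail
proc = SpeculativeProcess()
data, resolve = read_cached(proc, cache, server, "f")
proc.state["seen"] = data
proc.emit("read " + data)
succeeded = resolve()              # rolls back; nothing becomes visible
```

In this model the process never waits for the round trip before continuing; the cost of a wrong guess is the discarded work between the checkpoint and the rollback.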
Chang and Gibson and Fraser and Chang use speculative execution to generate I/O prefetch hints for a local file system. In their work, the speculative thread executes only when the foreground process blocks on disk I/O. When the speculative thread attempts to read from disk, a prefetch hint is generated and fake data is returned so that the thread can continue to execute. Their work improves read performance through prefetching, whereas Speculator improves read performance by reducing the cost of cache coherence. Speculator also allows write-path optimizations such as group commit. In Speculator, speculative processes commit their work in the common case where speculations are correct. However, since Chang’s speculative threads do not see correct file data, any computation done by a speculative thread must be later re-done by a non-speculative thread. Speculator also allows multiple processes to participate in a speculation; Chang’s speculative threads are not allowed to communicate with other processes.
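One way to picture multiple processes participating in a speculation is to track which unresolved speculations each process depends on and to propagate those dependencies whenever processes communicate. The sketch below is illustrative only: `Proc`, `send`, and `resolve` are hypothetical names, and a real system would restore a checkpoint where this model merely sets a flag.

```python
class Proc:
    """Toy process that records the unresolved speculations it depends on."""

    def __init__(self, name):
        self.name = name
        self.deps = set()          # ids of speculations not yet resolved
        self.rolled_back = False

    def speculate(self, spec_id):
        # The process starts depending on a new, unverified assumption.
        self.deps.add(spec_id)

    def send(self, receiver):
        # Data from a speculative sender makes the receiver speculative too:
        # it inherits every dependency the sender currently has.
        receiver.deps |= self.deps


def resolve(procs, spec_id, succeeded):
    """Resolve one speculation; return the names of the affected processes."""
    affected = [p for p in procs if spec_id in p.deps]
    for p in affected:
        p.deps.discard(spec_id)
        if not succeeded:
            p.rolled_back = True   # placeholder for restoring a checkpoint
    return [p.name for p in affected]


a, b, c = Proc("a"), Proc("b"), Proc("c")
a.speculate("s1")
a.send(b)                          # b now depends on s1 as well
affected = resolve([a, b, c], "s1", succeeded=False)
```

Here `a` and `b` are both marked for rollback when `s1` fails, while `c`, which never received speculative data, is unaffected.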
- The work is supported by the National Science Foundation under award CCR-0509093
- NSF has also supported development of the Blue File System through award CNS-0306251
- Jason Flinn is supported by NSF CAREER award CNS-0346686
- Adya, A., Bolosky, W. J., Castro, M., Cermak, G., Chaiken, R., Douceur, J. R., Howell, J., Lorch, J. R., Theimer, M., and Wattenhofer, R. P. FARSITE: Federated, available, and reliable storage for an incompletely trusted environment. In Proceedings of the 5th Symposium on Operating Systems Design and Implementation (Boston, MA, December 2002), pp. 1–14.
- Birrell, A. D., Hisgen, A., Jerian, C., Mann, T., and Swart, G. The Echo distributed file system. Tech. Rep. 111, Digital Equipment Corporation, Palo Alto, CA, USA, October 1993.
- Callaghan, B., Pavlowski, B., and Staubach, P. NFS Version 3 Protocol Specification. Tech. Rep. RFC 1813, IETF, June 1995.
- Carson, M. Adaptation and Protocol Testing through Network Emulation. NIST, http://snad.ncsl.nist.gov/itg/nistnet/slides/index.htm.
- Castro, M., and Liskov, B. Proactive recovery in a byzantine-fault-tolerant system. In Proceedings of the 4th Symposium on Operating Systems Design and Implementation (San Diego, CA, October 2000).
- Chang, F., and Gibson, G. Automatic I/O hint generation through speculative execution. In Proceedings of the 3rd Symposium on Operating Systems Design and Implementation (New Orleans, LA, February 1999), pp. 1–14.
- Cheriton, D., and Duda, K. Logged virtual memory. In Proceedings of the 15th ACM Symposium on Operating Systems Principles (Copper Mountain, CO, December 1995), pp. 26–39.
- Elnozahy, E. N., Alvisi, L., Wang, Y.-M., and Johnson, D. B. A survey of rollback-recovery protocols in message-passing systems. ACM Computing Surveys 34, 3 (September 2002), 375–408.
- Franklin, M., and Sohi, G. ARB: A hardware mechanism for dynamic reordering of memory references. IEEE Transactions on Computers 45, 5 (May 1996), 552–571.
- Fraser, K., and Chang, F. Operating system I/O speculation: How two invocations are faster than one. In Proceedings of the 2003 USENIX Technical Conference (San Antonio, TX, June 2003), pp. 325–338.
- Haerder, T., and Reuter, A. Principles of Transaction-Oriented Database Recovery. ACM Computing Surveys 15, 4 (December 1983), 287–317.
- Hammond, L., Willey, M., and Olukotun, K. Data speculation support for a chip multiprocessor. In Proc. of the 8th Intl. ACM Conf. on Arch. Support for Programming Languages and Operating Systems (San Jose, CA, October 1998), pp. 58–69.
- Howard, J. H., Kazar, M. L., Menees, S. G., Nichols, D. A., Satyanarayanan, M., Sidebotham, R. N., and West, M. J. Scale and performance in a distributed file system. ACM Transactions on Computer Systems 6, 1 (February 1988).
- Jefferson, D. Virtual time. ACM Transactions on Programming Languages and Systems 7, 3 (July 1985), 404–425.
- Jefferson, D., Beckman, B., Wieland, F., Blume, L., DiLoreto, M., Hontalas, P., Laroche, P., Sturdevant, K., Tupman, J., Warren, V., Weidel, J., Younger, H., and Bellenot, S. Time Warp operating system. In Proceedings of the 11th ACM Symposium on Operating Systems Principles (Austin, TX, November 1987), pp. 77–93.
- Katcher, J. PostMark: A new file system benchmark. Tech. Rep. TR3022, Network Appliance, 1997.
- King, S. T., and Chen, P. M. Backtracking intrusions. In Proceedings of the 19th ACM Symposium on Operating Systems Principles (Bolton Landing, NY, October 2003), pp. 223–236.
- Kistler, J. J., and Satyanarayanan, M. Disconnected operation in the Coda file system. ACM Transactions on Computer Systems 10, 1 (February 1992).
- Lamport, L. Time, clocks, and the ordering of events in a distributed system. Commun. ACM 21, 7 (1978), 558–565.
- Li, J., Krohn, M., Mazieres, D., and Shasha, D. Secure untrusted data repository (SUNDR). In Proceedings of the 6th Symposium on Operating Systems Design and Implementation (San Francisco, CA, December 2004), pp. 121–136.
- Liskov, B., and Rodrigues, R. Transactional file systems can be fast. In Proceedings of the 11th SIGOPS European Workshop (Leuven, Belgium, September 2004).
- Mazieres, D., Kaminsky, M., Kaashoek, M. F., and Witchel, E. Separating key management from file system security. In Proceedings of the 17th ACM Symposium on Operating Systems Principles (Kiawah Island, SC, December 1999), pp. 124–139.
- Nelson, M. N., Welsh, B. B., and Ousterhout, J. K. Caching in the Sprite network file system. ACM Transactions on Computer Systems 6, 1 (1988), 134–154.
- Nightingale, E. B., and Flinn, J. Energy-efficiency and storage flexibility in the Blue File System. In Proceedings of the 6th Symposium on Operating Systems Design and Implementation (San Francisco, CA, December 2004), pp. 363–378.
- Rosenblum, M., Bugnion, E., Herrod, S. A., Witchel, E., and Gupta, A. The impact of architectural trends on operating system performance. In Proceedings of the 15th ACM Symposium on Operating Systems Principles (Copper Mountain, CO, December 1995), pp. 285–298.
- Schmuck, F., and Wyllie, J. Experience with transactions in QuickSilver. In Proceedings of the 13th ACM Symposium on Operating Systems Principles (October 1991), pp. 239–253.
- Shepler, S., Callaghan, B., Robinson, D., Thurlow, R., Beame, C., Eisler, M., and Noveck, D. Network File System (NFS) version 4 Protocol. Tech. Rep. RFC 3530, IETF, April 2003.
- Spector, A. Z., Daniels, D., Duchamp, D., Eppinger, J. L., and Pauch, R. Distributed transactions for reliable systems. In Proceedings of the 10th ACM Symposium on Operating Systems Principles (Orcas Island, WA, December 1985), pp. 127–146.
- Srinivasan, S., Andrews, C., Kandula, S., and Zhou, Y. Flashback: A light-weight extension for rollback and deterministic replay for software debugging. In Proceedings of the 2004 USENIX Technical Conference (Boston, MA, June 2004).
- Srinivasan, V., and Mogul, J. Spritely NFS: Experiments with cache consistency protocols. In Proceedings of the 12th ACM Symposium on Operating System Principles (December 1989), pp. 45–57.
- Steffan, J. G., Colohan, C. B., Zhai, A., and Mowry, T. C. A scalable approach to thread-level speculation. In Proceedings of the 27th Annual International Symposium on Computer Architecture (ISCA) (Vancouver, Canada, June 2000), pp. 1–24.
- Weinstein, M. J., Page, T. W., Jr., Livezey, B. K., and Popek, G. J. Transactions and synchronization in a distributed operating system. In Proceedings of the 10th ACM Symposium on Operating Systems Principles (Orcas Island, WA, December 1985), pp. 115–126.
- Zhang, Y., Rauchwerger, L., and Torrellas, J. Hardware for speculative parallelization of partially-parallel loops in DSM multiprocessors. In Proc. of the 5th Intl. Symposium on High Performance Computer Architecture (Orlando, FL, January 1999), p. 135.
- Zhu, N., and Chiueh, T. Design, implementation and evaluation of the Repairable File Service. In Proceedings of the International Conference on Dependable Systems and Networks (San Francisco, CA, June 2003).