We describe how software and hardware schemes can combine seamlessly into a hybrid system in support of transactional programs, allowing use of low-cost hardware TM when it works, but reverting to software transactional memory when it doesn’t.

Hybrid STM/HTM for nested transactions on OpenJDK.

OOPSLA, pp.660-676, (2016)


Abstract

Transactional memory (TM) has long been advocated as a promising pathway to more automated concurrency control for scaling concurrent programs running on parallel hardware. Software TM (STM) has the benefit of being able to run general transactional programs, but at the significant cost of overheads imposed to log memory accesses, mediate...

Introduction
  • Transactional memory (TM) allows programmers to group memory operations into transactions that appear to execute atomically: no transaction sees the intermediate states of other transactions executing in other threads, and all work of a transaction either happens or not.
  • A natural form of nesting for transactional constructs in a programming language is linear nesting, which allows a parent transaction to invoke a sequence of sub-operations, some of which may themselves execute as child subtransactions.
  • How these subtransactions are managed may vary, so long as the atomicity of the parent transaction is preserved.
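
The nesting behaviour described above can be illustrated with a toy model. The following is a minimal, single-threaded sketch, assuming a write-buffer design in which each nesting level keeps its own buffer; the class `ClosedNestingSketch` and its methods are hypothetical names chosen for illustration, not the paper's implementation. It models only the all-or-nothing and nesting structure, not isolation between threads.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Toy, single-threaded model of linear closed nesting (hypothetical names). */
final class ClosedNestingSketch {
    // Committed state, visible outside any transaction.
    private final Map<String, Integer> memory = new HashMap<>();
    // One write buffer per active nesting level; the head is the innermost level.
    private final Deque<Map<String, Integer>> buffers = new ArrayDeque<>();

    /** Starts a top-level transaction, or a child if one is already active. */
    void begin() {
        buffers.push(new HashMap<>());
    }

    /** Reads the innermost buffered value, falling back to committed memory. */
    int read(String key) {
        for (Map<String, Integer> buffer : buffers) {
            Integer value = buffer.get(key);
            if (value != null) {
                return value;
            }
        }
        return committed(key);
    }

    /** Buffers a write in the innermost active transaction. */
    void write(String key, int value) {
        if (buffers.isEmpty()) {
            throw new IllegalStateException("no active transaction");
        }
        buffers.peek().put(key, value);
    }

    /** A child merges its writes into its parent; the top level publishes them. */
    void commit() {
        Map<String, Integer> finished = buffers.pop();
        if (buffers.isEmpty()) {
            memory.putAll(finished);
        } else {
            buffers.peek().putAll(finished);
        }
    }

    /** Discards only the innermost transaction's writes; the parent continues. */
    void abort() {
        buffers.pop();
    }

    /** Value visible outside all transactions (0 if never committed). */
    int committed(String key) {
        return memory.getOrDefault(key, 0);
    }
}
```

In this sketch a child's abort discards only the child's buffer, and a child's commit becomes visible to the outside only when the top-level parent commits, which is one way to preserve the parent's atomicity.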
Highlights
  • Transactional memory (TM) allows programmers to group memory operations into transactions that appear to execute atomically: no transaction sees the intermediate states of other transactions executing in other threads, and all work of a transaction either happens or not
  • We describe heuristics used to choose between hardware TM (HTM) and software TM (STM) dynamically and automatically, while allowing the transition back to HTM opportunistically
  • Using a standard synthetic benchmark we demonstrate that HTM offers significant acceleration of both closed and open nested transactions, while yielding parallel scaling up to the limits of the hardware, whereupon scaling in software continues but with the penalty to throughput imposed by software mechanisms
  • Using the Intel Software Development Emulator (SDE), we found these aborts to be caused by execution of instructions that are incompatible with Transactional Synchronization Extensions (TSX) [14], namely FXRSTOR and FXSAVE, which are compiled into HotSpot’s run-time stubs used to control dynamic optimization and linking, and to resolve Java static and virtual method calls
  • Our experiments explore a range of structured transactions, namely flat, closed, open, and boosted, in software transactional memory (STM)-only mode and in self-tuning hybrid HTM/STM mode
  • Our results demonstrate the utility of nesting as a means to achieving reliably scalable concurrent manipulation of data structures using open/closed nesting, without the need for hand-tuned and hand-coded non-blocking implementations
Methods
  • The authors' experiments explore a range of structured transactions, namely flat, closed, open, and boosted, in STM-only mode and in self-tuning hybrid HTM/STM mode.
  • The benchmark parameters include the initial number of elements added to the data structure before measurement begins and the number of iterations of the benchmark.
  • The authors first pin threads to different cores on one socket before assigning threads to different hyperthreads of the same core.
  • Exploratory experiments showed this strategy to be clearly the best.
Results
  • The authors present results for executing the workload under different transaction implementations.
  • A common theme in the results is that open nesting and boosting do not perform well when the transaction size is small.
  • This is because these transaction forms carry a certain amount of overhead, which is prominent at transaction size 1, for example.
  • For each nested operation, the inner transaction needs to create an abort handler and log it.
  • These costs become smaller in a relative sense as transaction size increases, giving these forms better performance and scaling at larger transaction sizes.
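
The per-operation cost mentioned above can be pictured with a small sketch. This is a minimal, single-threaded illustration, assuming that each nested operation takes effect immediately and logs a compensating action for the parent; the class `AbortHandlerLogSketch` and its methods are hypothetical and are not taken from the paper's code. A plain `HashSet` stands in for the concurrent data structure.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

/** Toy model of abort-handler logging for open-nested or boosted operations. */
final class AbortHandlerLogSketch {
    // Stand-in for the shared data structure (assumption: a simple set of ints).
    private final Set<Integer> set = new HashSet<>();
    // Compensating actions logged by nested operations, most recent first.
    private final Deque<Runnable> abortHandlers = new ArrayDeque<>();

    /** Nested add: takes effect at once and logs its inverse for the parent. */
    boolean add(int x) {
        boolean changed = set.add(x);
        if (changed) {
            abortHandlers.push(() -> set.remove(x));
        }
        return changed;
    }

    /** Nested remove: takes effect at once and logs its inverse for the parent. */
    boolean remove(int x) {
        boolean changed = set.remove(x);
        if (changed) {
            abortHandlers.push(() -> set.add(x));
        }
        return changed;
    }

    /** Parent commit: the logged handlers are no longer needed. */
    void commitParent() {
        abortHandlers.clear();
    }

    /** Parent abort: run the handlers in reverse order of logging. */
    void abortParent() {
        while (!abortHandlers.isEmpty()) {
            abortHandlers.pop().run();
        }
    }

    boolean contains(int x) {
        return set.contains(x);
    }

    int loggedHandlers() {
        return abortHandlers.size();
    }
}
```

Every successful nested operation allocates and logs one handler, so a parent transaction containing a single operation pays that bookkeeping in full, whereas a larger parent spreads its own fixed costs over more operations.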
Conclusion
  • The authors' results demonstrate the utility of nesting as a means to achieving reliably scalable concurrent manipulation of data structures using open/closed nesting, without the need for hand-tuned and hand-coded non-blocking implementations.
  • As long as the underlying data structure is friendly to transactions, it can be nested.
  • The authors' results indicate the degree to which hyperthreading degrades performance of HTM schemes due to the need to share capacity between hyperthreads on the same core.
  • Programmers must choose carefully which technique to employ, depending on the nature of their programs.
Tables
  • Table 1: Synchrobench parameters for experiments as a reference point. We conducted all experiments using the extended version of Synchrobench described in Section 4 with the parameters shown in Table 1.
Related work
  • We now briefly discuss other related work, before describing in later sections how to present our abstractions to the programmer, how they can be implemented so as to be compatible with and amenable to HTM acceleration, and experiments showing the impact they have on performance.

    There are previous hybrid STM/HTM implementations, such as HyTM [5, 18]. Their approach is similar to ours: they generate separate software paths for HTM and STM, with instrumentation to check the needed metadata. HyTM supported two simple back-off schemes to transition from HTM to STM in the face of failures. In the “immediate fail-over” scheme, a transaction failing in HTM retries itself in STM immediately. In the “back-off” scheme, a transaction failing in HTM retries 10 times before retrying under STM. Since their transactions were very short and had a small memory footprint, their simple approach of trying HTM first for every transaction was a successful policy. Matveev and Shavit [20] describe a similar back-off policy.
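
    The two HyTM schemes described above can be sketched as a single retry policy. This is a minimal sketch under stated assumptions: the hardware attempt is modelled as a `BooleanSupplier` that returns `true` on commit, "retries 10 times" is read as ten further hardware attempts after the first failure, and the class `FallbackPolicySketch` and its parameter names are hypothetical. It does not use real HTM instructions.

```java
import java.util.function.BooleanSupplier;

/** Toy HTM-first policy with a bounded number of hardware retries. */
final class FallbackPolicySketch {
    /** Which path finally executed the transaction. */
    enum Path { HTM, STM }

    /**
     * Tries the hardware path once, then up to htmRetries more times,
     * and finally falls back to the software path.
     *
     * htmRetries = 0 models "immediate fail-over";
     * htmRetries = 10 models the "back-off" scheme (assumed reading).
     */
    static Path run(BooleanSupplier htmAttempt, Runnable stmPath, int htmRetries) {
        for (int attempt = 0; attempt <= htmRetries; attempt++) {
            if (htmAttempt.getAsBoolean()) {
                return Path.HTM; // hardware transaction committed
            }
        }
        stmPath.run(); // software path is assumed always able to complete
        return Path.STM;
    }
}
```

    A caller would pass the hardware-instrumented body as `htmAttempt` and the software-instrumented body as `stmPath`, which mirrors the idea of generating separate paths for HTM and STM.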
Funding
  • This material is based upon work supported by the National Science Foundation under Grants CCF-1408896, CCF-1409284, CNS-1405939, CNS-1161237, and CNS-1162246.
References
  • [1] K. Chapman, A. L. Hosking, J. E. B. Moss, and T. Richards. Closed and open nested atomic actions for Java: Language design and prototype implementation. In International Conference on Principles and Practice of Programming on the Java Platform: Virtual Machines, Languages, and Tools, pages 169–180, Cracow, Poland, Oct. 2014. doi: 10.1145/2647508.2647525.
  • [2] T. Crain, V. Gramoli, and M. Raynal. A contention-friendly methodology for search structures. Research report, INRIA, Feb. 2012. URL https://hal.inria.fr/hal-00668010.
  • [3] T. Crain, V. Gramoli, and M. Raynal. A speculation-friendly binary search tree. In ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pages 161–170, New Orleans, Louisiana, Feb. 2012. doi: 10.1145/2145816.2145837.
  • [4] L. Dalessandro, F. Carouge, S. White, Y. Lev, M. Moir, M. L. Scott, and M. F. Spear. Hybrid NOrec: A case study in the effectiveness of best effort hardware transactional memory. In ACM International Conference on Architectural Support for Programming Languages and Operating Systems, pages 39–52, Newport Beach, California, Mar. 2011. doi: 10.1145/1950365.1950373.
  • [5] P. Damron, A. Fedorova, Y. Lev, V. Luchangco, M. Moir, and D. Nussbaum. Hybrid transactional memory. In ACM International Conference on Architectural Support for Programming Languages and Operating Systems, pages 336–346, San Jose, California, Oct. 2006. doi: 10.1145/1168857.1168900.
  • [6] D. Dice, Y. Lev, M. Moir, and D. Nussbaum. Early experience with a commercial hardware transactional memory implementation. In ACM International Conference on Architectural Support for Programming Languages and Operating Systems, pages 157–168, Washington, DC, Mar. 2009. doi: 10.1145/1508244.1508263.
  • [7] D. Dice, A. Kogan, and Y. Lev. Refined transactional lock elision. In ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pages 19:1–19:12, Barcelona, Spain, Mar. 2016. doi: 10.1145/2851141.2851162.
  • [8] N. Diegues and P. Romano. Self-tuning Intel transactional synchronization extensions. In USENIX International Conference on Autonomic Computing, pages 209–219, Philadelphia, PA, June 2014. URL https://www.usenix.org/conference/icac14/technical-sessions/presentation/diegues.
  • [9] N. Diegues, P. Romano, and L. Rodrigues. Virtues and limitations of commodity hardware transactional memory. In International Conference on Parallel Architectures and Compilation Techniques, pages 3–14, Aug. 2014. doi: 10.1145/2628071.2628080.
  • [10] F. Ellen, P. Fatourou, E. Ruppert, and F. van Breugel. Nonblocking binary search trees. In ACM SIGACT-SIGOPS Symposium on Principles of Distributed Computing, pages 131–140, Zürich, Switzerland, July 2010. doi: 10.1145/1835698.1835736.
  • [11] V. Gramoli. More than you ever wanted to know about synchronization: Synchrobench, measuring the impact of the synchronization on concurrent algorithms. In ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pages 1–10, San Francisco, California, Feb. 2015. doi: 10.1145/2688500.2688501.
  • [12] M. Herlihy and E. Koskinen. Transactional boosting: A methodology for highly-concurrent transactional objects. In ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pages 207–216, Salt Lake City, Utah, Feb. 2008. doi: 10.1145/1345206.1345237.
  • [13] M. P. Herlihy and J. M. Wing. Linearizability: A correctness criterion for concurrent objects. ACM Trans. Prog. Lang. Syst., 12(3):463–492, July 1990. doi: 10.1145/78969.78972.
  • [14] Intel. Intel Transactional Synchronization Extensions (Intel TSX) Programming Considerations. URL https://software.intel.com/en-us/node/582935.
  • [15] C. Jacobi, T. Slegel, and D. Greiner. Transactional memory architecture and implementation for IBM system Z. In International Symposium on Microarchitecture, pages 25–36, Dec. 2012. doi: 10.1109/MICRO.2012.12.
  • [16] A. Kasko, S. Kobylyanskiy, and A. Mironchenko. OpenJDK Cookbook. Packt Publishing, Jan. 2015. ISBN 1849698406.
  • [17] G. Korland, N. Shavit, and P. Felber. Noninvasive concurrency with Java STM. In Workshop on Programmability Issues for Heterogeneous Multicores, Pisa, Italy, Jan. 2010. URL http://www.velox-project.eu/sites/default/files/multiprog1.pdf.
  • [18] S. Kumar, M. Chu, C. J. Hughes, P. Kundu, and A. Nguyen. Hybrid transactional memory. In ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pages 209–220, Mar. 2006. doi: 10.1145/1122971.1123003.
  • [19] Y. Lev, M. Moir, and D. Nussbaum. PhTM: Phased transactional memory. In ACM SIGPLAN Workshop on Transactional Computing, Aug. 2007. URL https://www.cs.rochester.edu/meetings/TRANSACT07/papers/lev.pdf.
  • [20] A. Matveev and N. Shavit. Reduced hardware NOrec: A safe and scalable hybrid transactional memory. In ACM International Conference on Architectural Support for Programming Languages and Operating Systems, pages 59–71, Istanbul, Turkey, Mar. 2015. doi: 10.1145/2694344.2694393.
  • [21] J. E. B. Moss. Nested transactions: an approach to reliable distributed computing. PhD thesis, Massachusetts Institute of Technology, Cambridge, Massachusetts, 1981.
  • [22] J. E. B. Moss and A. L. Hosking. Nested transactional memory: model and architecture sketches. Science of Computer Programming, 63:186–201, Dec. 2006. doi: 10.1016/j.scico.2006.05.010.
  • [23] Y. Ni, V. Menon, A.-R. Adl-Tabatabai, A. L. Hosking, R. L. Hudson, J. E. B. Moss, B. Saha, and T. Shpeisman. Open nesting in software transactional memory. In ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pages 68–78, San Jose, California, Mar. 2007. doi: 10.1145/1229428.1229442.
  • [24] M. Paleczny, C. Vick, and C. Click. The Java Hotspot server compiler. In USENIX Java Virtual Machine Research and Technology Symposium, Monterey, California, Apr. 2001. URL https://www.usenix.org/legacy/events/jvm01/full_papers/paleczny/paleczny.pdf.
  • [25] R. Rajwar and J. R. Goodman. Speculative lock elision: enabling highly concurrent multithreaded execution. In International Symposium on Microarchitecture, pages 294–305, Austin, Texas, Dec. 2001. doi: 10.1109/MICRO.2001.991127.
  • [26] T. Riegel, P. Marlier, M. Nowack, P. Felber, and C. Fetzer. Optimizing hybrid transactional memory: The importance of nonspeculative operations. In ACM Symposium on Parallelism in Algorithms and Architectures, pages 53–64, San Jose, California, June 2011. doi: 10.1145/1989493.1989501.
  • [27] R. M. Yoo, C. J. Hughes, K. Lai, and R. Rajwar. Performance evaluation of Intel transactional synchronization extensions for high-performance computing. In International Conference on High Performance Computing, Networking, Storage and Analysis, pages 19:1–19:11, Denver, Colorado, Nov. 2013. doi: 10.1145/2503210.2503232.