Multi-core Implementations of the Concurrent Collections Programming Model

semanticscholar(2008)

Abstract
In this paper we introduce the Concurrent Collections programming model, which builds on past work on TStreams [8]. In this model, programs are written in terms of high-level, application-specific operations. These operations are partially ordered according to their semantic constraints only, and the partial orderings correspond to data flow and control flow.

This approach supports an important separation of concerns. There are two roles involved in implementing a parallel program. One is the domain expert, the developer whose interest and expertise is in the application domain, such as finance, genomics, or numerical analysis. The other is the tuning expert, whose interest and expertise is in performance, including performance on a particular platform. These may be distinct individuals or the same individual at different stages of application development. The tuning expert may in fact be software, such as a static or dynamic optimizing compiler. The Concurrent Collections programming model separates the work of the domain expert (expressing the semantics of the computation) from the work of the tuning expert (selecting the actual parallelism and mapping it to a specific architecture). This separation simplifies the task of the domain expert: writing in this model does not require any reasoning about parallelism or any understanding of the target architecture, so the domain expert is concerned only with his or her area of expertise, the semantics of the application. The separation also simplifies the work of the tuning expert, who is given the maximum possible freedom to map the computation onto the target architecture and is not required to have any understanding of the domain (as is often the case for compilers).

We describe two implementations of the Concurrent Collections programming model. One is Intel® Concurrent Collections for C/C++, based on Intel® Threading Building Blocks. The other is an X10-based implementation from the Habanero project at Rice University. We compare the implementations by showing the results achieved on multi-core SMP machines when executing the same Concurrent Collections application, Cholesky factorization, under both implementations.
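
The separation described above can be illustrated with a minimal sketch. It is not the Concurrent Collections API or either implementation from the paper; the step table `STEPS`, the function `run`, and its parameters `max_workers` and `reverse` are hypothetical names chosen for illustration. The "domain" part declares only what each step consumes and produces, and the "tuning" part decides the order and degree of parallelism. Only data-flow constraints are modeled here; control flow is left out.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical step table (illustration only, not the CnC API):
# step name -> (names of input items, function producing the step's output item).
# This is the "domain expert" view: no threads, no ordering beyond data dependencies.
STEPS = {
    "a": ((), lambda: 3),
    "b": ((), lambda: 4),
    "sq_a": (("a",), lambda a: a * a),
    "sq_b": (("b",), lambda b: b * b),
    "sum": (("sq_a", "sq_b"), lambda x, y: x + y),
}


def run(steps, max_workers=1, reverse=False):
    """'Tuning expert' view: run each step once its inputs exist.

    max_workers and reverse change the schedule but not the result,
    because the only ordering enforced is the data-flow partial order.
    """
    items = {}            # items produced so far, keyed by step name
    pending = dict(steps)  # steps that have not run yet
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while pending:
            # A step is ready when all of its input items have been produced.
            ready = sorted(
                (name for name, (inputs, _) in pending.items()
                 if all(i in items for i in inputs)),
                reverse=reverse,
            )
            if not ready:
                raise ValueError("cyclic or unsatisfiable dependencies")
            # Submit every ready step; the pool may run them concurrently.
            futures = {
                name: pool.submit(pending[name][1],
                                  *(items[i] for i in pending[name][0]))
                for name in ready
            }
            for name, future in futures.items():
                items[name] = future.result()
                del pending[name]
    return items
```

Calling `run(STEPS)` and `run(STEPS, max_workers=4, reverse=True)` uses different schedules over the same step table, which is the point of the sketch: the declaration of the computation does not change when the mapping to hardware does.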