POSTER: Location-Aware Computation Mapping for Manycore Processors
2017 26th International Conference on Parallel Architectures and Compilation Techniques (PACT)(2017)
Abstract
Employing an on-chip network in a manycore system (to improve scalability) makes the latencies of data accesses issued by a core non-uniform, which significantly impacts application performance. This paper presents a compiler strategy that exposes architecture information to the compiler to enable optimized computation-to-core mapping. Our scheme takes into account the relative positions of (and distances between) cores, last-level caches (LLCs) and memory controllers (MCs) in a manycore system, and generates a mapping of computations to cores with the goal of minimizing the on-chip network traffic. Our experiments with 12 multi-threaded applications reveal that, on average, our approach reduces the on-chip network latency in a 6x6 manycore system by 49.5% in the case of private LLCs and 52.7% in the case of shared LLCs. These improvements translate to execution time improvements of 14.8% and 15.2% for the private-LLC and shared-LLC systems, respectively.
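To make the idea of distance-aware mapping concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a 2D mesh in which the hop count between two nodes is their Manhattan distance, and it uses a simple greedy heuristic: each computation is placed on the free core that minimizes its access-weighted hop count to the LLC banks and MCs it touches. The names `hops`, `traffic_cost`, and `map_computations`, as well as the per-computation access counts, are hypothetical and introduced only for illustration.

```python
from typing import Dict, Tuple

Coord = Tuple[int, int]  # (row, column) of a node on the mesh


def hops(a: Coord, b: Coord) -> int:
    """Manhattan distance, used here as a proxy for on-chip network hops."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def traffic_cost(core: Coord, accesses: Dict[Coord, int]) -> int:
    """Access-weighted hop count from a core to the LLC/MC nodes it uses."""
    return sum(count * hops(core, target) for target, count in accesses.items())


def map_computations(
    tasks: Dict[str, Dict[Coord, int]], mesh_dim: int = 6
) -> Dict[str, Coord]:
    """Greedy placement (an assumed heuristic): heaviest computation first,
    each onto the free core with the lowest traffic cost.
    Assumes at most mesh_dim * mesh_dim computations, one per core."""
    free = [(r, c) for r in range(mesh_dim) for c in range(mesh_dim)]
    mapping: Dict[str, Coord] = {}
    by_weight = sorted(tasks, key=lambda name: -sum(tasks[name].values()))
    for name in by_weight:
        best = min(free, key=lambda core: traffic_cost(core, tasks[name]))
        mapping[name] = best
        free.remove(best)
    return mapping
```

For example, a computation that issues 100 accesses to the node at (0, 0) and one access to the node at (5, 5) is placed at (0, 0), and a lighter computation that also targets (0, 0) falls back to an adjacent free core. A real compiler pass would derive the access counts from program analysis and would also have to account for routing and contention, which this sketch ignores.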
Keywords
location-aware computation mapping,manycore processors,data accesses,core nonuniform,significant impact application performance,compiler strategy,architecture information,optimized computation-to-core mapping,memory controllers,on-chip network traffic,6x6 manycore system,corresponding execution time improvements,multithreaded applications,shared LLC,MC