Efficient Data Communication between CPU and GPU through Transparent Partial-Page Migration

IEEE 20TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS / IEEE 16TH INTERNATIONAL CONFERENCE ON SMART CITY / IEEE 4TH INTERNATIONAL CONFERENCE ON DATA SCIENCE AND SYSTEMS (HPCC/SMARTCITY/DSS)(2018)

Abstract
Despite increasing investment in integrated GPUs and next-generation interconnect research, discrete GPUs connected by PCI Express still dominate the market, and the management of data communication between CPU and GPU continues to evolve. Initially, the programmer controlled data transfer between CPU and GPU explicitly. To simplify programming and enable system-wide atomic memory operations, GPU vendors have developed a programming model that provides a single virtual address space. The page migration engine in this model automatically migrates pages between CPU and GPU on demand. To meet the needs of high-performance workloads, page sizes tend to grow. Because the interconnect has low bandwidth and high latency, migrating a larger page takes longer, which may reduce the overlap of computation and transmission and cause a serious performance decline. In this paper, we propose partial-page migration, which migrates only the requested part of a page to shorten migration latency and avoid the performance degradation of whole-page migration as pages become larger. Experiments show that partial-page migration can significantly hide the performance overhead of whole-page migration when the page size is 2MB and the PCI Express bandwidth is 16GB/sec, converting an average 72.72x slowdown into a 1.29x speedup compared with programmer-controlled data transmission. Additionally, we examine the impact of page size on TLB miss rate and the impact of migration unit size on execution time, enabling designers to make informed decisions.
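To give a rough sense of why page size matters at the bandwidth quoted in the abstract, the following back-of-envelope sketch computes the ideal time to move one migration unit over a 16 GB/sec link. It is an illustration only, not the paper's model: it ignores link latency, protocol overhead, and fault handling, it treats 16 GB/sec as 16 × 10^9 bytes per second, and the 64 KiB partial-page unit and the 4 KiB small page are hypothetical sizes chosen for comparison.

```python
KIB = 1024
MIB = 1024 * KIB


def transfer_time_us(size_bytes: int, bandwidth_gb_per_s: float = 16.0) -> float:
    """Ideal time in microseconds to move size_bytes over the link.

    Assumes a purely bandwidth-bound transfer (no latency or overhead),
    with 1 GB taken as 10**9 bytes.
    """
    bytes_per_us = bandwidth_gb_per_s * 1e9 / 1e6
    return size_bytes / bytes_per_us


# Whole-page migration of a 2 MiB page, as in the abstract's configuration.
whole_page_us = transfer_time_us(2 * MIB)

# Hypothetical partial-page unit of 64 KiB (an assumption, not from the paper).
partial_page_us = transfer_time_us(64 * KIB)

# Hypothetical 4 KiB small page, shown only for scale.
small_page_us = transfer_time_us(4 * KIB)

if __name__ == "__main__":
    print(f"2 MiB whole page : {whole_page_us:.3f} us")
    print(f"64 KiB partial   : {partial_page_us:.3f} us")
    print(f"4 KiB small page : {small_page_us:.3f} us")
```

Under these assumptions a faulting thread waits roughly 131 microseconds for a whole 2 MiB page but only about 4 microseconds for a 64 KiB piece, which is the gap that partial-page migration aims to exploit.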
Keywords
unified memory, data communication, partial page migration