A performance and energy comparison of convolution on GPUs, FPGAs, and multicore processors

TACO (2013)

Abstract
Recent architectural trends have focused on increased parallelism via multicore processors and increased heterogeneity via accelerator devices (e.g., graphics-processing units, field-programmable gate arrays). Although these architectures have significant performance and energy potential, application designers face many device-specific challenges when choosing an appropriate accelerator or when customizing an algorithm for an accelerator. To help address this problem, in this article we thoroughly evaluate convolution, one of the most common operations in digital-signal processing, on multicores, graphics-processing units, and field-programmable gate arrays. Whereas many previous application studies evaluate a specific usage of an application, this article assists designers with design space exploration for numerous use cases by analyzing effects of different input sizes, different algorithms, and different devices, while also determining Pareto-optimal trade-offs between performance and energy.
Keywords
different device, different algorithm, appropriate accelerator, different input size, field-programmable gate array, graphics-processing unit, accelerator device, multicore processor, previous application study, energy comparison, energy potential, application designer