FCUDA-NoC: A Scalable and Efficient Network-on-Chip Implementation for the CUDA-to-FPGA Flow.

IEEE Transactions on Very Large Scale Integration (VLSI) Systems (2016)

Citations: 21 | Views: 118
Abstract
High-level synthesis (HLS) of data-parallel input languages, such as the Compute Unified Device Architecture (CUDA), enables efficient description and implementation of independent computation cores. HLS tools can effectively translate the many threads of computation present in the parallel descriptions into independent, optimized cores. The generated hardware cores often heavily share input data ...
Keywords
Graphics processing units,Bandwidth,Kernel,Ports (Computers),Field programmable gate arrays,Hardware,Parallel processing