CSB-RNN: A Super Real-time RNN Framework with Compressed Structured Block

Semantic Scholar (2020)

Abstract
This paper presents CSB-RNN, an optimized full-stack RNN framework with the novel compressed structured block (CSB) technique. A CSB-pruned RNN model combines fine granularity, which benefits the pruning rate, with a regular structure, which facilitates hardware parallelism. We further propose a novel hardware architecture for inference on the CSB-pruned model, which solves the block-workload imbalance issue and achieves over 95% hardware utilization. CSB-RNN improves the pruning rate by 1.7×-3.6× compared with the prior art. With the addition of the novel architecture, compressed-RNN inference reaches a super real-time latency of 23μs-67μs on an FPGA implementation.
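The abstract does not spell out the pruning procedure, so the following is only a minimal sketch of one way to combine fine granularity with regular structure. It assumes the weight matrix is split into fixed-size blocks and that, inside each block, only the strongest rows and columns survive. The function names, the parameters (`block_h`, `block_w`, `keep_rows`, `keep_cols`), and the squared-L2-norm selection rule are all hypothetical and are not taken from the paper.

```python
from typing import List

Matrix = List[List[float]]


def prune_block_structured(weights: Matrix, block_h: int, block_w: int,
                           keep_rows: int, keep_cols: int) -> Matrix:
    """Zero all but the strongest rows and columns inside each block.

    Illustrative only: the selection rule (squared L2 norm) and all
    parameter names are assumptions, not the paper's algorithm.
    """
    n_rows, n_cols = len(weights), len(weights[0])
    pruned = [[0.0] * n_cols for _ in range(n_rows)]
    for r0 in range(0, n_rows, block_h):
        for c0 in range(0, n_cols, block_w):
            rows = range(r0, min(r0 + block_h, n_rows))
            cols = range(c0, min(c0 + block_w, n_cols))
            # Score each row and column of this block by squared L2 norm.
            row_score = {r: sum(weights[r][c] ** 2 for c in cols) for r in rows}
            col_score = {c: sum(weights[r][c] ** 2 for r in rows) for c in cols}
            kept_r = sorted(rows, key=row_score.get, reverse=True)[:keep_rows]
            kept_c = sorted(cols, key=col_score.get, reverse=True)[:keep_cols]
            # Surviving weights form a small dense kernel within the block.
            for r in kept_r:
                for c in kept_c:
                    pruned[r][c] = weights[r][c]
    return pruned


def sparsity(matrix: Matrix) -> float:
    """Fraction of entries that are exactly zero."""
    total = sum(len(row) for row in matrix)
    zeros = sum(1 for row in matrix for v in row if v == 0.0)
    return zeros / total


if __name__ == "__main__":
    w = [[1.0, 2.0, 3.0, 4.0],
         [5.0, 6.0, 7.0, 8.0],
         [9.0, 10.0, 11.0, 12.0],
         [13.0, 14.0, 15.0, 16.0]]
    p = prune_block_structured(w, block_h=2, block_w=2, keep_rows=1, keep_cols=1)
    for row in p:
        print(row)
    print("sparsity:", sparsity(p))
```

In this sketch each block may keep a different set of rows and columns, which gives per-block flexibility, while the surviving weights in every block still form a small dense kernel of identical shape, which is the kind of regularity that parallel hardware can exploit.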