PeakEngine: A Deterministic On-the-Fly Pruning Neural Network Accelerator for Hearing Instruments

IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS (2024)

Abstract
Recurrent neural networks (RNNs) are well suited to sequential tasks such as speech enhancement (SE). However, their performance comes at the cost of high computational complexity and latency, which impedes their deployment in battery-powered, resource-constrained hearing instruments (HIs) that must operate for 16-18 h daily at only a few milliwatts (mW). In this article, we introduce PeakEngine, a configurable ASIC accelerator that reduces the amount of computation and memory accesses, and thus latency, in a gated recurrent unit (GRU) by means of adaptive inference. The reduction is achieved by on-the-fly pruning that selects the top K elements of both the input and hidden-state sequences based on the magnitudes of their delta changes across timesteps. Since K is constant, execution time is deterministic. PeakEngine is synthesized in a 22-nm CMOS process, and simulations show that it dissipates 11.83 µJ per inference for the baseline (unpruned) network and only 4.14-5.04 µJ for the pruned networks, with degradation in the audio-quality and intelligibility improvement ranging from the maximum acceptable level to none. Moreover, inference is sped up by 2.2-2.97x on average, meeting the real-time requirements of an HI application. To the best of our knowledge, PeakEngine is the first ASIC accelerator for deterministic and dynamic pruning in RNNs targeting HIs and SE.
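The abstract's core idea (selecting a fixed number K of elements per timestep by delta magnitude, as suggested by the "min-heap" and "top K" keywords) can be illustrated with a minimal sketch. This is not the PeakEngine hardware or its published algorithm; the function name, list-based representation, and reference-state update policy below are illustrative assumptions.

```python
import heapq

def topk_delta_prune(x_t, x_ref, k):
    """Illustrative top-K delta pruning for one timestep.

    Only the K elements whose values changed the most since the last
    processed value are kept; all other deltas are treated as zero.
    Because K is fixed, the work per timestep is constant, which is
    what makes the execution time deterministic.
    """
    deltas = [x - r for x, r in zip(x_t, x_ref)]

    # Maintain a min-heap of size K over |delta|: the heap root is the
    # smallest of the current K largest magnitudes, so any new element
    # only needs one comparison against the root.
    heap = []  # entries: (|delta|, index)
    for i, d in enumerate(deltas):
        if len(heap) < k:
            heapq.heappush(heap, (abs(d), i))
        elif abs(d) > heap[0][0]:
            heapq.heapreplace(heap, (abs(d), i))
    selected = {i for _, i in heap}

    # Unselected deltas are zeroed; their reference values are left
    # unchanged so the skipped change accumulates and can exceed the
    # top-K threshold at a later timestep.
    pruned = [deltas[i] if i in selected else 0.0 for i in range(len(deltas))]
    new_ref = [x_t[i] if i in selected else x_ref[i] for i in range(len(x_t))]
    return pruned, new_ref
```

In a delta-network-style GRU, the zeroed deltas would let the accelerator skip the corresponding rows or columns of the weight-matrix multiplications, which is where the reported 2.2-2.97x speedup and energy savings would come from.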
Keywords
Deterministic execution time, dynamic pruning, hardware accelerator, hearing instruments (HIs), min-heap, recurrent neural networks (RNNs), speech enhancement (SE), top K