Fast Interface with Ensemble Ternary Neural Network

2022 IEEE 52nd International Symposium on Multiple-Valued Logic (ISMVL), 2022

Abstract
The use of machine learning is expanding in various applications such as image processing in data centers. With the spread of deep learning, neural-network-based models have frequently been adopted in recent years. Because evaluating machine-learning models on a CPU is slow, a fast, dedicated hardware accelerator is often used. In particular, the demand for hardware accelerators in data centers is increasing, where low power consumption and high-speed processing are required in a limited space. An implementation method for a ternary neural network utilizing the rewritable look-up tables (LUTs) of a field-programmable gate array (FPGA) is proposed. Binary/ternary neural networks, which are quantized to 1–2 bits for mapping to LUTs, suffer from poor recognition accuracy. To prevent a decrease in recognition accuracy, let $q$ be the number of quantization bits to be stored in the LUT; the memory size of the LUT then becomes $O(2^{q})$ bits, which grows exponentially in $q$. We improved the accuracy using an ensemble ternary neural network. There are various ways to select data for ensembles during training, and various ways to select branches to prune in the trained networks; we chose the greedy method for our design. An evaluation using various benchmark datasets showed that the ensemble approach achieved a recognition accuracy equivalent to that of the 32-bit float model. We also estimated the amount of memory required to implement an ensemble ternary neural network with LUTs: the LUT size is 1.9 Mbit, which can be realized on current FPGAs.
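The two ideas in the abstract, ternary quantization of weights to {-1, 0, +1} and accuracy recovery by ensembling several quantized models, can be sketched as below. The thresholding rule, model shapes, and score-averaging ensemble here are illustrative assumptions, not the paper's exact design; the `lut_memory_bits` helper only restates the $O(2^{q})$ table-size observation.

```python
import numpy as np

def ternarize(w, threshold=0.05):
    # Quantize real-valued weights to {-1, 0, +1} by symmetric
    # thresholding (an illustrative scheme, not the paper's method).
    t = np.zeros(w.shape, dtype=np.int8)
    t[w > threshold] = 1
    t[w < -threshold] = -1
    return t

def lut_memory_bits(q):
    # A table addressed by q bits needs 2**q entries,
    # i.e. O(2^q) memory, as noted in the abstract.
    return 2 ** q

def ensemble_predict(models, x):
    # Sum the class scores of each ternary model and take the argmax
    # (one simple way to combine an ensemble; a vote would also work).
    scores = sum(x @ w for w in models)
    return int(np.argmax(scores))

rng = np.random.default_rng(0)
# Five toy ternary "models": 8 inputs, 4 output classes each.
models = [ternarize(rng.normal(size=(8, 4))) for _ in range(5)]
x = rng.normal(size=8)
pred = ensemble_predict(models, x)  # a class index in 0..3
```

Each individual model is coarse, but averaging several independently trained ternary models smooths out quantization errors, which is the mechanism the paper relies on to match 32-bit float accuracy.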
Keywords
Ternary Neural Networks, Ensemble, Embedded System