A Study on the Application of TensorFlow Compression Techniques to Human Activity Recognition

IEEE Access(2023)

引用 0|浏览5
暂无评分
摘要
In the human activity recognition (HAR) application domain, the use of deep learning (DL) algorithms for feature extractions and training purposes delivers significant performance improvements with respect to the use of traditional machine learning (ML) algorithms. However, this comes at the expense of more complex and demanding models, making harder their deployment on constrained devices traditionally involved in the HAR process. The efficiency of DL deployment is thus yet to be explored. We thoroughly investigated the application of TensorFlow Lite simple conversion, dynamic, and full integer quantization compression techniques. We applied those techniques not only to convolutional neural networks (CNNs), but also to long short-term memory (LSTM) networks, and a combined version of CNN and LSTM. We also considered two use case scenarios, namely cascading compression and stand-alone compression mode. This paper reports the feasibility of deploying deep networks onto an ESP32 device, and how TensorFlow compression techniques impact classification accuracy, energy consumption, and inference latency. Results show that in the cascading case, it is not possible to carry out the performance characterization. Whereas in the stand-alone case, dynamic quantization is recommended because yields a negligible loss of accuracy. In terms of power efficiency, both dynamic and full integer quantization provide high energy saving with respect to the uncompressed models: between 31% and 37% for CNN networks, and up to 45% for LSTM networks. In terms of inference latency, dynamic and full integer quantization provide comparable performance.
更多
查看译文
关键词
Compression techniques,deep learning,dynamic quantization,ESP32,full integer quantization,human activity recognition,microcontrollers,pruning
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要