ESPnet-ONNX: Bridging a Gap Between Research and Production

arxiv(2022)

引用 1|浏览20
暂无评分
摘要
In the field of deep learning, researchers often focus on inventing novel neural network models and improving benchmarks. In contrast, application developers are interested in making models suitable for actual products, which involves optimizing a model for faster inference and adapting a model to various platforms (e.g., C++ and Python). In this work, to fill the gap between the two, we establish an effective procedure for optimizing a PyTorch-based research-oriented model for deployment, taking ESPnet, a widely used toolkit for speech processing, as an instance. We introduce different techniques to ESPnet, including converting a model into an ONNX format, fusing nodes in a graph, and quantizing parameters, which lead to approximately 1.3-2$\times$ speedup in various tasks (i.e., ASR, TTS, speech translation, and spoken language understanding) while keeping its performance without any additional training. Our ESPnet-ONNX will be publicly available at https://github.com/espnet/espnet_onnx
更多
查看译文
关键词
production,research,espnet-onnx
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要