a-Tucker: fast input-adaptive and matricization-free Tucker decomposition of higher-order tensors on GPUs

CCF TRANSACTIONS ON HIGH PERFORMANCE COMPUTING (2022)

Abstract
Tucker decomposition is one of the most popular models for analyzing and compressing large-scale tensorial data. Existing Tucker decomposition algorithms usually rely on a single solver to compute the factor matrices and intermediate tensor in a predetermined order, and are not flexible enough to adapt to the diversity of input data and hardware. Moreover, to exploit highly efficient matrix multiplication kernels, most Tucker decomposition implementations rely on explicit matricizations, which can introduce extra data-conversion costs. In this paper, we present a-Tucker, a new framework for input-adaptive and matricization-free Tucker decomposition of higher-order tensors on GPUs. A two-level flexible Tucker decomposition algorithm is proposed to enable switching between different calculation orders and different factor solvers, and a machine-learning-based adaptive order-solver selector is applied to automatically cope with changes in application scenarios. To further improve performance, we implement a-Tucker in a fully matricization-free manner, without any conversion between tensors and matrices. Experiments show that a-Tucker substantially outperforms existing works while maintaining comparable accuracy on a variety of synthetic and real-world tensors.
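For readers unfamiliar with the baseline the abstract refers to, the following is a minimal sketch of classical truncated HOSVD-style Tucker decomposition using NumPy. It is not the paper's a-Tucker implementation: it uses explicit mode-n matricizations (the very step a-Tucker eliminates) and a fixed calculation order, and the function name `hosvd` and the rank tuple are illustrative choices, not from the paper.

```python
import numpy as np

def hosvd(X, ranks):
    """Truncated HOSVD: for each mode, take the leading left singular
    vectors of the mode-n unfolding, then form the core tensor via
    mode products with the factor transposes."""
    factors = []
    for n, r in enumerate(ranks):
        # explicit mode-n matricization (the conversion a-Tucker avoids)
        Xn = np.moveaxis(X, n, 0).reshape(X.shape[n], -1)
        U, _, _ = np.linalg.svd(Xn, full_matrices=False)
        factors.append(U[:, :r])
    # core G = X ×_1 U1^T ×_2 U2^T ... (mode products with transposes)
    G = X
    for n, U in enumerate(factors):
        G = np.moveaxis(np.tensordot(U.T, np.moveaxis(G, n, 0), axes=1), 0, n)
    return G, factors

# usage: decompose a random 3-way tensor, then reconstruct
X = np.random.rand(6, 7, 8)
G, Us = hosvd(X, (3, 3, 3))
Xhat = G
for n, U in enumerate(Us):
    Xhat = np.moveaxis(np.tensordot(U, np.moveaxis(Xhat, n, 0), axes=1), 0, n)
print(G.shape, Xhat.shape)  # → (3, 3, 3) (6, 7, 8)
```

The `reshape` in the loop is the explicit matricization whose data-layout conversion cost the paper targets; a-Tucker instead computes the factor updates directly on the tensor layout and additionally chooses the mode order and solver adaptively.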
Keywords
Tensor computation,Tucker decomposition,Higher-order singular value decomposition,Input-adaptive,Matricization-free,GPU computing