FIST-HOSVD: fused in-place sequentially truncated higher order singular value decomposition.
Platform for Advanced Scientific Computing Conference (PASC) (2022)
Abstract
In this paper, we present several novel methods for improving the memory locality of the Sequentially Truncated Higher Order Singular Value Decomposition (ST-HOSVD) algorithm for computing the Tucker decomposition. We show how the two primary computational kernels of the ST-HOSVD can be fused into a single kernel to significantly improve memory locality. We then extend matrix tiling techniques to tensors to further improve cache utilization. This block-based approach is coupled with a novel in-place transpose algorithm that drastically reduces the memory requirements of the algorithm by overwriting the original tensor with the result. We demonstrate the effectiveness of our approach by comparing the multi-threaded performance of our optimized ST-HOSVD algorithm with TuckerMPI, a state-of-the-art ST-HOSVD implementation, in compressing two combustion simulation datasets. We demonstrate up to a ~135x reduction in auxiliary memory consumption, thereby increasing the problem size that can be computed for a given memory allocation by up to ~3x, whilst maintaining comparable runtime performance.
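For orientation, the baseline algorithm that the abstract sets out to optimize can be sketched in a few lines of NumPy. This is a minimal, unfused, out-of-place illustration and not the FIST-HOSVD method itself: it performs no kernel fusion, tiling, or in-place transposition. It assumes that the "two primary computational kernels" are the Gram matrix of each mode unfolding and the tensor-times-matrix (TTM) product, as ST-HOSVD is commonly described; the function names `unfold`, `st_hosvd`, and `reconstruct` are hypothetical and chosen only for this sketch.

```python
import numpy as np


def unfold(tensor, mode):
    """Mode-n unfolding: bring `mode` to the front and flatten the other modes."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def st_hosvd(tensor, ranks):
    """Baseline ST-HOSVD sketch (assumed Gram + TTM formulation).

    Returns the truncated core tensor and one factor matrix per mode.
    Every step allocates new arrays, which is the auxiliary memory cost
    that the abstract reports reducing.
    """
    core = np.asarray(tensor, dtype=float)
    factors = []
    for mode, rank in enumerate(ranks):
        moved = np.moveaxis(core, mode, 0)
        rest_shape = moved.shape[1:]
        unfolded = moved.reshape(moved.shape[0], -1)
        # Kernel 1: Gram matrix of the current mode unfolding.
        gram = unfolded @ unfolded.T
        # eigh returns eigenvalues in ascending order, so reverse the
        # eigenvector columns and keep the leading `rank` of them.
        _, vecs = np.linalg.eigh(gram)
        u = vecs[:, ::-1][:, :rank]
        factors.append(u)
        # Kernel 2: TTM, projecting this mode onto the retained subspace.
        projected = (u.T @ unfolded).reshape((rank,) + rest_shape)
        # The truncated result is the input for the next mode
        # (this is the "sequentially truncated" part).
        core = np.moveaxis(projected, 0, mode)
    return core, factors


def reconstruct(core, factors):
    """Expand a Tucker core back to full size by applying each factor."""
    out = core
    for mode, u in enumerate(factors):
        moved = np.moveaxis(out, mode, 0)
        rest_shape = moved.shape[1:]
        expanded = u @ moved.reshape(moved.shape[0], -1)
        out = np.moveaxis(expanded.reshape((u.shape[0],) + rest_shape), 0, mode)
    return out
```

In this sketch the Gram matrix and the TTM each read the unfolded tensor separately, and the result of every mode is written to a fresh array. Those two properties are, as far as the abstract states, what kernel fusion and the in-place transpose are intended to avoid.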