Recursive MaxSquare: Cache-friendly, Parallel, Scalable in situ Rectangular Matrix Transposition

Claudio A. Parra, Travis Yu, Kyu Seon Yum, Arturo Garza, Isaac D. Scherson

International Conference on Computational Science (2020)

Abstract
An in situ rectangular matrix transposition algorithm is presented based on recursively partitioning an original rectangular matrix into a maximum size square matrix and a remaining rectangular sub-matrix. To transpose the maximum size square sub-matrix, a novel cache-friendly, parallel (multithreaded) and scalable in-place square matrix transposition procedure is proposed: it requires a total of Θ(n²/2) simple memory swaps, a single-element temporary storage per thread, and does not make use of complex index arithmetic in the main transposition loop. Recursion is used to transpose the remaining rectangular sub-matrix. Dubbed Recursive MaxSquare, the novel proposed rectangular matrix in-place transposition algorithm uses a generalization of the perfect shuffle/unshuffle data permutation to stitch together the recursively transposed square matrices. The shuffle/unshuffle permutations are shown to be efficiently decomposed using basic vector/segment swaps, exchanges and/or cyclic shifts (rotations). A balanced parallel cycles-based transposition is also proposed for comparison.
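As an illustration of the square-block step the abstract describes, the following is a minimal sketch (an assumption, not the authors' implementation): an in-place transpose of an n×n row-major block using simple pairwise swaps above the diagonal, a single-element temporary, and no complex index arithmetic in the main loop, giving n(n-1)/2 = Θ(n²/2) swaps. The cache-friendly blocking, multithreading, recursive MaxSquare partitioning and shuffle/unshuffle stitching are not reproduced here.

```c
#include <stdio.h>

/* Sketch only: in-place transpose of an n x n row-major matrix.
 * Each iteration performs one simple swap using a single temporary
 * element; the diagonal is left untouched. */
static void transpose_square_inplace(double *a, size_t n) {
    for (size_t i = 0; i < n; ++i) {
        for (size_t j = i + 1; j < n; ++j) {
            double tmp = a[i * n + j];      /* single-element temporary */
            a[i * n + j] = a[j * n + i];
            a[j * n + i] = tmp;
        }
    }
}

int main(void) {
    double m[3 * 3] = {1, 2, 3,
                       4, 5, 6,
                       7, 8, 9};
    transpose_square_inplace(m, 3);
    for (size_t i = 0; i < 3; ++i) {
        for (size_t j = 0; j < 3; ++j)
            printf("%4.0f", m[i * 3 + j]);
        printf("\n");            /* prints 1 4 7 / 2 5 8 / 3 6 9 */
    }
    return 0;
}
```

For a rectangular m×n matrix, the paper's Recursive MaxSquare approach would first transpose the leading max(m,n)-constrained square block in place as above, recurse on the remaining rectangular sub-matrix, and then stitch the results with the generalized shuffle/unshuffle permutation; that stitching step is the paper's contribution and is not sketched here.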
Keywords
rectangular matrix transposition, shuffle, unshuffle, cache-friendly, multicore, multithread