Fast Quadruple Precision Arithmetic Library on Parallel Computer SR11000/J2

COMPUTATIONAL SCIENCE - ICCS 2008, PT 1(2008)

Abstract
This paper introduces fast quadruple precision arithmetic for the four basic operations and for multiply-add operations. The proposed methods provide a speed-up of up to 5 times over gcc 4.1.1 on the POWER 5+ processor used in the parallel computer SR11000/J2. We also developed a fast quadruple precision vector library optimized for the POWER 5 architecture. Quadruple precision numbers, represented by the 128-bit long double data type, are emulated with a pair of 64-bit double values on the POWER 5+ processor of the SR11000/J2, both with the Hitachi Optimizing Compiler and with gcc 4.1.1. Because the emulation must avoid rounding errors in each quadruple precision operation, it has a high computational cost. The proposed methods focus on optimizing the number of registers used and the instruction latency.
Keywords
basic operation,quadruple precision number,parallel computer sr11000,high computational cost,hitachi optimizing compiler,fast quadruple precision arithmetic,fast quadruple precision vector,double data type,quadruple precision arithmetic operation,parallel computer,data type,optimizing compiler
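To illustrate why emulating a quadruple precision value with a pair of doubles is costly, the following is a minimal sketch of pair-of-doubles addition built on the well-known TwoSum error-free transformation. It is not the paper's optimized implementation: the type name `dd_t` and the functions `two_sum`, `quick_two_sum`, and `dd_add` are hypothetical names chosen for this example, and the routine is the common "sloppy" variant, which does not guarantee a tight error bound for every input.

```c
/* Sketch only: a generic pair-of-doubles ("double-double") addition.
 * All names here are illustrative assumptions, not the paper's API. */

typedef struct {
    double hi; /* leading part of the value */
    double lo; /* trailing part, holds what hi cannot represent */
} dd_t;

/* TwoSum: returns s and e such that s + e equals a + b exactly,
 * where s is the rounded double sum. Costs 6 floating-point operations. */
static dd_t two_sum(double a, double b)
{
    dd_t r;
    double bb;
    r.hi = a + b;
    bb = r.hi - a;
    r.lo = (a - (r.hi - bb)) + (b - bb);
    return r;
}

/* Renormalization step; exact when |a| >= |b|. Costs 3 operations. */
static dd_t quick_two_sum(double a, double b)
{
    dd_t r;
    r.hi = a + b;
    r.lo = b - (r.hi - a);
    return r;
}

/* Adds two pair-of-doubles values: about 11 double operations
 * for what would be a single instruction in hardware. */
static dd_t dd_add(dd_t x, dd_t y)
{
    dd_t s = two_sum(x.hi, y.hi); /* exact sum of the leading parts */
    s.lo += x.lo + y.lo;          /* fold in the trailing parts */
    return quick_two_sum(s.hi, s.lo);
}
```

For example, adding 2^-60 to 1.0 in plain double arithmetic returns 1.0, whereas `dd_add` keeps the small term in the `lo` field. Each operation in the chain depends on the previous one, which is consistent with the abstract's emphasis on register usage and instruction latency.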