Universal Numerical Encoder and Profiler Reduces Computing's Memory Wall with Software, FPGA, and SoC Implementations

2013 Data Compression Conference (DCC) (2013)

Abstract
Summary form only given. Numerical computations have accelerated significantly since 2005 thanks to two complementary, silicon-enabled trends: multi-core processing and single instruction, multiple data (SIMD) accelerators. Unfortunately, due to fundamental limitations of physics, these two trends could not be accompanied by a corresponding increase in memory, storage, and I/O bandwidth. High-performance computing (HPC) is the proverbial “canary in the coal mine” of multi-core processing. When HPC hits a fundamental limit, consumer multi-core will likely encounter a similar limit in a few years. We describe the computationally efficient (Fig. 1b) and adaptive APplication AXceleration (APAX) numerical encoding method to reduce the memory wall for integer and floating-point operands. APAX achieves encoding rates between 3:1 and 10:1 without changing the dataset's statistical or spectral characteristics. APAX encoding takes advantage of three characteristics of all numerical sequences: peak-to-average ratio, oversampling, and effective number of bits (ENOB). Uncertainty quantification and spectral methods quantify the degree of uncertainty (accuracy) in numerical datasets. The APAX profiler creates a rate-correlation graph with a recommended operating point, provides 18 quantitative metrics comparing the original and decoded signals, and displays input and residual spectra with a residual histogram. On 24 integer and floating-point HPC datasets taken from climate, multi-physics, and seismic simulations, APAX averaged a 7.95:1 encoding ratio at a Pearson's correlation coefficient of 0.999948 and a spectral margin (input spectrum min - residual spectrum mean) of 24 dB. HPC scientists confirmed that APAX did not change HPC simulation results while reducing DRAM and disk transfers by 8x, accelerating HPC “time to results” by 20% to 50%.
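The two quality metrics quoted in the abstract, Pearson's correlation coefficient and spectral margin (input spectrum min - residual spectrum mean), can be illustrated with a minimal sketch. This is not the APAX encoder or profiler: a plain uniform quantizer stands in for the lossy codec, the function names (`pearson`, `power_spectrum_db`, `spectral_margin_db`) are hypothetical, and the dB normalization of the spectrum is an assumption, since the abstract does not specify one.

```python
import cmath
import math
import random


def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def power_spectrum_db(x):
    """Power spectrum in dB for bins 0..n/2 via a naive O(n^2) DFT.

    Assumed normalization: |X[k]|^2 / n. A tiny floor avoids log10(0).
    """
    n = len(x)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append(10 * math.log10(abs(acc) ** 2 / n + 1e-30))
    return spectrum


def spectral_margin_db(original, decoded):
    """Input spectrum minimum minus residual spectrum mean, in dB."""
    residual = [a - b for a, b in zip(original, decoded)]
    s_in = power_spectrum_db(original)
    s_res = power_spectrum_db(residual)
    return min(s_in) - sum(s_res) / len(s_res)


# Synthetic test signal (assumption): one tone plus broadband Gaussian noise.
rng = random.Random(0)
n = 256
original = [math.sin(2 * math.pi * 5 * t / n) + rng.gauss(0.0, 0.1) for t in range(n)]

# Stand-in lossy codec (NOT APAX): uniform quantization with step 0.01.
step = 0.01
decoded = [round(v / step) * step for v in original]

r = pearson(original, decoded)
margin = spectral_margin_db(original, decoded)
print(f"Pearson r = {r:.6f}, spectral margin = {margin:.1f} dB")
```

A coarser `step` in this sketch lowers both the correlation and the margin, which is the kind of rate-versus-quality trade-off a rate-correlation graph would plot.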
Keywords
corresponding increase,memory wall,HPC datasets,application acceleration,adaptive method,fundamental limitation,universal numerical encoder,profiler reduces computing,I/O bandwidth,SoC implementations,high-performance computing,numerical encoding,numerical computation,Pearson correlation coefficient,graph theory,field programmable gate arrays,HPC,multicore processing,ENOB,numerical analysis,multiphysics simulation,market research,uncertainty quantification,compression,system on chip,real time,high performance computing,acceleration,encoding,DRAM,effective number of bits,coal mine