Cache locality-centric parallel string matching on many-core accelerator chips

Periodicals(2015)

引用 2|浏览6
暂无评分
摘要
AbstractAho-Corasick (AC) algorithm is a multiple patterns string matching algorithm commonly used in computer and network security and bioinformatics, among many others. In order to meet the highly demanding computational requirements imposed on these applications, achieving high performance for the AC algorithm is crucial. In this paper, we present a high performance parallelization of the AC on the many-core accelerator chips such as the Graphic Processing Unit (GPU) from Nvidia and the Intel Xeon Phi. Our parallelization approach significantly improves the cache locality of the AC by partitioning a given set of string patterns into multiple smaller sets of patterns in a space-efficient way. Using the multiple pattern sets, intensive pattern matching operations are concurrently conducted with respect to the whole input text data. Compared with the previous approaches where the input data is partitioned amongst multiple threads instead of partitioning the pattern set, our approach significantly improves the performance. Experimental results show that our approach leads up to 2.73 times speedup on the Nvidia K20 GPU and 2.00 times speedup on the Intel Xeon Phi compared with the previous approach. Our parallel implementation delivers up to 693 Gbps throughput performance on the K20.
更多
查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要