HC-CART: A parallel system implementation of data mining classification and regression tree (CART) algorithm on a multi-FPGA system

TACO(2013)

引用 66|浏览14
暂无评分
摘要
Data mining is a new field of computer science with a wide range of applications. Its goal is to extract knowledge from massive datasets in a human-understandable structure, for example, the decision trees. In this article we present an innovative, high-performance, system-level architecture for the Classification And Regression Tree (CART) algorithm, one of the most important and widely used algorithms in the data mining area. Our proposed architecture exploits parallelism at the decision variable level, and was fully implemented and evaluated on a modern high-performance reconfigurable platform, the Convey HC-1 server, that features four FPGAs and a multicore processor. Our FPGA-based implementation was integrated with the widely used “rpart” software library of the R project in order to provide the first fully functional reconfigurable system that can handle real-world large databases. The proposed system, named HC-CART system, achieves a performance speedup of up to two orders of magnitude compared to well-known single-threaded data mining software platforms, such as WEKA and the R platform. It also outperforms similar hardware systems which implement parts of the complete application by an order of magnitude. Finally, we show that the HC-CART system offers higher performance speedup than some other proposed parallel software implementations of decision tree construction algorithms.
更多
查看译文
关键词
similar hardware system,data mining,decision tree construction algorithm,regression tree,data mining area,multi-fpga system,functional reconfigurable system,hc-cart system,data mining classification,proposed system,decision tree,decision variable level,parallel system implementation,proposed architecture,high performance computing
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要