Radoop: Analyzing Big Data with RapidMiner and Hadoop

semanticscholar(2011)

Abstract
Working with large data sets is increasingly common in research and industry. Distributed data analytics solutions such as Hadoop offer high scalability and fault tolerance, but they usually lack a user interface, so only developers can exploit their functionality. In this paper, we present Radoop, an extension for the RapidMiner data mining tool that provides easy-to-use operators for running distributed processes on Hadoop. We describe integration and development details and provide runtime measurements for several data transformation tasks. We conclude that Radoop is an excellent tool for big data analytics and that it scales well as both the data set size and the number of nodes in the cluster increase.