Extensible Parallel Query Processing for Exploratory Geoscientific Data Mining

Data Mining and Knowledge Discovery (2001)

Abstract
Exploratory data mining and analysis requires a computing environment that supports the user-friendly expression and rapid execution of “scientific queries.” In this paper, we address research issues in parallelizing scientific queries that contain complex user-defined operations. In a parallel query execution environment, parallelizing a query execution plan involves determining how the input data streams to the evaluators that implement logical operations can be divided so that clones of the same evaluator can process them in parallel. We introduce the concept of a “relevance window,” which characterizes the data lineage and data partitioning opportunities available for a user-defined evaluator. In addition, we develop a query parallelization framework that extends relational parallel query optimization algorithms so that the parallelization characteristics of user-defined evaluators guide query parallelization in an extensible query processing environment. We demonstrate the utility of our system through experiments that mine cyclonic activity, blocking events, and upward wave-energy propagation features from several observational and model simulation datasets.
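The partitioning idea can be illustrated with a minimal sketch. The abstract does not define the relevance window formally, so the code below assumes one plausible reading: each output item of an evaluator depends only on a bounded neighborhood of input items, and an input stream can therefore be split among clones as long as every partition is padded by that neighborhood. The evaluator `moving_mean` and the helpers `partition_with_halo`, `run_clone`, and `parallel_moving_mean` are hypothetical names chosen for illustration and are not taken from the paper.

```python
from concurrent.futures import ThreadPoolExecutor

def moving_mean(values, half_width):
    """Hypothetical user-defined evaluator: centered mean over +/- half_width items."""
    out = []
    for i in range(len(values)):
        lo = max(0, i - half_width)
        hi = min(len(values), i + half_width + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out

def partition_with_halo(n, parts, half_width):
    """Split indices 0..n-1 into core ranges, each padded by the assumed window."""
    bounds = [n * k // parts for k in range(parts + 1)]
    chunks = []
    for start, stop in zip(bounds, bounds[1:]):
        lo = max(0, start - half_width)   # extra input needed on the left
        hi = min(n, stop + half_width)    # extra input needed on the right
        chunks.append((start, stop, lo, hi))
    return chunks

def run_clone(values, chunk, half_width):
    """One clone evaluates its padded slice, then keeps only its core outputs."""
    start, stop, lo, hi = chunk
    local = moving_mean(values[lo:hi], half_width)
    return local[start - lo:stop - lo]

def parallel_moving_mean(values, half_width, parts=4):
    """Run clones of the same evaluator over padded partitions and concatenate."""
    chunks = partition_with_halo(len(values), parts, half_width)
    with ThreadPoolExecutor(max_workers=parts) as pool:
        pieces = pool.map(lambda c: run_clone(values, c, half_width), chunks)
        return [x for piece in pieces for x in piece]
```

Because each clone sees every input item its core outputs depend on, the concatenated result matches a single serial run of the evaluator. Threads are used here only to keep the sketch short; in CPython they do not give real speedup for CPU-bound work, so a process pool or a distributed runtime would be the more realistic choice.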
Keywords
parallel query processing, extensible user-defined operations, geoscientific data mining, cyclone, blocking events, upward wave-energy propagation