Two-Phase Data Warehouse Optimized For Data Mining

BUSINESS INTELLIGENCE FOR THE REAL-TIME ENTERPRISES (2007)

Abstract
We propose a new, heterogeneous data warehouse architecture in which a first-phase traditional relational OLAP warehouse coexists with second-phase data stored in compressed form and optimized for data mining. Aggregations and metadata for the entire time frame are stored in the first-phase relational database. The main advantage of the second phase is its reduced I/O requirement, which enables very high-throughput processing by sequential, read-only data stream algorithms. It becomes feasible to run speed-optimized queries and data mining operations on the entire time frame of the most granular data. The second phase also enables long-term data storage and analysis using a very efficient compressed format at low storage cost, even for historical data. The proposed architecture fits existing data warehouse solutions. We show the effectiveness of the two-phase data warehouse through a case study of a large web portal.
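
The second-phase idea of scanning compressed, granular data with a sequential, read-only pass can be illustrated with a minimal sketch. This is not the authors' implementation: the gzip format, the tab-separated (user, page) record layout, and the function names `write_compressed_log` and `stream_page_counts` are all assumptions chosen for illustration.

```python
import gzip
import io
from collections import Counter


def write_compressed_log(records):
    """Serialize (user, page) records as gzip-compressed, tab-separated lines.

    Hypothetical stand-in for the compressed second-phase store.
    """
    buf = io.BytesIO()
    with gzip.open(buf, "wt", encoding="utf-8") as out:
        for user, page in records:
            out.write(f"{user}\t{page}\n")
    return buf.getvalue()


def stream_page_counts(blob):
    """Count page views in one sequential, read-only pass over the compressed data.

    Lines are decompressed and consumed one at a time, so the full
    uncompressed log is never held in memory.
    """
    counts = Counter()
    with gzip.open(io.BytesIO(blob), "rt", encoding="utf-8") as stream:
        for line in stream:
            _user, page = line.rstrip("\n").split("\t")
            counts[page] += 1
    return counts


# Hypothetical sample of granular web-portal click records.
records = [("u1", "/home"), ("u2", "/home"), ("u1", "/news"), ("u3", "/home")]
blob = write_compressed_log(records)
counts = stream_page_counts(blob)
print(counts.most_common(1))  # [('/home', 3)]
```

In this sketch the compressed blob plays the role of the second phase, while an aggregate such as `counts` is the kind of summary that the abstract places in the first-phase relational database.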
Keywords
Data Mining, Data Warehouse, Data Cube, Data Mining Algorithm, Frequent Itemset Mining