Reversing statistics for scalable test databases generation.

SIGMOD/PODS'13: International Conference on Management of Data, New York, New York, June 2013

Abstract
Testing the performance of database systems is commonly accomplished using synthetic data and workload generators such as TPC-H and TPC-DS. Customer data and workloads are hard to obtain due to their sensitive nature and prohibitively large sizes. As a result, data management systems are often not properly tested before release, and performance-related bugs are commonly discovered after deployment, when the cost of fixing them is very high. In this paper we propose RSGen, an approach to generating datasets from customer metadata, including schema, integrity constraints, and statistics. RSGen enables generation of data that closely matches the customer environment, and it is fast, scalable, and extensible.
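To make the idea of generating data from statistics concrete, the following minimal sketch produces a column whose values reproduce the per-bucket row counts of a histogram. It is an illustration only and is not RSGen's algorithm, which the abstract does not describe. The histogram layout (integer buckets given as low, high, row count) and the function names generate_column and bucket_counts are assumptions made for this example.

```python
import random

def generate_column(histogram, seed=0):
    """Generate integer values that match a histogram's per-bucket row counts.

    histogram: list of (low, high, row_count) tuples with inclusive,
    non-overlapping integer bounds (a hypothetical format for this sketch).
    """
    rng = random.Random(seed)  # fixed seed so the output is reproducible
    values = []
    for low, high, row_count in histogram:
        # Draw exactly row_count values uniformly inside the bucket.
        values.extend(rng.randint(low, high) for _ in range(row_count))
    rng.shuffle(values)  # avoid emitting rows grouped by bucket
    return values

def bucket_counts(values, histogram):
    """Recompute the per-bucket counts from generated values."""
    return [sum(1 for v in values if low <= v <= high)
            for low, high, _ in histogram]

# Hypothetical statistics for one column: three buckets, 1000 rows total.
histogram = [(0, 9, 500), (10, 99, 300), (100, 999, 200)]
column = generate_column(histogram)
```

Recomputing bucket_counts(column, histogram) returns the original row counts, which is the sense in which the generated data matches the source statistics. A realistic generator would also have to honor the schema and integrity constraints, which this sketch ignores.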