Data Debugging and Exploration with Vizier

Proceedings of the 2019 International Conference on Management of Data (2019)

Abstract
We present Vizier, a multi-modal data exploration and debugging tool. The system supports a wide range of operations by seamlessly integrating Python, SQL, and automated data curation and debugging methods. Using Spark as an execution backend, Vizier handles large datasets in multiple formats. Ease of use is attained by integrating a notebook with a spreadsheet-style interface and with visualizations that guide and support the user in the loop. In addition, native support for provenance and versioning enables collaboration and uncertainty management. In this demonstration, we will illustrate the diverse features of the system using several realistic data science tasks based on real data.
Keywords
data cleaning, data curation, data debugging, data integration, data on-boarding, notebooks, provenance, spreadsheets, uncertainty, workflows