Data Debugging and Exploration with Vizier
Proceedings of the 2019 International Conference on Management of Data(2019)
摘要
We present Vizier, a multi-modal data exploration and debugging tool. The system supports a wide range of operations by seamlessly integrating Python, SQL, and automated data curation and debugging methods. Using Spark as an execution backend, Vizier handles large datasets in multiple formats. Ease-of-use is attained through integration of a notebook with a spreadsheet-style interface and with visualizations that guide and support the user in the loop. In addition, native support for provenance and versioning enable collaboration and uncertainty management. In this demonstration we will illustrate the diverse features of the system using several realistic data science tasks based on real data.
更多查看译文
关键词
data cleaning, data curation, data debugging, data integration, data on-boarding, notebooks, provenance, spreadsheets, uncertainty, workflows
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要