Loki: Streamlining Integration and Enrichment

semanticscholar(2020)

Abstract
Data scientists frequently transform data from one form to another while cleaning, integrating, and enriching datasets. Writing such transformations, or “mapping functions”, is time-consuming and often involves significant code re-use. Unfortunately, when every dataset is slightly different from the last, finding the right mapping functions to re-use can be just as difficult as starting from scratch. In this paper, we propose “Link Once and Keep It” (Loki), a system that consists of a repository of datasets and mapping functions. Loki uses this repository to relate new datasets to datasets it already knows about, helping data scientists quickly locate and re-use mapping functions built for previous datasets. Loki represents a first step towards building and re-using repositories of domain-specific data integration pipelines.

HILDA’20, June 14–19, 2020, Portland, OR, USA. ACM ISBN 978-1-4503-8022-5/20/06. https://doi.org/10.1145/3398730.3399198
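The repository idea in the abstract can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the abstract does not say how Loki relates datasets, so column-name Jaccard overlap stands in for that step, and the names `KnownDataset`, `Repository`, `suggest`, and the sample datasets are hypothetical rather than part of Loki.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set, Tuple


@dataclass
class KnownDataset:
    """A dataset already in the repository (hypothetical structure)."""
    name: str
    columns: Set[str]
    # Mapping functions previously written for this dataset, keyed by a label.
    mappings: Dict[str, Callable[[dict], dict]] = field(default_factory=dict)


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Column-name overlap; an assumed stand-in for the relatedness measure."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


class Repository:
    """Stores known datasets and suggests mapping functions for new ones."""

    def __init__(self) -> None:
        self.datasets: List[KnownDataset] = []

    def add(self, dataset: KnownDataset) -> None:
        self.datasets.append(dataset)

    def suggest(
        self, columns: Set[str], threshold: float = 0.5
    ) -> List[Tuple[str, str, float]]:
        """Return (dataset name, mapping label, score) tuples, best score first."""
        hits = []
        for ds in self.datasets:
            score = jaccard(columns, ds.columns)
            if score >= threshold:
                hits.extend((ds.name, label, score) for label in ds.mappings)
        return sorted(hits, key=lambda h: h[2], reverse=True)


repo = Repository()
repo.add(KnownDataset(
    name="taxi_2019",
    columns={"pickup_time", "dropoff_time", "fare_usd"},
    mappings={
        "fare_to_cents": lambda row: {**row, "fare_cents": round(row["fare_usd"] * 100)},
    },
))
repo.add(KnownDataset(
    name="weather",
    columns={"station", "temp_c"},
    mappings={"noop": lambda row: row},
))

# A new dataset that shares three of its four columns with taxi_2019.
suggestions = repo.suggest({"pickup_time", "dropoff_time", "fare_usd", "tip_usd"})
print(suggestions)
```

In this sketch the new dataset is matched to `taxi_2019` and not to `weather`, so only the mapping function written for the taxi data is offered for re-use. A real system would likely compare values and types as well as column names.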