Managing Data Quality, Transformations, And Loading Using An Xml-Driven Engine

Bill Savage, Ying Qin, Michal Zmuda, David Hall, Laxminarayana Ganapathi, Sheping Li, Atif Hasan, Thomas Baker

WMSCI 2008: 12TH WORLD MULTI-CONFERENCE ON SYSTEMICS, CYBERNETICS AND INFORMATICS, VOL I, PROCEEDINGS (2008)

Abstract
Computer systems commonly require data to be loaded into a database from external sources. Incoming data must be examined to ensure that they meet acceptable quality levels, are complete, and conform to required formats. Data must often be integrated from multiple sources, which increases the complexity of the process. Data validation and loading processes have traditionally been referred to as Extract-Transform-Load (ETL), also known as data integration. In this paper we present a new data integration system developed to assemble, validate, and load data, with support for complex and diverse validation requirements. This reusable, customizable system offers improvements over many systems in several areas. The system is based on Extensible Markup Language (XML) constructs that define data structures in use by multiple submitting sources (data dictionaries), complex validation rules, mappings to target objects, and target object structures. Java program components implement a self-contained data integration engine that can run as a standalone system or be embedded within a web site.
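To make the idea of XML-defined data dictionaries and validation rules concrete, the following is a minimal sketch in Java, not the system described in the paper. The abstract does not give the XML schema or any class names, so everything here is an assumption for illustration: the `dictionary` and `field` elements, the `name`, `required`, and `pattern` attributes, and the `XmlRuleSketch`, `FieldRule`, `loadRules`, and `validate` identifiers. Only standard JDK XML and regex classes are used.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class XmlRuleSketch {

    // Hypothetical data dictionary: element and attribute names are
    // illustrative assumptions, not the format used by the paper's engine.
    static final String DICTIONARY_XML =
          "<dictionary>"
        + "<field name=\"id\" required=\"true\" pattern=\"[0-9]+\"/>"
        + "<field name=\"state\" required=\"false\" pattern=\"[A-Z]{2}\"/>"
        + "</dictionary>";

    // One validation rule per dictionary field (hypothetical structure).
    static final class FieldRule {
        final String name;
        final boolean required;
        final Pattern pattern;

        FieldRule(String name, boolean required, Pattern pattern) {
            this.name = name;
            this.required = required;
            this.pattern = pattern;
        }
    }

    // Parse the XML dictionary into a list of rules.
    static List<FieldRule> loadRules(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        NodeList nodes = doc.getElementsByTagName("field");
        List<FieldRule> rules = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            Element e = (Element) nodes.item(i);
            rules.add(new FieldRule(
                e.getAttribute("name"),
                Boolean.parseBoolean(e.getAttribute("required")),
                Pattern.compile(e.getAttribute("pattern"))));
        }
        return rules;
    }

    // Check one incoming record for completeness and format conformance.
    static List<String> validate(Map<String, String> record, List<FieldRule> rules) {
        List<String> errors = new ArrayList<>();
        for (FieldRule rule : rules) {
            String value = record.get(rule.name);
            if (value == null || value.isEmpty()) {
                if (rule.required) {
                    errors.add(rule.name + ": missing required value");
                }
            } else if (!rule.pattern.matcher(value).matches()) {
                errors.add(rule.name + ": value does not match expected format");
            }
        }
        return errors;
    }

    public static void main(String[] args) throws Exception {
        List<FieldRule> rules = loadRules(DICTIONARY_XML);
        Map<String, String> record = new LinkedHashMap<>();
        record.put("id", "12a");   // fails the numeric pattern
        record.put("state", "NC"); // passes
        System.out.println(validate(record, rules));
    }
}
```

Because the rules live in XML rather than in Java code, a new submitting source could in principle be supported by supplying a different dictionary document without recompiling the engine. The mappings to target objects and the loading step mentioned in the abstract are outside the scope of this sketch.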
Keywords
Data Dictionary, Data Integration, Data Validation, ETL, Java, XML