Controllable Tabular Data Synthesis Using Diffusion Models

Proceedings of the ACM on Management of Data (2024)

Abstract
Controllable tabular data synthesis plays a crucial role in numerous applications by allowing users to generate synthetic data with specific conditions. These conditions can include synthesizing tuples with predefined attribute values or creating tuples that exhibit a particular correlation with an external table. However, existing approaches lack the flexibility to support new conditions and can be time-consuming when dealing with multiple conditions. To overcome these limitations, we propose a novel approach that leverages diffusion models to first learn an unconditional generative model. Subsequently, we introduce lightweight controllers to guide the unconditional generative model in generating synthetic data that satisfies different conditions. The primary research challenge lies in effectively supporting controllability using lightweight solutions while ensuring the realism of the synthetic data. To address this challenge, we design an unconditional diffusion model tailored specifically for tabular data. Additionally, we propose a new sampling method that enables correlation-aware controls throughout the data generation process. We conducted extensive experiments across various applications for controllable tabular data synthesis, which show that our approach outperforms the state-of-the-art methods.
Keywords
controllable data synthesis, diffusion model, tabular data synthesis