Image Watermarking for Machine Learning Datasets

Proceedings of the 2nd ACM Data Economy Workshop, Dec 2023

Abstract
Machine learning has received increasing attention over the last decade owing to its success in classification problems across almost every application domain. The amount of data available for training plays a crucial role in building a machine-learning model, yet gathering that data is tedious and time-consuming. In many cases, developers rely on publicly available datasets, which are not always of high quality. Recently, a data market paradigm has emerged in which valuable datasets are sold. Thus, once a dataset is created or bought, it is necessary to protect it against illegal use or (re)sale and to establish intellectual property rights. In this paper, we investigate whether well-studied image watermarking techniques, specifically those based on Singular Value Decomposition (SVD), can be applied to datasets for classification algorithms without degrading the quality of the dataset. To this end, we chose the watermarking technique described in [8] and applied it to a machine-learning dataset. We provide experimental results on the robustness of the scheme. Our results show that the watermark embedding scheme provides decent imperceptibility and robustness against update, zero-out, and insertion attacks, but it is not successful against deletion attacks. We believe our work can inspire researchers who want to apply well-studied image watermarking techniques to machine learning datasets.
Keywords
Watermarking, machine learning, data ownership, data economy