Re-identification Robustness Over Time - The Case for Synthetic Training Data.

International Conference on Machine Learning and Applications (2023)

Abstract
This study proposes the use of synthetic data to make the training and testing of re-identification models more robust in the context of logistics. In particular, it considers image generation and re-identification of pallet blocks. First, synthetic data is generated by multiple small shifts through the latent feature space of a Generative Adversarial Network. Because the shifts are performed along specialized directions in the latent feature space, the resulting images depict the same image identity while offering slightly modified visual features, such as thickened branding patterns or different lighting. For the generated images to be appropriate for training and testing re-identification models, the modifications must not destroy the image identity. Therefore, a threshold based on the Multi-Scale Structural Similarity Index is applied. The artificial data is then used alongside real data to train re-identification models, with the goal of improving robustness while reducing the need for real data. To create common ground for pallet block re-identification, the dataset pallet-block-98382_3270 is released alongside this study. The dataset consists of 280,763 images of 98,382 real wooden Euro-pallet blocks and an additional 42,510 artificially generated images of 3,270 pallet block identities, which were selected as suitable for the training and testing of re-identification models. Additionally, the dataset contains 143,000 unfiltered artificially generated images. The use of additional artificial data during training increases the mean Average Precision by up to 1.3%, indicating that real data can be replaced or supplemented by synthetic data. Even where it does not increase the mean Average Precision, the use of synthetic data can be recommended, as it increases the re-identification models' robustness across different testing datasets. These insights are crucial in industrial contexts, where data acquisition is costly and limited.
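The generate-then-filter idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: `toy_generator` is a hypothetical stand-in for a trained GAN generator, the shift direction is random rather than a specialized latent direction, and `global_ssim` is a single-window SSIM used as a simplified substitute for the Multi-Scale Structural Similarity Index. The step size and threshold are arbitrary illustration values.

```python
import numpy as np


def global_ssim(x, y, data_range=1.0):
    """Single-window SSIM over the whole image.

    Assumption: a simplified stand-in for MS-SSIM, which would evaluate
    local windows at several image scales.
    """
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (vx + vy + c2)
    )


def shift_and_filter(generate, z, direction, step, n_steps, threshold):
    """Take small steps along `direction` in latent space and keep only the
    images whose similarity to the unshifted image stays above `threshold`."""
    reference = generate(z)
    unit = direction / np.linalg.norm(direction)
    kept = []
    for k in range(1, n_steps + 1):
        candidate = generate(z + k * step * unit)
        if global_ssim(reference, candidate) >= threshold:
            kept.append(candidate)
    return reference, kept


# Hypothetical toy generator: a fixed linear map followed by a sigmoid,
# producing an 8x8 "image" with values in (0, 1). A real GAN goes here.
rng = np.random.default_rng(0)
weights = rng.normal(size=(64, 16))


def toy_generator(z):
    return (1.0 / (1.0 + np.exp(-(weights @ z)))).reshape(8, 8)


z0 = rng.normal(size=16)        # latent code of one identity
d = rng.normal(size=16)         # assumed direction; random here
reference, kept = shift_and_filter(
    toy_generator, z0, d, step=0.05, n_steps=20, threshold=0.9
)
print(f"kept {len(kept)} of 20 shifted images")
```

In this sketch, candidates that drift too far from the reference image are discarded, which mirrors the role the similarity threshold plays in preserving image identity.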