Improving imbalanced industrial datasets to enhance the accuracy of mechanical property prediction and process optimization for strip steel

Feifei Li, Anrui He, Yong Song, Chengzhe Shen, Fenjia Wang, Tieheng Yuan, Shiwei Zhang, Xiaoqing Xu, Yi Qiang, Chao Liu, Pengfei Liu, Qiangguo Zhao

Journal of Intelligent Manufacturing (2023)

Abstract
Imbalanced regression is prevalent in intelligent manufacturing systems and significantly constrains the industrial application of machine learning models. Existing research has overlooked the impact of redundant data and has failed to exploit the valuable information in unlabeled data, which limits model effectiveness. To this end, we propose a novel model framework (sNN-ST, similarity-based nearest neighbor and Self-Training fusion) to address imbalanced regression in industrial big data. The approach comprises two main steps. First, we identify and remove redundant samples by analyzing the redundancy relationships among samples. Then, we perform pseudo-labeling on unlabeled data, selectively incorporating reliable and non-redundant samples into the labeled dataset. We validate the proposed method on two imbalanced regression datasets. Removing redundant data and effectively utilizing unlabeled data optimize the dataset's distribution and enhance its information entropy. Consequently, the processed dataset significantly improves overall model performance. We used this model to conduct a Multi-Parameter Global Relative Sensitivity Analysis within a production system. This analysis optimized existing process parameters and improved product quality consistency. This research presents a promising approach to addressing imbalanced regression problems.
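The two steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' sNN-ST implementation: the abstract does not give the redundancy or reliability criteria, so the sketch assumes a sample is redundant when a kept sample lies within `radius` in feature space and within `tol` in target value, and that a pseudo-label is reliable when the targets of its `k` nearest labeled neighbors have a standard deviation of at most `max_spread`. All function and parameter names here are hypothetical.

```python
import math
from statistics import mean, pstdev


def prune_redundant(X, y, radius, tol):
    """Greedy pruning (assumed criterion): drop a sample when an already
    kept sample is within `radius` in features and `tol` in target."""
    kept_X, kept_y = [], []
    for xi, yi in zip(X, y):
        redundant = any(
            math.dist(xi, xk) <= radius and abs(yi - yk) <= tol
            for xk, yk in zip(kept_X, kept_y)
        )
        if not redundant:
            kept_X.append(xi)
            kept_y.append(yi)
    return kept_X, kept_y


def knn_estimate(X, y, q, k):
    """Return (mean target, target std, nearest distance) over the
    k labeled samples closest to the query point q."""
    order = sorted(range(len(X)), key=lambda i: math.dist(X[i], q))[:k]
    targets = [y[i] for i in order]
    return mean(targets), pstdev(targets), math.dist(X[order[0]], q)


def self_train(X, y, U, k, max_spread, radius):
    """One pass of pseudo-labeling: accept an unlabeled point only if its
    neighbors agree (reliable) and it is not too close to a labeled
    sample (non-redundant). Both tests are assumptions of this sketch."""
    X, y = list(X), list(y)
    for u in U:
        pred, spread, nearest = knn_estimate(X, y, u, k)
        if spread <= max_spread and nearest > radius:
            X.append(u)
            y.append(pred)
    return X, y


# Toy data: one near-duplicate labeled sample and three unlabeled points.
X = [(0.0,), (0.05,), (1.0,), (2.0,)]
y = [0.0, 0.02, 1.0, 5.0]
U = [(0.5,), (1.0,), (1.5,)]

Xp, yp = prune_redundant(X, y, radius=0.1, tol=0.1)
Xs, ys = self_train(Xp, yp, U, k=2, max_spread=1.0, radius=0.1)
print(Xp, yp)
print(Xs, ys)
```

In this toy run, the labeled point at 0.05 is pruned as a near-duplicate of the point at 0.0. Of the unlabeled points, 0.5 is accepted, 1.0 is rejected because it coincides with a labeled sample, and 1.5 is rejected because its two nearest neighbors disagree strongly on the target.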
Keywords
Imbalanced regression, Process optimization, Sample pruning, Self-training, Semi-supervised learning