Improving Efficiency in Facial Recognition Tasks Through a Dataset Optimization Approach

IEEE Access (2024)

Abstract
It is well known that the performance of Machine Learning techniques, notably when applied to Computer Vision (CV), depends heavily on the amount and quality of the training data. However, large datasets lead to time-consuming training loops and, in many situations, are difficult or even impossible to create. There is therefore a need for solutions that reduce dataset size while ensuring good levels of performance, i.e., solutions that obtain the best tradeoff between the amount/quality of training data and the model's performance. This paper proposes a dataset reduction approach for training data used in Deep Learning methods for Facial Recognition (FR) problems. We focus on maximizing the variability of representations for each subject (person) in the training data, thus favoring "quality" over size. The main research questions are: 1) Which facial features better discriminate different identities? 2) Is it possible to significantly reduce the training time without compromising performance? 3) Should we favor "quality" over quantity for very large datasets in FR? The analysis uses a pipeline to identify a set of features suitable for capturing diversity, and cluster-based sampling to select the best images for each training subject, i.e., person. Results were obtained using VGGFace2 for training and Labeled Faces in the Wild for benchmarking, and show that, with the proposed approach, data reduction is possible while ensuring similar levels of accuracy.
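The cluster-based sampling mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which features or which clustering algorithm the authors use, so everything below is an assumption: plain k-means stands in for the clustering step, each image of a subject is represented by a small hypothetical feature vector, and the names `kmeans` and `select_representatives` are invented for illustration. The idea shown is only the general one: group a subject's images by feature similarity and keep one image per group, so that the retained images cover distinct appearances.

```python
import random
from math import dist  # Euclidean distance (Python 3.8+)


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means (an assumed stand-in); returns the centroids."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: dist(p, centroids[c]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (keep it if empty).
        for c, members in enumerate(clusters):
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def select_representatives(features, k, seed=0):
    """Return sorted indices of up to k images, one nearest to each centroid."""
    if len(features) <= k:
        return list(range(len(features)))
    centroids = kmeans(features, k, seed=seed)
    chosen = set()
    for centroid in centroids:
        idx = min(range(len(features)), key=lambda i: dist(features[i], centroid))
        chosen.add(idx)
    return sorted(chosen)


# Hypothetical 2-D features for six images of one subject: two tight groups,
# e.g. two distinct poses. Real features would be higher-dimensional.
features = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
            (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
kept = select_representatives(features, k=2)
print(kept)  # indices of the images retained for training
```

With `k=2`, the six images are reduced to two, one drawn from each group. Choosing `k` per subject is the knob that trades dataset size against diversity; how the paper sets it is not stated in the abstract.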
Keywords
Datasets, feature engineering, deep learning, data reduction, facial recognition