Effect of Spectrogram Preprocessing and Enhancement on Speaker Recognition Performance

2023 International Conference on Science, Engineering and Business for Sustainable Development Goals (SEB-SDG) (2023)

Abstract
Preprocessing in biometrics is the process of tuning input data, or features extracted from that data, by applying various techniques to improve recognition performance. Most of the speaker recognition literature does not explain how the preprocessing parameters used were chosen. This work arrives at such decisions systematically by carrying out several experiments on the preprocessing and enhancement of spectrograms and measuring the effect on speaker recognition performance with a Convolutional Neural Network. In the first part, preprocessing experiments were carried out in which one preprocessing parameter at a time was varied while the others were held constant; the best parameters for all the preprocessing methods were then combined to produce a spectrogram that yielded the best accuracy. The second part consists of enhancement experiments, in which a series of image improvement techniques were applied to the spectrograms to further improve the initial accuracy. With the right parametric combination of preprocessing techniques, speaker recognition improved by 25%. Spectrogram enhancements improved performance by a further 1%. Experimental results revealed that the dimensionality of a spectrogram can be significantly reduced with a negligible drop in overall performance, which can reduce storage and computational requirements.
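The one-parameter-at-a-time procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the parameter names (`n_fft`, `hop_length`, `window`), their candidate values, and the `toy_score` function are all hypothetical stand-ins. In a real experiment, `evaluate` would build spectrograms with the given settings, train the CNN, and return validation accuracy.

```python
from typing import Any, Callable, Dict, List


def one_at_a_time_search(
    grid: Dict[str, List[Any]],
    baseline: Dict[str, Any],
    evaluate: Callable[[Dict[str, Any]], float],
) -> Dict[str, Any]:
    """Vary one parameter in turn, holding the others at their baseline
    values, and return the best value found for each parameter."""
    best: Dict[str, Any] = {}
    for name, candidates in grid.items():
        scores = {}
        for value in candidates:
            trial = dict(baseline)   # all other parameters stay constant
            trial[name] = value      # only this parameter changes
            scores[value] = evaluate(trial)
        best[name] = max(scores, key=scores.get)
    return best


def toy_score(params: Dict[str, Any]) -> float:
    """Hypothetical stand-in for 'train the CNN and return accuracy'.
    The numbers here are arbitrary and are NOT results from the paper."""
    score = -abs(params["n_fft"] - 512) / 512
    score -= abs(params["hop_length"] - 128) / 128
    score += 0.1 if params["window"] == "hann" else 0.0
    return score


# Hypothetical search space and starting configuration.
grid = {
    "n_fft": [256, 512, 1024],
    "hop_length": [64, 128, 256],
    "window": ["hann", "hamming"],
}
baseline = {"n_fft": 256, "hop_length": 64, "window": "hamming"}

# Combine the per-parameter winners into one final configuration.
combined = one_at_a_time_search(grid, baseline, toy_score)
print(combined)
```

A sweep of this kind needs far fewer runs than a full grid search (the sum of the candidate counts rather than their product), but it assumes the parameters do not interact strongly, so the combined configuration should itself be evaluated once at the end.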
Keywords
convolutional neural network,preprocessing,post-processing,speaker recognition