On the Use of Ensemble X-Vector Embeddings for Improved Sleepiness Detection.


X-vectors, the state-of-the-art approach in speaker recognition, have been adopted in several computational paralinguistic tasks, as they were shown to extract embeddings that can be used efficiently as features in the subsequent classification or regression step. Nevertheless, like all neural networks, x-vector extractors may be sensitive to several training meta-parameters, such as the number of hidden layers and neurons or the number of training epochs. In this study we experimentally demonstrate that the performance of x-vector embeddings is also affected by the random seed used for weight initialization before training. We also show that, by building an ensemble from repeated x-vector DNN training runs, we can make the utterance-level predictions more robust, leading to notable improvements in performance on the test set. We perform our experiments on the publicly available Düsseldorf Sleepy Language Corpus, estimating the degree of sleepiness. Improving upon our previous results, we present the highest Spearman's correlation coefficient achieved on this dataset by a single method.
Human-computer interaction, Computational paralinguistics, X-vectors, Ensemble learning, Sleepiness detection
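
The ensembling idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how the per-run predictions are combined, so plain averaging is an assumption here, not necessarily the authors' method. The data are synthetic stand-ins (no x-vector network is trained), and the helper names `rank`, `spearman` and `ensemble_predict` are hypothetical.

```python
import numpy as np


def rank(values):
    """Ordinal ranks 0..n-1; assumes no ties (reasonable for continuous scores)."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values)
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(len(values))
    return ranks


def spearman(a, b):
    """Spearman's correlation, computed as the Pearson correlation of the ranks."""
    return float(np.corrcoef(rank(a), rank(b))[0, 1])


def ensemble_predict(per_seed_predictions):
    """Average utterance-level predictions over models trained with different seeds.

    Averaging is an assumed combination rule, chosen for illustration.
    """
    return np.mean(np.asarray(per_seed_predictions, dtype=float), axis=0)


# Synthetic stand-in: each "model" is the target plus independent noise,
# mimicking the run-to-run variation caused by different random seeds.
rng = np.random.default_rng(0)
n_utterances, n_seeds = 200, 5
true_scores = rng.uniform(1.0, 9.0, size=n_utterances)  # hypothetical target scores
per_seed = [true_scores + rng.normal(0.0, 3.0, size=n_utterances)
            for _ in range(n_seeds)]

ensemble = ensemble_predict(per_seed)
single_rhos = [spearman(p, true_scores) for p in per_seed]
ensemble_rho = spearman(ensemble, true_scores)

print("single-model Spearman:", [round(r, 3) for r in single_rhos])
print("ensemble Spearman:", round(ensemble_rho, 3))
```

In this toy setting the noise of the individual runs is independent, so averaging reduces it and the ensemble correlates more strongly with the target than a typical single run. Real x-vector models trained with different seeds share data and architecture, so their errors are correlated and the gain may be smaller.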