The long short-term memory based on i-vector extraction for conversational speech gender identification approach

Barlian Henryranu Prasetio,Hiroki Tamura,Koichi Tanno

Artificial Life and Robotics（2020）

引用 4|浏览2

暂无评分

摘要

Stress causes a speaker’s voice characteristics to be changed. Emotional stress alters a person’s speech pattern such that it is distributed non-normally along the temporal dimension. Thus, the methods for identifying the gender of a non-stressed speaker were no longer effective in recognizing the gender of a speaker in stressful conditions. To address this issue, a new gender identification framework is proposed. We leveraged i-vector for capturing gender information on each speech segment. Then the long short-term memory dynamically handled all speech temporal context features and learned the long-term dependency from the input. We evaluated the effectiveness, in terms of accuracy and the number of iterations to saturate, of the proposed method by comparing it with the baseline methods in their respective abilities to identify the speaker’s gender from conversations with different durations. By learning the gender information encoded in long-term dependencies, our proposed method outperforms the baseline methods and is able to correctly identify the speaker’s gender in all conversation types.

查看译文

关键词

Gender recognition,MFCC,LTSM,i-Vector,Conversational speech

AI 理解论文

溯源树

样例

生成溯源树，研究论文发展脉络

Chat Paper

正在生成论文摘要