LIFA: Language identification from audio with LPCC-G features

Multimedia Tools and Applications (2023)

Abstract
Speech recognition-based technologies are considerably more developed in Western countries than in the countries of the South Asian subcontinent, such as India. India is a multilingual country (22 scheduled languages) with a population of over 1.3 billion, a major percentage of whom have difficulty with the user interfaces of modern technology, so speech recognition tools are very useful. In this paper, we propose LIFA (Language Identification From Audio), a fully automated tool that can identify the spoken language (phrases/words) and invoke the language-specific recognition engine. Experiments were performed on more than 2200 hours of data from the top 11 spoken languages in India. The clips were parameterized with novel linear predictive cepstral coefficient (LPCC)-based features, which we call LPCC-Grade (LPCC-G). The proposed feature focuses on the distribution of energy across different frequency ranges in an audio clip for better classification while avoiding high-dimensionality issues. Using a random forest-based classifier, we achieved the highest accuracy of 99.01%; accuracies of 96.37% and 92.48% were obtained for LSF and MFCC-based features, respectively.
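For readers unfamiliar with LPCC, the following is a minimal sketch of how conventional LPCC values can be computed for one audio frame: autocorrelation, the Levinson-Durbin recursion for the LPC coefficients, then the standard LPC-to-cepstrum recursion. It is not the LPCC-G feature, whose grading step the abstract does not specify, and the function names, the prediction order, and the number of cepstral coefficients are illustrative assumptions. Per-clip vectors derived this way could then be passed to a random forest classifier.

```python
# Conventional LPCC for a single frame (NOT the paper's LPCC-G, which the
# abstract does not define). All names and defaults here are hypothetical.

def autocorrelation(frame, max_lag):
    """Return r[0..max_lag], the biased autocorrelation of the frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve for predictor coefficients a_1..a_p with s[n] ~ sum a_k s[n-k]."""
    a = [0.0] * (order + 1)  # a[0] is unused; a[1..order] hold the result
    err = r[0]               # prediction error energy
    for i in range(1, order + 1):
        if err <= 0.0:       # silent or degenerate frame: stop early
            break
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err        # reflection coefficient
        prev = a[:]
        a[i] = k
        for j in range(1, i):
            a[j] = prev[j] - k * prev[i - j]
        err *= (1.0 - k * k)
    return a[1:]


def lpc_to_cepstrum(a, n_ceps):
    """Convert LPC coefficients to cepstral coefficients c_1..c_n_ceps.

    Uses c_n = a_n + sum_{k} (k/n) * c_k * a_{n-k}, with a_n = 0 for n > p.
    """
    p = len(a)
    c = [0.0] * (n_ceps + 1)  # c[0] is unused
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k] * a[n - k - 1]
        c[n] = acc
    return c[1:]


def lpcc(frame, order=10, n_ceps=12):
    """LPCC vector for one frame (order and n_ceps are assumed defaults)."""
    r = autocorrelation(frame, order)
    return lpc_to_cepstrum(levinson_durbin(r, order), n_ceps)
```

A real pipeline would also need framing, windowing, and some aggregation of frame-level vectors into one vector per clip; those choices are left open here because the abstract does not describe them.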
Keywords
Language identification, LPCC-G, Random forest, Indian spoken language