Personalized Audio Quality Preference Prediction

Chung-Che Wang, Yu-Chun Lin, Yu-Teng Hsu, Jyh-Shing Roger Jang

2023 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2023

Abstract
This paper proposes using both audio input and subject information to predict a listener's personalized preference between two audio segments that have the same content but differ in quality. A siamese network compares the two inputs and predicts the preference. Several structures for each side of the siamese network are investigated. The baseline structure, which uses only audio information, consists of a pretrained audio encoder followed by fully connected layers. Among the proposed structures, concatenating subject information with the audio embedding before feeding it into the fully connected layers yields the largest gain over the baseline, raising overall accuracy from 77.56% to 78.04% (0.48 percentage points). Experimental results also show that using the complete set of subject information (age, gender, and headphone/earphone specifications such as impedance, frequency response range, and sensitivity) is more effective than using a subset of it.
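
The concatenation structure described in the abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: the layer sizes, the random weights, the scalar score per branch, and the sigmoid of the score difference used to compare the two sides are all assumptions made for illustration, and the embeddings below stand in for the output of a pretrained audio encoder. The subject vector is a hypothetical, pre-normalized encoding of age, gender, impedance, frequency response range, and sensitivity.

```python
import math
import random


def make_branch(audio_dim, subject_dim, hidden_dim, seed=0):
    """Create weights for one branch; both sides of the siamese pair share them."""
    rng = random.Random(seed)
    in_dim = audio_dim + subject_dim
    return {
        "w1": [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
               for _ in range(hidden_dim)],
        "b1": [0.0] * hidden_dim,
        "w2": [rng.uniform(-0.1, 0.1) for _ in range(hidden_dim)],
        "b2": 0.0,
    }


def branch_score(params, audio_emb, subject):
    """Concatenate audio embedding and subject info, then apply two FC layers."""
    x = list(audio_emb) + list(subject)  # the concatenation step
    hidden = [
        max(0.0, sum(w * v for w, v in zip(row, x)) + b)  # ReLU (assumed)
        for row, b in zip(params["w1"], params["b1"])
    ]
    return sum(w * h for w, h in zip(params["w2"], hidden)) + params["b2"]


def predict_preference(params, emb_a, emb_b, subject):
    """Return an estimate of P(subject prefers A over B).

    Assumption: the two branch scores are compared through a sigmoid of
    their difference; the abstract does not specify the comparison head.
    """
    diff = branch_score(params, emb_a, subject) - branch_score(params, emb_b, subject)
    return 1.0 / (1.0 + math.exp(-diff))


# Hypothetical inputs: 8-dim placeholder embeddings for two quality versions
# of the same content, plus 6 normalized subject features
# (age, gender, impedance, freq. response low, freq. response high, sensitivity).
rng = random.Random(42)
emb_a = [rng.uniform(-1.0, 1.0) for _ in range(8)]
emb_b = [rng.uniform(-1.0, 1.0) for _ in range(8)]
subject = [0.3, 1.0, 0.32, 0.02, 0.9, 0.75]

params = make_branch(audio_dim=8, subject_dim=6, hidden_dim=16)
p_a_over_b = predict_preference(params, emb_a, emb_b, subject)
p_b_over_a = predict_preference(params, emb_b, emb_a, subject)
print(p_a_over_b, p_b_over_a)
```

Because the two sides share weights and are compared through a score difference, swapping the inputs flips the prediction, so the two printed probabilities sum to 1. Dropping the subject vector from the concatenation (setting subject_dim to 0) gives the shape of the audio-only baseline.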