A Comparative Study of Perceptual Quality Metrics for Audio-driven Talking Head Videos
arXiv (2024)
Abstract
The rapid advancement of Artificial Intelligence Generated Content (AIGC)
technology has propelled audio-driven talking head generation, which has
attracted considerable research attention for its practical applications. However,
performance evaluation research lags behind the development of talking head
generation techniques. Existing literature relies on heuristic quantitative
metrics without human validation, hindering accurate progress assessment. To
address this gap, we collect talking head videos generated from four generative
methods and conduct controlled psychophysical experiments on visual quality,
lip-audio synchronization, and head movement naturalness. Our experiments
examine the consistency between model predictions and human annotations,
identifying metrics that align better with human opinions than widely-used
measures. We believe our work will facilitate performance evaluation and model
development, providing insights into AIGC in a broader context. Code and data
will be made available at https://github.com/zwx8981/ADTH-QA.
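
As an illustration of what checking consistency between a metric and human annotations can look like, the sketch below computes a Spearman rank correlation between per-video metric scores and mean human ratings. This is only a minimal sketch under stated assumptions: the abstract does not say which correlation measure the authors use, the helper names (`average_ranks`, `pearson`, `spearman`) are hypothetical, and the scores are made-up placeholders rather than data from the study.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson linear correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


# Placeholder values for illustration only (not from the paper).
human_scores = [4.2, 3.1, 2.5, 4.8, 1.9]        # assumed mean opinion scores
metric_scores = [0.81, 0.64, 0.70, 0.93, 0.40]  # assumed metric outputs

srcc = spearman(metric_scores, human_scores)
print(f"SRCC = {srcc:.3f}")
```

A value close to 1 would indicate that the metric ranks videos in nearly the same order as human raters; comparing this value across candidate metrics is one simple way to see which of them agrees better with human opinion.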