Statistical quality of experience analysis: on planning the sample size and statistical significance testing.

JOURNAL OF ELECTRONIC IMAGING(2018)

引用 27|浏览11
暂无评分
摘要
This paper analyzes how an experimenter can balance errors in subjective video quality tests between the statistical power of finding an effect if it is there and not claiming that an effect is there if the effect is not there, i.e., balancing Type I and Type II errors. The risk of committing Type I errors increases with the number of comparisons that are performed in statistical tests. We will show that when controlling for this and at the same time keeping the power of the experiment at a reasonably high level, it is unlikely that the number of test subjects that are normally used and recommended by the International Telecommunication Union (ITU), i.e., 15 is sufficient but the number used by the Video Quality Experts Group (VQEG), i.e., 24 is more likely to be sufficient. Examples will also be given for the influence of Type I error on the statistical significance of comparing objective metrics by correlation. We also present a comparison between parametric and nonparametric statistics. The comparison targets the question whether we would reach different conclusions on the statistical difference between the video quality ratings of different video clips in a subjective test, based on the comparison between the student T-test and the Mann-Whitney U-test. We found that there was hardly a difference when few comparisons are compensated for, i.e., then almost the same conclusions are reached. When the number of comparisons is increased, then larger and larger differences between the two methods are revealed. In these cases, the parametric T-test gives clearly more significant cases, than the nonparametric test, which makes it more important to investigate whether the assumptions are met for performing a certain test. (C) 2018 SPIE and IS&T.
更多
查看译文
关键词
Type-I error,video quality,statistical significance,quality of experience,Student T-test,Bonferroni,Mann-Whitney U-test,parametric versus nonparametric test
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要