Optimizing Adaptive Video Streaming with Human Feedback

CoRR(2023)

Abstract
Quality of Experience (QoE)-driven adaptive bitrate (ABR) algorithms are typically optimized using QoE models based on the mean opinion score (MOS). Such models may not account for user heterogeneity in rating scales, which can result in unexpected behaviors. In this paper, we propose Jade, which leverages reinforcement learning with human feedback (RLHF) to better align with users' opinion scores. Jade's rank-based QoE model considers the relative values of user ratings to interpret the subjective perception of video sessions. We implement linear-based and Deep Neural Network (DNN)-based architectures to satisfy both accuracy and generalization ability. We further propose entropy-aware reinforced mechanisms for training policies that integrate the proposed QoE models. Experimental results demonstrate that Jade performs favorably on conventional metrics, such as quality and stall ratio, and improves QoE by 8.09%-38.13% under different network conditions, emphasizing the importance of user heterogeneity in QoE modeling and the potential of combining linear-based and DNN-based models for performance improvement.
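The abstract does not give the form of the rank-based QoE model, so the following is only an illustrative sketch of the general idea of learning from relative rather than absolute ratings. It assumes a linear QoE score over two hypothetical features (bitrate in Mbps, stall time in seconds) and a Bradley-Terry-style pairwise logistic loss, which is a common choice for reward models in RLHF but is not confirmed here as Jade's loss. All function names and data are made up for illustration.

```python
import math

def pairs_from_ratings(sessions):
    """Turn one user's (features, rating) list into (preferred, other) pairs.

    Only the order of ratings within a single user is used, so users with
    different rating scales can produce identical pairs.
    """
    pairs = []
    for feat_a, rate_a in sessions:
        for feat_b, rate_b in sessions:
            if rate_a > rate_b:
                pairs.append((feat_a, feat_b))
    return pairs

def linear_qoe(weights, features):
    """Hypothetical linear QoE score: weighted sum of session features."""
    return sum(w * x for w, x in zip(weights, features))

def pairwise_rank_loss(weights, preferred, other):
    """Pairwise logistic loss: -log(sigmoid(score(preferred) - score(other)))."""
    margin = linear_qoe(weights, preferred) - linear_qoe(weights, other)
    return math.log1p(math.exp(-margin))

def train(pairs, n_features, lr=0.1, epochs=200):
    """Plain stochastic gradient descent on the pairwise loss."""
    weights = [0.0] * n_features
    for _ in range(epochs):
        for preferred, other in pairs:
            margin = linear_qoe(weights, preferred) - linear_qoe(weights, other)
            grad_scale = -1.0 / (1.0 + math.exp(margin))  # d(loss)/d(margin)
            for i in range(n_features):
                weights[i] -= lr * grad_scale * (preferred[i] - other[i])
    return weights

# Made-up sessions: (bitrate_mbps, stall_seconds) with each user's own ratings.
user_a = [((3.0, 0.0), 5), ((1.0, 0.0), 3), ((3.0, 4.0), 1)]  # uses the full scale
user_b = [((3.0, 0.0), 3), ((1.0, 0.0), 2), ((3.0, 4.0), 1)]  # compressed scale

pairs = pairs_from_ratings(user_a) + pairs_from_ratings(user_b)
weights = train(pairs, n_features=2)
```

In this toy data the two users rate on different scales but agree on the ordering, so both contribute the same preference pairs. A model fit to raw MOS values would instead see conflicting targets for the same session.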
Keywords
adaptive video streaming, human feedback