A Transformer-Based Model for the Prediction of Human Gaze Behavior on Videos
arXiv (2024)
Abstract
Eye-tracking applications that utilize the human gaze in video understanding
tasks have become increasingly important. To effectively automate the process
of video analysis based on eye-tracking data, it is important to accurately
replicate human gaze behavior. However, this task presents significant
challenges due to the inherent complexity and ambiguity of human gaze patterns.
In this work, we introduce a novel method for simulating human gaze behavior.
Our approach uses a transformer-based reinforcement learning algorithm to train
an agent that acts as a human observer, with the primary role of watching
videos and simulating human gaze behavior. We use an eye-tracking dataset
gathered from videos generated by the VirtualHome simulator, with a primary
focus on activity recognition. Our experimental results demonstrate the
effectiveness of our gaze prediction method by highlighting its capability to
replicate human gaze behavior and its applicability to downstream tasks where
real human gaze is used as input.