DiffGaze: A Diffusion Model for Continuous Gaze Sequence Generation on 360° Images
CoRR (2024)
Abstract
We present DiffGaze, a novel method for generating realistic and diverse
continuous human gaze sequences on 360° images based on a conditional
score-based denoising diffusion model. Generating human gaze on 360°
images is important for various human-computer interaction and computer
graphics applications, e.g., for creating large-scale eye tracking datasets or
for realistic animation of virtual humans. However, existing methods are
limited to predicting discrete fixation sequences or aggregated saliency maps,
thereby neglecting crucial parts of natural gaze behaviour. Our method uses
features extracted from 360° images as the condition and uses two transformers
to model the temporal and spatial dependencies of continuous human gaze. We
evaluate DiffGaze on two 360° image benchmarks for gaze sequence
generation as well as scanpath prediction and saliency prediction. Our
evaluations show that DiffGaze outperforms state-of-the-art methods on all
tasks on both benchmarks. We also report a 21-participant user study showing
that our method generates gaze sequences that are indistinguishable from real
human sequences.
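The abstract describes sampling continuous gaze sequences via a conditional denoising diffusion model conditioned on image features. The following is a minimal sketch of such a reverse-diffusion sampling loop, not the paper's actual implementation: the step count, noise schedule, feature dimensionality, and the `score_fn` placeholder (which in DiffGaze would be the two-transformer denoiser) are all assumptions for illustration.

```python
import numpy as np

# Hypothetical sketch of conditional diffusion sampling for gaze sequences.
# Hyperparameters below are illustrative, not taken from the paper.
T = 50                                  # number of diffusion steps (assumption)
betas = np.linspace(1e-4, 0.02, T)      # linear noise schedule (assumption)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def score_fn(x, t, cond):
    # Placeholder for the learned denoiser. In DiffGaze, temporal and
    # spatial transformers would predict the noise from the noisy gaze
    # sequence x, the diffusion step t, and the 360° image features cond.
    return np.zeros_like(x)

def sample_gaze(cond, seq_len=100, rng=np.random.default_rng(0)):
    """Reverse diffusion: start from Gaussian noise, iteratively denoise."""
    x = rng.standard_normal((seq_len, 2))   # one 2-D gaze point per time step
    for t in reversed(range(T)):
        eps = score_fn(x, t, cond)          # predicted noise at step t
        coef = betas[t] / np.sqrt(1.0 - alpha_bars[t])
        x = (x - coef * eps) / np.sqrt(alphas[t])   # DDPM posterior mean
        if t > 0:                                   # add noise except at t = 0
            x = x + np.sqrt(betas[t]) * rng.standard_normal(x.shape)
    return x

gaze = sample_gaze(cond=np.zeros(16))
print(gaze.shape)  # (100, 2)
```

With a trained denoiser in place of the zero-returning placeholder, each call would yield a different plausible gaze trajectory for the same image, which is how a diffusion model can produce the diverse sequences the abstract claims.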