Advancing Video Synchronization with Fractional Frame Analysis: Introducing a Novel Dataset and Model

Yuxuan Liu, Haizhou Ai, Junliang Xing, Xuri Li, Xiaoyi Wang, Pin Tao

AAAI 2024 (2024)

Abstract
Multiple views play a vital role in 3D pose estimation tasks. Ideally, multi-view 3D pose estimation should directly use naturally collected videos. However, due to the constraints of video synchronization, existing methods often rely on expensive hardware to synchronize camera start times, which restricts most 3D pose collection to indoor settings. Some recent works train deep neural networks to align desynchronized datasets derived from synchronized cameras, but they can only achieve frame-level accuracy. For fractional-frame video synchronization, this work proposes an Inter-Frame and Intra-Frame Desynchronized Dataset (IFID), which labels fractional time intervals between two video clips. IFID is the first dataset to annotate both inter-frame and intra-frame intervals, with a total of 382,500 annotated video clips, making it the largest dataset of its kind to date. We also develop a novel Transformer-based model, named InSynFormer, for inter-frame and intra-frame synchronization. Extensive experimental evaluations demonstrate its promising performance. The dataset and source code of the model are available at https://github.com/yuxuan-cser/InSynFormer.
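To make the notion of fractional-frame (intra-frame) desynchronization concrete, below is a minimal sketch, not the authors' pipeline, of how a pair of clips offset by a real-valued number of frames could be generated from a single video: the integer part of the offset shifts the second clip, and the fractional part is synthesized by linear interpolation between adjacent frames. The function name `make_desynced_pair` and its parameters are illustrative assumptions, not part of IFID's released code.

```python
import numpy as np

def make_desynced_pair(frames: np.ndarray, clip_len: int, offset: float, start: int = 0):
    """Hypothetical helper: build two clips separated by a fractional-frame offset.

    frames: (T, H, W, C) video array
    offset: real-valued shift in frames, e.g. 2.25 (2 inter-frame + 0.25 intra-frame)
    Returns (clip_a, clip_b, offset), where offset serves as the regression label.
    """
    clip_a = frames[start:start + clip_len]

    base = start + int(np.floor(offset))        # integer (inter-frame) part
    frac = offset - np.floor(offset)            # fractional (intra-frame) part

    left = frames[base:base + clip_len].astype(np.float32)
    right = frames[base + 1:base + 1 + clip_len].astype(np.float32)
    clip_b = (1.0 - frac) * left + frac * right  # blend adjacent frames for the sub-frame shift

    return clip_a, clip_b.astype(frames.dtype), offset

# Usage sketch:
# video = np.random.randint(0, 255, (100, 64, 64, 3), dtype=np.uint8)
# a, b, label = make_desynced_pair(video, clip_len=16, offset=2.25)
```

A model such as InSynFormer would then be trained to regress the labeled offset from the clip pair; the actual interpolation and annotation scheme used to build IFID may differ from this simple linear blend.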
Keywords
CV: Video Understanding & Activity Analysis, CV: 3D Computer Vision