Video synopsis method based on interaction determination using human pose estimation

Chenwu Wang, Kaixuan Yang, Pei Wang, Zhixiang Zhu, Lei Zhang, Fudan Wang

Journal of Electronic Imaging (2024)

Abstract
Video synopsis condenses long, temporally redundant videos into a more compact form by rearranging the active objects in the video. However, because each object belongs to its own tube, shifting objects during tube rearrangement can disrupt the interactions between them. To address this issue, we propose a video synopsis optimization method based on interaction determination using human pose estimation. First, interactions between objects are determined from two cues: the distance between objects in a bird's-eye view, computed with inverse perspective mapping, and the body orientation, derived from the coordinates of human skeletal points produced by a human pose estimation model. Second, an interaction energy cost term is introduced; penalizing this term during optimization preserves the interactions of the original video in the synopsis video. Finally, we propose a fusion optimization algorithm (FSOSA) that combines snake optimization and simulated annealing to minimize the energy cost function and achieve optimal tube rearrangement. Experimental results demonstrate that the proposed method effectively preserves the interactions between objects while improving the convergence speed.
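The interaction determination step can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the decision rule, so the sketch assumes that two people interact when they are within a distance threshold on the ground plane and each faces the other within an angular tolerance. The function names, the shoulder-based orientation estimate, the right-handed ground frame, and the thresholds `max_dist` and `max_angle_deg` are all hypothetical choices made for illustration.

```python
import math


def to_ground(H, u, v):
    """Map pixel (u, v) to ground-plane coordinates with a 3x3 homography H.

    H is assumed to come from a prior inverse perspective mapping calibration.
    """
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    return (x / w, y / w)


def facing_direction(left_shoulder, right_shoulder):
    """Unit facing vector, taken perpendicular to the shoulder line.

    Assumes ground-plane points in a right-handed frame seen from above,
    so rotating the left-to-right shoulder vector by +90 degrees points forward.
    """
    sx = right_shoulder[0] - left_shoulder[0]
    sy = right_shoulder[1] - left_shoulder[1]
    norm = math.hypot(sx, sy)
    if norm == 0.0:
        raise ValueError("shoulder points coincide")
    return (-sy / norm, sx / norm)


def angle_between(a, b):
    """Angle in degrees between two non-zero 2D vectors."""
    dot = a[0] * b[0] + a[1] * b[1]
    norm = math.hypot(a[0], a[1]) * math.hypot(b[0], b[1])
    cos = max(-1.0, min(1.0, dot / norm))  # clamp against rounding error
    return math.degrees(math.acos(cos))


def is_interacting(pos_a, dir_a, pos_b, dir_b, max_dist=2.0, max_angle_deg=45.0):
    """Hypothetical rule: close enough AND each person faces the other."""
    ab = (pos_b[0] - pos_a[0], pos_b[1] - pos_a[1])
    dist = math.hypot(ab[0], ab[1])
    if dist == 0.0 or dist > max_dist:
        return False
    ba = (-ab[0], -ab[1])
    return (angle_between(dir_a, ab) <= max_angle_deg
            and angle_between(dir_b, ba) <= max_angle_deg)
```

Under these assumptions, two people standing one unit apart and facing each other are classified as interacting, while the same pair is rejected if one of them turns away or if they stand farther apart than `max_dist`. A pairwise decision of this kind is what an interaction energy term could then penalize when interacting tubes are shifted apart in time.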
Keywords
video synopsis, interaction, tube rearrangement, human pose estimation