Learning Spatio-Temporal Relations with Multi-Scale Integrated Perception for Video Anomaly Detection

ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2024

Abstract
In weakly supervised video anomaly detection, it has been verified that anomaly predictions can be biased by background noise. Previous works attempted to focus on local regions to exclude irrelevant information. However, abnormal events in different scenes vary in size, and current methods struggle to consider local events of different scales concurrently. To this end, we propose a multi-scale integrated perception (MSIP) learning approach that perceives abnormal regions of different scales simultaneously. In our method, a frame is partitioned into several groups of patches with varying scales, and a multi-scale patch spatial relation (MPSR) module is proposed to model the inconsistencies among multi-scale patches. Specifically, we design a hierarchical graph convolution block in the MPSR module that improves the integration of patch features through cross-scale feature learning. An existing clip temporal relation network is also introduced to enable spatio-temporal encoding in our model. Experiments show that our method achieves new state-of-the-art performance on the ShanghaiTech benchmark and competitive results on the UCF-Crime benchmark.
Keywords
video anomaly detection, multi-scale perception, weakly supervised, spatio-temporal relation
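
The abstract's first step, partitioning a frame into several groups of patches with varying scales, can be illustrated with a minimal NumPy sketch. The function name `partition_multi_scale`, the grid sizes (1, 2, 4), and the choice of non-overlapping patches are assumptions made for illustration only; the abstract does not state the scales or patch layout actually used, and the sketch does not cover the MPSR module or the graph convolution block.

```python
import numpy as np


def partition_multi_scale(frame, grid_sizes=(1, 2, 4)):
    """Split an (H, W, C) frame into one group of patches per grid size.

    Hypothetical helper: for grid size g, the frame is cut into a g x g grid
    of non-overlapping patches, so larger g means smaller patches.
    """
    height, width = frame.shape[:2]
    groups = {}
    for g in grid_sizes:
        if height % g or width % g:
            raise ValueError(f"frame size {height}x{width} not divisible by {g}")
        ph, pw = height // g, width // g
        # Split rows into (g, ph) and columns into (g, pw), then move the two
        # grid axes next to each other and flatten them into a patch index.
        patches = (
            frame.reshape(g, ph, g, pw, -1)
            .swapaxes(1, 2)
            .reshape(g * g, ph, pw, -1)
        )
        groups[g] = patches
    return groups


# Toy 8 x 8 RGB-like frame with distinct values at every position.
frame = np.arange(8 * 8 * 3).reshape(8, 8, 3)
groups = partition_multi_scale(frame)
for g, patches in groups.items():
    print(g, patches.shape)
```

Each group holds the same frame content at a different granularity, which is the kind of input a cross-scale module could compare. Patches within a group are indexed row by row, so patch `k` of grid `g` covers grid cell `(k // g, k % g)`.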