Wavelet-Driven Spatiotemporal Predictive Learning: Bridging Frequency and Time Variations

Xuesong Nie, Yunfeng Yan, Siyuan Li, Cheng Tan, Xi Chen, Haoyuan Jin, Zhihang Zhu, Stan Z. Li, Donglian Qi

AAAI 2024 (2024)

Abstract
Spatiotemporal predictive learning is a paradigm that empowers models to learn spatial and temporal patterns by predicting future frames from past frames in an unsupervised manner. Methods of this kind typically use recurrent units to capture long-term dependencies, but these units often come with high computational costs and limited performance in real-world scenes. This paper presents an innovative Wavelet-based SpatioTemporal (WaST) framework, which extracts and adaptively controls both low- and high-frequency components at the image and feature levels via a 3D discrete wavelet transform for faster processing while maintaining high-quality predictions. We propose a Time-Frequency Aware Translator uniquely crafted to efficiently learn short- and long-range spatiotemporal information by individually modeling spatial frequency and temporal variations. Meanwhile, we design a wavelet-domain High-Frequency Focal Loss that effectively supervises high-frequency variations. Extensive experiments across various real-world scenarios, such as driving scene prediction, traffic flow prediction, human motion capture, and weather forecasting, demonstrate that our proposed WaST achieves state-of-the-art performance over various spatiotemporal prediction methods.
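
To make the 3D discrete wavelet transform mentioned in the abstract concrete, the following is a minimal sketch of a one-level decomposition of a video clip into low- and high-frequency sub-bands. It is not the WaST implementation: the Haar wavelet, the (T, H, W) array layout, and the function names `haar_split` and `haar_dwt3d` are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import numpy as np

def haar_split(x, axis):
    """One-level orthonormal Haar split along one axis (its length must be even)."""
    x = np.moveaxis(x, axis, 0)
    even, odd = x[0::2], x[1::2]
    low = (even + odd) / np.sqrt(2.0)   # scaled pairwise sum: low-frequency part
    high = (even - odd) / np.sqrt(2.0)  # scaled pairwise difference: high-frequency part
    return np.moveaxis(low, 0, axis), np.moveaxis(high, 0, axis)

def haar_dwt3d(clip):
    """Decompose a (T, H, W) clip into 8 sub-bands.

    Keys are three letters, one per axis in (T, H, W) order:
    'L' = low-pass, 'H' = high-pass. Layout and naming are assumptions.
    """
    bands = {"": clip.astype(np.float64)}
    for axis in range(3):
        split = {}
        for key, arr in bands.items():
            low, high = haar_split(arr, axis)
            split[key + "L"] = low
            split[key + "H"] = high
        bands = split
    return bands

# Hypothetical input: 4 frames of 8x8 pixels.
rng = np.random.default_rng(0)
clip = rng.random((4, 8, 8))
bands = haar_dwt3d(clip)

# Each sub-band is half the size along every axis.
print(sorted(bands))
print(bands["LLL"].shape)
```

Under these assumptions, `LLL` holds a coarse approximation of the clip at half the resolution along time, height, and width, while the seven remaining sub-bands hold detail along one or more axes. Processing these smaller sub-bands rather than the full-resolution clip is one plausible reading of the "faster processing" claim, and the detail sub-bands are the kind of signal a wavelet-domain high-frequency loss could supervise.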
Keywords
CV: Video Understanding & Activity Analysis, ML: Unsupervised & Self-Supervised Learning