AMP: Autoregressive Motion Prediction Revisited with Next Token Prediction for Autonomous Driving
arXiv (2024)
Abstract
As an essential task in autonomous driving (AD), motion prediction aims to
predict the future states of surrounding objects for navigation. One natural
solution is to estimate the position of other agents in a step-by-step manner
where each predicted time-step is conditioned on both observed time-steps and
previously predicted time-steps, i.e., autoregressive prediction. Pioneering
works like SocialLSTM and MFP design their decoders based on this intuition.
However, almost all state-of-the-art works assume that all predicted time-steps
are conditionally independent given the observed time-steps, and use a single
linear layer to generate the positions of all time-steps simultaneously. They
dominate most motion prediction leaderboards due to the simplicity of training
MLPs compared to autoregressive networks.
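
To make the contrast between the two decoding styles concrete, here is a minimal sketch in NumPy. It is not the paper's model: the weights `W_oneshot` and `W_step`, the sliding-window conditioning, and all sizes are hypothetical stand-ins chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
T_OBS, T_FUT, D = 4, 3, 2  # observed steps, future steps, (x, y); assumed sizes

# Hypothetical random weights standing in for trained parameters.
W_oneshot = rng.normal(size=(T_OBS * D, T_FUT * D)) * 0.1
W_step = rng.normal(size=(T_OBS * D, D)) * 0.1

def predict_one_shot(observed):
    """Emit all future positions at once from a single linear layer."""
    return (observed.reshape(-1) @ W_oneshot).reshape(T_FUT, D)

def predict_autoregressive(observed):
    """Predict one step at a time, feeding each prediction back as input."""
    history = list(observed)
    for _ in range(T_FUT):
        # The window mixes observed and previously predicted positions.
        window = np.stack(history[-T_OBS:]).reshape(-1)
        history.append(history[-1] + window @ W_step)  # last position + displacement
    return np.stack(history[T_OBS:])

# Toy straight-line history: (1,1), (2,2), (3,3), (4,4).
observed = np.cumsum(np.ones((T_OBS, D)), axis=0)
one_shot = predict_one_shot(observed)
rollout = predict_autoregressive(observed)
```

In the one-shot variant no future step can see another, whereas in the rollout the second predicted position already depends on the first.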
In this paper, we introduce GPT-style next token prediction into motion
forecasting. In this way, the input and output could be represented in a
unified space and thus the autoregressive prediction becomes more feasible.
However, unlike language data, which is composed of homogeneous units
(words), the elements in the driving scene could have complex spatial-temporal
and semantic relations. To this end, we propose to adopt three factorized
attention modules with different neighbors for information aggregation and
different position encoding styles to capture their relations, e.g., encoding
the transformation between coordinate systems for spatial relativity while
adopting RoPE for temporal relativity. Empirically, equipped with these
tailored designs, the proposed method achieves state-of-the-art
performance on the Waymo Open Motion and Waymo Interaction datasets. Notably,
AMP outperforms other recent autoregressive motion prediction methods, namely
MotionLM and StateTransformer, which demonstrates the effectiveness of the proposed
designs.
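
As a brief illustration of why RoPE suits temporal relativity, the sketch below implements a standard rotary position embedding in NumPy and checks that the attention score between a query and a key depends only on their time offset. The feature size, the base of 10000, and the function name `rope` are assumptions for this example, not details taken from the paper.

```python
import numpy as np

def rope(x, pos, base=10000.0):
    """Rotate consecutive feature pairs of x by angles proportional to pos."""
    d = x.shape[-1]
    freqs = base ** (-np.arange(0, d, 2) / d)  # one frequency per feature pair
    ang = pos * freqs
    cos, sin = np.cos(ang), np.sin(ang)
    x_even, x_odd = x[0::2], x[1::2]
    out = np.empty_like(x)
    out[0::2] = x_even * cos - x_odd * sin
    out[1::2] = x_even * sin + x_odd * cos
    return out

rng = np.random.default_rng(0)
q, k = rng.normal(size=8), rng.normal(size=8)  # hypothetical query and key

# Same offset of 3 time-steps at two different absolute times.
score_early = rope(q, 5) @ rope(k, 2)
score_late = rope(q, 13) @ rope(k, 10)
print(np.isclose(score_early, score_late))
```

Because each rotation composes with the inverse of the other inside the dot product, only the difference between the two positions survives, which is the relative-time property the abstract refers to.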