Linear Dynamics-embedded Neural Network for Long-Sequence Modeling
CoRR (2024)
Abstract
The trade-off between performance and computational efficiency in
long-sequence modeling has become a bottleneck for existing models. Inspired by
continuous multi-input, multi-output state space models (SSMs) from control
theory, we propose a new neural network called the Linear Dynamics-embedded
Neural Network (LDNN). The continuous, discrete, and convolutional properties
of SSMs give LDNN few parameters, flexible inference, and efficient training
on long-sequence tasks. Two efficient strategies, diagonalization and
'Disentanglement then Fast Fourier Transform (FFT)', are developed to
reduce the time complexity of convolution from O(LNH max{L, N}) to
O(LN max{H, log L}). We further extend LDNN with bidirectional
noncausal and multi-head settings to accommodate a broader range of
applications. Extensive experiments on the Long Range Arena (LRA) benchmark
demonstrate the effectiveness and state-of-the-art performance of LDNN.
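The log L term in the claimed complexity comes from FFT-based convolution, the standard trick that the 'Disentanglement then FFT' strategy builds on: convolving a length-L input with a length-L SSM kernel costs O(L^2) naively but O(L log L) in the frequency domain. The sketch below illustrates only this generic mechanism, not the paper's actual implementation; the function name and shapes are illustrative.

```python
import numpy as np

def fft_causal_conv(u, k):
    """Causal convolution of input u (length L) with a precomputed
    SSM kernel k (length L) via FFT in O(L log L).
    Illustrative sketch only, not LDNN's implementation."""
    L = len(u)
    n = 2 * L  # zero-pad to avoid circular wrap-around
    y = np.fft.irfft(np.fft.rfft(u, n) * np.fft.rfft(k, n), n)
    return y[:L]  # keep the causal part

# Tiny usage example: convolve a ramp with a decaying kernel.
u = np.arange(4, dtype=float)          # [0, 1, 2, 3]
k = 0.5 ** np.arange(4, dtype=float)   # [1, 0.5, 0.25, 0.125]
print(fft_causal_conv(u, k))
```

Applying this per state (N) and per feature (H) channel yields the O(LN log L)-type terms in the abstract's complexity expression.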