Moonwalk: Inverse-Forward Differentiation
CoRR (2024)
Abstract
Backpropagation, while effective for gradient computation, incurs substantial memory consumption that limits scalability. This work explores
forward-mode gradient computation as an alternative in invertible networks,
showing its potential to reduce the memory footprint without substantial
drawbacks. We introduce a novel technique based on a vector-inverse-Jacobian
product that accelerates the computation of forward gradients while retaining
the advantages of memory reduction and preserving the fidelity of true
gradients. Our method, Moonwalk, has a time complexity linear in the depth of the network, unlike the quadratic time complexity of naïve forward-mode differentiation, and
empirically reduces computation time by several orders of magnitude without
allocating more memory. We further accelerate Moonwalk by combining it with
reverse-mode differentiation to achieve time complexity comparable with
backpropagation while maintaining a much smaller memory footprint. Finally, we
showcase the robustness of our method across several architecture choices.
Moonwalk is the first forward-based method to compute true gradients in invertible networks with computation time comparable to backpropagation and significantly less memory.
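
To make the abstract's core primitive concrete, below is a minimal JAX sketch (an illustration under my own assumptions, not the authors' implementation) of a vector-inverse-Jacobian product. It relies on the inverse function theorem: for an invertible map f, J_{f^{-1}}(f(x)) = (J_f(x))^{-1}, so v^T (J_f(x))^{-1} reduces to an ordinary vector-Jacobian product through the inverse map, with no Jacobian ever materialized or inverted. The toy layer f, its inverse f_inv, and the helper vec_inv_jac_product are hypothetical names chosen for this example.

```python
# Minimal sketch of a vector-inverse-Jacobian product in JAX.
# Not the authors' code: f, f_inv, and vec_inv_jac_product are
# illustrative stand-ins for an invertible layer and its inverse.
import jax
import jax.numpy as jnp


def f(x):
    # Toy invertible "layer": elementwise, strictly monotone.
    return jnp.sinh(x)


def f_inv(y):
    # Closed-form inverse of the layer above.
    return jnp.arcsinh(y)


def vec_inv_jac_product(v, x):
    """Return v^T (J_f(x))^{-1} via a VJP through the inverse map.

    By the inverse function theorem, the Jacobian of f_inv at f(x)
    equals the inverse of the Jacobian of f at x, so an ordinary VJP
    through f_inv yields the vector-inverse-Jacobian product.
    """
    y = f(x)
    _, vjp_inv = jax.vjp(f_inv, y)
    return vjp_inv(v)[0]


x = jnp.array([0.3, -1.2, 2.0])
v = jnp.array([1.0, 2.0, -0.5])

# Sanity check against the dense Jacobian, inverted explicitly.
J = jax.jacobian(f)(x)
print(jnp.allclose(vec_inv_jac_product(v, x), v @ jnp.linalg.inv(J)))
```

Per the abstract, Moonwalk applies this kind of product within invertible networks to obtain forward-mode gradients in time linear in depth; the snippet above only verifies the single-layer identity against a dense Jacobian.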