Craftax: A Lightning-Fast Benchmark for Open-Ended Reinforcement Learning
CoRR (2024)
Abstract
Benchmarks play a crucial role in the development and analysis of
reinforcement learning (RL) algorithms. We identify that existing benchmarks
used for research into open-ended learning fall into one of two categories.
Either they are too slow for meaningful research to be performed without
enormous computational resources, like Crafter, NetHack and Minecraft, or they
are not complex enough to pose a significant challenge, like Minigrid and
Procgen. To remedy this, we first present Craftax-Classic: a ground-up rewrite
of Crafter in JAX that runs up to 250x faster than the Python-native original.
A run of PPO using 1 billion environment interactions finishes in under an hour
using only a single GPU and averages 90% of the optimal reward. To provide a
more compelling challenge we present the main Craftax benchmark, a significant
extension of the Crafter mechanics with elements inspired by NetHack. Solving
Craftax requires deep exploration, long-term planning and memory, as well as
continual adaptation to novel situations as more of the world is discovered. We
show that existing methods, including global and episodic exploration as well
as unsupervised environment design, fail to make material progress on the
benchmark. We believe that Craftax can for the first time allow researchers to
experiment in a complex, open-ended environment with limited computational
resources.
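
As a rough sanity check on the speed claim, the figures quoted above (1 billion environment interactions, under an hour, one GPU) imply a lower bound on end-to-end training throughput of roughly 278,000 steps per second. The short Python sketch below only reproduces that arithmetic. The function name and constants are illustrative, treating "under an hour" as a 3600-second upper bound is an assumption, and nothing here uses the Craftax API.

```python
# Lower bound on throughput implied by the abstract's figures.
# Illustrative arithmetic only; not part of Craftax.

TOTAL_INTERACTIONS = 1_000_000_000  # 1 billion environment interactions (from the abstract)
MAX_WALL_CLOCK_SECONDS = 60 * 60    # assumption: "under an hour" capped at 3600 seconds


def min_steps_per_second(total_steps: int, max_seconds: int) -> float:
    """Return the minimum average steps per second needed to finish in time."""
    return total_steps / max_seconds


implied_sps = min_steps_per_second(TOTAL_INTERACTIONS, MAX_WALL_CLOCK_SECONDS)
print(f"Implied throughput: more than {implied_sps:,.0f} steps per second")
```

Because the run finishes in under an hour, the true average is higher than this bound, and it covers the PPO updates as well as environment stepping.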