Self-Supervised Behavior Cloned Transformers are Path Crawlers for Text Games.
Conference on Empirical Methods in Natural Language Processing (2023)
Abstract
In this work, we introduce a self-supervised behavior cloning transformer for
text games, which are challenging benchmarks for multi-step reasoning in
virtual environments. Traditionally, Behavior Cloning Transformers excel in
such tasks but rely on supervised training data. Our approach auto-generates
training data by exploring trajectories (defined by common macro-action
sequences) that lead to reward within the games, while determining the
generality and utility of these trajectories by rapidly training small models
then evaluating their performance on unseen development games. Through
empirical analysis, we show our method consistently uncovers generalizable
training data, achieving about 90% performance of supervised systems across
three benchmark text games.
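
The selection loop described in the abstract (explore candidate trajectories, train a small model on each, keep those that score well on unseen development games) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy "model" that simply replays a path, and the toy development games are all hypothetical stand-ins chosen so the example runs on its own.

```python
from typing import Callable, List, Sequence, Tuple

# A path is a macro-action sequence, e.g. ("take key", "open door").
Path = Tuple[str, ...]


def rank_paths_by_dev_score(
    candidate_paths: Sequence[Path],
    train_small_model: Callable[[Path], object],
    evaluate_on_dev: Callable[[object], float],
) -> List[Tuple[Path, float]]:
    """Train one small model per candidate path and rank paths by dev score.

    Both callables are assumptions of this sketch, not an API from the paper.
    """
    scored = []
    for path in candidate_paths:
        model = train_small_model(path)          # quick, cheap training run
        scored.append((path, evaluate_on_dev(model)))
    # Highest development-set score first.
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored


# --- Toy stand-ins (hypothetical) so the sketch is runnable ---------------
dev_games = [
    {"goal": ("take key", "open door")},
    {"goal": ("take key", "open door")},
    {"goal": ("read sign",)},
]


def toy_train(path: Path) -> Path:
    # The "model" just memorizes and replays the path it was trained on.
    return path


def toy_eval(model: Path) -> float:
    # Fraction of dev games whose rewarded sequence matches the replayed path.
    solved = sum(1 for game in dev_games if game["goal"] == model)
    return solved / len(dev_games)


candidates = [("read sign",), ("take key", "open door"), ("go north",)]
ranking = rank_paths_by_dev_score(candidates, toy_train, toy_eval)
best_path, best_score = ranking[0]
```

In a real system the training and evaluation callables would wrap an actual behavior cloning model and game environment; the ranking step is the only part this sketch is meant to convey.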