Learning Cognitive Maps from Transformer Representations for Efficient Planning in Partially Observed Environments

Antoine Dedieu, Wolfgang Lehrach, Guangyao Zhou, Dileep George, Miguel Lazaro-Gredilla

ICML 2024 (2024)

Google DeepMind

Abstract

Despite their stellar performance on a wide range of tasks, including in-context tasks only revealed during inference, vanilla transformers and variants trained for next-token predictions (a) do not learn an explicit world model of their environment which can be flexibly queried and (b) cannot be used for planning or navigation. In this paper, we consider partially observed environments (POEs), where an agent receives perceptually aliased observations as it navigates, which makes path planning hard. We introduce a transformer with (multiple) discrete bottleneck(s), TDB, whose latent codes learn a compressed representation of the history of observations and actions. After training a TDB to predict the future observation(s) given the history, we extract interpretable cognitive maps of the environment from its active bottleneck(s) indices. These maps are then paired with an external solver to solve (constrained) path planning problems. First, we show that a TDB trained on POEs (a) retains the near-perfect predictive performance of a vanilla transformer or an LSTM while (b) solving shortest path problems exponentially faster. Second, a TDB extracts interpretable representations from text datasets, while reaching higher in-context accuracy than vanilla sequence models. Finally, in new POEs, a TDB (a) reaches near-perfect in-context accuracy, (b) learns accurate in-context cognitive maps, and (c) solves in-context path planning problems.
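To make the "cognitive map plus external solver" idea concrete, here is a minimal sketch and not the authors' implementation. It assumes the active bottleneck indices have already been read out as one integer code per time step, treats each code as a graph node, and uses plain breadth-first search as a stand-in for the external solver. The function names, the toy trajectory, and the action labels are all hypothetical.

```python
from collections import deque


def build_cognitive_map(codes, actions):
    """Collect observed (code, action) -> next-code transitions into a graph.

    codes:   latent codes along one trajectory, length T + 1 (assumed given)
    actions: actions taken between consecutive codes, length T
    """
    graph = {}
    for t, action in enumerate(actions):
        graph.setdefault(codes[t], {})[action] = codes[t + 1]
    return graph


def shortest_action_path(graph, start, goal):
    """Breadth-first search; returns a shortest list of actions, or None."""
    if start == goal:
        return []
    parent = {start: None}  # node -> (previous node, action taken to reach it)
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for action, nxt in graph.get(node, {}).items():
            if nxt in parent:
                continue  # already reached by a path at least as short
            parent[nxt] = (node, action)
            if nxt == goal:
                # Walk back to the start, collecting actions in reverse.
                path, cur = [], goal
                while parent[cur] is not None:
                    cur, act = parent[cur]
                    path.append(act)
                return path[::-1]
            queue.append(nxt)
    return None


# Toy trajectory over four latent codes (hypothetical data).
toy_codes = [0, 1, 2, 3, 0, 3]
toy_actions = ["right", "down", "left", "up", "down"]
toy_map = build_cognitive_map(toy_codes, toy_actions)
print(shortest_action_path(toy_map, 1, 0))
```

Because planning runs on the extracted graph rather than on model rollouts, its cost depends on the size of the map and not on the number of candidate action sequences.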

Key words

Object Recognition
