Policy Improvement Using Language Feedback Models
Victor Zhong, Dipendra Misra, Xingdi Yuan, Marc-Alexandre Côté
NeurIPS 2024
Keywords: instruction following, language feedback, language grounding, learning feedback model, imitation learning