Improving Dialogue Agents by Decomposing One Global Explicit Annotation with Local Implicit Multimodal Feedback
arXiv (2024)
Abstract
We describe an approach for aligning an LLM-based dialogue agent based on
global (i.e., dialogue-level) rewards, while also taking into account
naturally-occurring multimodal signals. At a high level, our approach (dubbed
GELI) learns a local, turn-level reward model by decomposing the human-provided
Global Explicit (GE) session-level reward, using Local Implicit (LI) multimodal
reward signals to crossmodally shape the reward decomposition step. This
decomposed reward model is then used as part of the standard RLHF pipeline to
improve an LLM-based dialogue agent. We run quantitative and qualitative human
studies to evaluate the performance of our GELI approach, and find that it
shows consistent improvements across various conversational metrics compared to
baseline methods.
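The core idea — splitting one session-level reward into turn-level rewards, with local implicit multimodal signals shaping the split — can be sketched as follows. This is an illustrative toy, not the paper's actual learned reward model: the function name, the uniform-split baseline, and the `alpha` blending parameter are all assumptions introduced here for exposition.

```python
import numpy as np

def decompose_global_reward(global_reward, local_signals, alpha=0.5):
    """Toy sketch: distribute a session-level (global explicit) reward
    across turns, shaped by per-turn local implicit signals.

    Each turn's share blends a uniform split with a share proportional
    to its normalized multimodal signal; `alpha` controls how strongly
    the local signals shape the decomposition. Illustrative only.
    """
    signals = np.asarray(local_signals, dtype=float)
    n = len(signals)
    uniform = np.full(n, 1.0 / n)                  # equal-split baseline
    total = signals.sum()
    shaped = signals / total if total > 0 else uniform
    weights = (1 - alpha) * uniform + alpha * shaped
    return global_reward * weights                 # turn-level rewards

# Example: a session reward of 10 over 4 turns, where turn 2 carries the
# strongest implicit signal (e.g., positive listener affect).
turn_rewards = decompose_global_reward(10.0, [1.0, 4.0, 2.0, 1.0])
```

By construction the turn-level rewards sum back to the global reward, while turns with stronger implicit signals receive a larger share — the property a learned decomposition would be trained to satisfy.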