Dialogue Control by POMDP Using Dialogue Data Statistics

Spoken Dialogue Systems: Technology and Design (2011)

Citations: 5 | Views: 11
Abstract
Partially Observable Markov Decision Processes (POMDPs) are applied to action control for managing and supporting users' natural dialogue communication with conversational agents. An agent's actions must be determined by probabilistic methods from noisy sensor data in the real world. Agents must flexibly choose actions that reach a target dialogue sequence with the user while retaining as many statistical characteristics of the data as possible. This issue is addressed by two approaches: automatically acquiring POMDP probabilities with Dynamic Bayesian Networks (DBNs) trained on a large amount of dialogue data, and deriving POMDP rewards from human evaluations and the predictive probabilities of agent actions. Using these probabilities and rewards, POMDP value iteration computes a policy that generates an action sequence maximizing both the predictive distributions of actions and user evaluations. This chapter focuses on how to construct rewards from predictive distributions; introducing such rewards lets the policy generate actions whose predictive probabilities are maximized. Experimental results demonstrate that the proposed method generates actions with high predictive probabilities while producing target action sequences.
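The reward construction described above combines a human-evaluation term with the log predictive probability of each action, so that value iteration favors actions the data-driven model itself would predict. The following is a minimal sketch of that idea under a fully observable (MDP) simplification of the chapter's POMDP setting; all state/action sizes, placeholder distributions, and the trade-off weight `alpha` are assumptions for illustration, not values from the chapter.

```python
import numpy as np

# Hypothetical toy setup (not from the chapter): 3 dialogue states, 2 agent actions.
n_states, n_actions = 3, 2
rng = np.random.default_rng(0)

# Transition probabilities P[a, s, s'] -- in the chapter these would come from
# a DBN trained on dialogue data; here they are random placeholders.
P = rng.dirichlet(np.ones(n_states), size=(n_actions, n_states))

# Predictive action distribution p(a | s) from the trained model (placeholder).
pred = rng.dirichlet(np.ones(n_actions), size=n_states)

# Human-evaluation reward term (placeholder values).
R_eval = rng.uniform(0.0, 1.0, size=(n_states, n_actions))

# Combined reward: evaluation plus weighted log predictive probability,
# so the policy prefers actions with high predictive probability.
alpha = 0.5  # assumed trade-off weight
R = R_eval + alpha * np.log(pred)

# Standard value iteration over the combined reward.
gamma = 0.9
V = np.zeros(n_states)
for _ in range(200):
    # Q[s, a] = R[s, a] + gamma * sum_t P[a, s, t] * V[t]
    Q = R + gamma * np.einsum("ast,t->sa", P, V)
    V = Q.max(axis=1)

policy = Q.argmax(axis=1)  # greedy action for each state
```

In the full POMDP formulation the same backup would run over belief states rather than observed states; the sketch only shows how the predictive-probability term enters the reward.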
Keywords
Partially Observable Markov Decision Process (POMDP),Dialogue management,Multi-modal interaction,Dynamic Bayesian Network (DBN),Expectation-Maximization (EM) algorithm