Counterfactual Reasoning Using Predicted Latent Personality Dimensions for Optimizing Persuasion Outcome
Persuasive Technology, Lecture Notes in Computer Science (2024)
Abstract
Customizing persuasive conversations related to the outcome of interest for
specific users achieves better persuasion results. However, existing persuasive
conversation systems rely on persuasive strategies and encounter challenges in
dynamically adjusting dialogues to suit the evolving states of individual users
during interactions. This limitation restricts the system's ability to deliver
flexible or dynamic conversations and leads to suboptimal persuasion outcomes.
In this paper, we present a novel approach that tracks a user's latent
personality dimensions (LPDs) during an ongoing persuasion conversation and
generates tailored counterfactual utterances based on these LPDs to optimize
the overall persuasion outcome. In particular, our proposed method leverages a
Bidirectional Conditional Generative Adversarial Network (BiCoGAN) in tandem with a
Dialogue-based Personality Prediction Regression (DPPR) model to generate
counterfactual data. This enables the system to formulate alternative
persuasive utterances that are better suited to the user. Subsequently, we
use a Dueling Double Deep Q-Network (D3QN) to learn policies that optimize the
selection of system utterances on the counterfactual data. Experimental results
on the PersuasionForGood dataset demonstrate the superiority of our approach over
the existing method, BiCoGAN. The cumulative rewards and Q-values produced by
our method surpass ground truth benchmarks, showcasing the efficacy of
employing counterfactual reasoning and LPDs to optimize reinforcement learning
policy in online interactions.
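
As a rough illustration of the policy-learning step, the sketch below shows the two ingredients that define a generic D3QN: the dueling aggregation of a state value with per-action advantages, and the double Q-learning target. It is a minimal, assumed example in plain Python and not the authors' implementation; the function names, the list-based inputs, and the idea that each action corresponds to one candidate system utterance are hypothetical choices made only for illustration.

```python
from typing import List


def dueling_q(value: float, advantages: List[float]) -> List[float]:
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean over a' of A(s, a')."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + adv - mean_adv for adv in advantages]


def double_dqn_target(
    reward: float,
    gamma: float,
    q_online_next: List[float],
    q_target_next: List[float],
    done: bool,
) -> float:
    """Double Q-learning target for one transition.

    The online network's Q-values pick the next action; the target
    network's Q-values evaluate it. Both lists are assumed (hypothetically)
    to hold one entry per candidate system utterance.
    """
    if done:
        # No bootstrapping once the conversation has ended.
        return reward
    best_action = max(range(len(q_online_next)), key=q_online_next.__getitem__)
    return reward + gamma * q_target_next[best_action]
```

Separating action selection (online network) from action evaluation (target network) is the standard way double Q-learning reduces overestimated Q-values, which matters when Q-values themselves are reported as an evaluation signal.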