Reinforcement Learning with Dynamic Multi-Reward Weighting for Multi-Style Controllable Generation
CoRR (2024)
Abstract
Style is an integral component of text that expresses a diverse set of
information, including interpersonal dynamics (e.g. formality) and the author's
emotions or attitudes (e.g. disgust). Humans often employ multiple styles
simultaneously. An open question is how large language models can be explicitly
controlled so that they weave together target styles when generating text: for
example, to produce text that is both negative and non-toxic. Previous work
investigates controlled generation of a single style, or of a style together
with other, non-style attributes. In this paper, we extend this to
controlling multiple styles simultaneously. Specifically, we investigate
various formulations of multiple style rewards for a reinforcement learning
(RL) approach to controlled multi-style generation. These reward formulations
include calibrated outputs from discriminators and dynamic weighting by
discriminator gradient magnitudes. We find that dynamic weighting generally
outperforms static weighting approaches, and we explore its effectiveness in 2-
and 3-style control, even compared to strong baselines such as plug-and-play
models. All code and data for RL pipelines with multiple style attributes will
be publicly available.
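As a rough illustration of the dynamic weighting idea described in the abstract, below is a minimal sketch, not the authors' implementation: it assumes each style discriminator is a differentiable module that maps a sample representation to a calibrated score for its target style, and weights each style's reward by the magnitude of that discriminator's input gradient. The function name `dynamic_multi_reward` and all tensor shapes are hypothetical.

```python
import torch

def dynamic_multi_reward(sample_repr: torch.Tensor, discriminators) -> torch.Tensor:
    """Combine per-style rewards, weighting each style by the magnitude of its
    discriminator's gradient w.r.t. the sample representation (illustrative sketch)."""
    rewards, grad_norms = [], []
    for disc in discriminators:
        x = sample_repr.detach().clone().requires_grad_(True)
        score = disc(x).squeeze(-1)                  # calibrated style probability, shape (batch,)
        grad, = torch.autograd.grad(score.sum(), x)  # gradient of the score w.r.t. the input
        rewards.append(score.detach())
        grad_norms.append(grad.flatten(1).norm(dim=-1))
    rewards = torch.stack(rewards)                   # (num_styles, batch)
    grad_norms = torch.stack(grad_norms)             # (num_styles, batch)
    # Dynamic weights: normalize gradient magnitudes across styles for each sample.
    weights = grad_norms / grad_norms.sum(0, keepdim=True).clamp_min(1e-8)
    return (weights * rewards).sum(0)                # one scalar reward per sample
```

In an RL fine-tuning loop (e.g. PPO), the returned per-sample scalar would serve as the reward for each generated continuation; static weighting would correspond to replacing the gradient-derived weights with fixed coefficients.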