Monotonic Paraphrasing Improves Generalization of Language Model Prompting
CoRR (2024)
Abstract
The performance of large language models (LLMs) can vary across different prompts
or instructions, even for the same task. One commonly recognized factor for this
phenomenon is the model's familiarity with the given prompt or instruction,
which is typically estimated by its perplexity. However, finding the prompt
with the lowest perplexity is challenging, given the enormous space of possible
prompting phrases. In this paper, we propose monotonic paraphrasing (MonoPara),
an end-to-end decoding strategy that paraphrases given prompts or instructions
into their lower-perplexity counterparts based on an ensemble of a paraphrase
LM for prompt (or instruction) rewriting and a target LM (i.e., the prompt or
instruction executor) that constrains the generation toward lower perplexity. The
ensemble decoding process can efficiently paraphrase the original prompt
without altering its semantic meaning, while monotonically decreasing the
perplexity of each generation as calculated by the target LM. We explore in
detail both greedy and search-based decoding as two alternative decoding
schemes of MonoPara. Notably, MonoPara does not require any training and can
monotonically lower the perplexity of the paraphrased prompt or instruction,
leading to improved performance of zero-shot LM prompting as evaluated on a
wide selection of tasks. In addition, MonoPara is shown to effectively
improve LMs' generalization on perturbed and unseen task instructions.
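
To make the two ingredients above concrete, the following is a minimal sketch of (a) perplexity as the exponentiated negative mean token log-probability and (b) one greedy ensemble decoding step. The abstract does not state the exact combination rule, so the log-linear weighting with `alpha`, the function names, and the toy next-token distributions `PARA` and `TARGET` are all illustrative assumptions, not the authors' implementation.

```python
import math


def perplexity(token_logprobs):
    """Perplexity of a sequence: exp of the negative mean token log-probability."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def ensemble_greedy_step(para_logprobs, target_logprobs, alpha=0.5):
    """Pick the next token by a weighted sum of log-probabilities.

    ASSUMPTION: a log-linear mix with weight `alpha` on the paraphrase LM
    and `1 - alpha` on the target LM. The abstract only says the two models
    are ensembled; the exact rule here is hypothetical.
    """
    shared_vocab = [tok for tok in para_logprobs if tok in target_logprobs]
    scores = {
        tok: alpha * para_logprobs[tok] + (1 - alpha) * target_logprobs[tok]
        for tok in shared_vocab
    }
    # Greedy choice: the single highest-scoring token.
    return max(scores, key=scores.get)


# Toy next-token distributions (hypothetical numbers, for illustration only).
PARA = {
    "categorize": math.log(0.5),
    "classify": math.log(0.4),
    "sort": math.log(0.1),
}
TARGET = {
    "categorize": math.log(0.1),
    "classify": math.log(0.6),
    "sort": math.log(0.3),
}

# With alpha=1.0 only the paraphrase LM votes; with alpha=0.5 the target LM
# pulls the choice toward the token it finds more familiar.
paraphrase_only_choice = ensemble_greedy_step(PARA, TARGET, alpha=1.0)
ensemble_choice = ensemble_greedy_step(PARA, TARGET, alpha=0.5)
```

In this toy setting the target LM's preference shifts the greedy choice away from the paraphrase LM's top token, which is the intuition behind constraining generation toward lower target-LM perplexity. A search-based variant would keep several candidate continuations per step instead of one.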