uTeBC-NLP at SemEval-2024 Task 9: Can LLMs be Lateral Thinkers?
arXiv (2024)
Abstract
Inspired by human cognition, Jiang et al. (2023c) create a benchmark for
assessing LLMs' lateral thinking, that is, thinking outside the box. Building upon this
benchmark, we investigate how different prompting methods enhance LLMs'
performance on this task to reveal their inherent capacity for outside-the-box
thinking. By participating in the Sentence Puzzle sub-task of SemEval-2024
Task 9, we explore prompt engineering methods: chain-of-thought (CoT)
and direct prompting, enhancing prompts with informative descriptions, and
contextualizing prompts using a retrieval-augmented generation (RAG) pipeline.
Our experiments involve three LLMs: GPT-3.5, GPT-4, and
Zephyr-7B-beta. We generate a dataset of thinking paths between riddles and
options using GPT-4, validated by humans for quality. Findings indicate that
compressed informative prompts improve performance, and that dynamic in-context
learning improves model performance significantly. Furthermore, fine-tuning Zephyr on
our dataset improves its performance on other commonsense datasets,
underscoring the value of innovative thinking.
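
To make the idea of dynamic in-context learning with retrieval more concrete, the following is a minimal sketch and not the authors' pipeline. It assumes a small pool of solved riddles with short reasoning paths, retrieves the examples most similar to a new riddle, and places them before the question in the prompt. The pool entries, the function names, and the bag-of-words similarity (a stand-in for a real embedding model) are all illustrative assumptions; the abstract does not specify these details.

```python
import math
import re
from collections import Counter

# Hypothetical pool of solved riddles with reasoning paths (made-up toy data).
POOL = [
    {"riddle": "A man shaves several times a day yet still has a beard. How?",
     "reasoning": "He shaves other people, not himself.",
     "answer": "He is a barber."},
    {"riddle": "What has keys but cannot open locks?",
     "reasoning": "The word 'keys' can also refer to an instrument.",
     "answer": "A piano."},
    {"riddle": "A woman walks in the rain without an umbrella and her hair stays dry. How?",
     "reasoning": "Nothing says she has hair.",
     "answer": "She is bald."},
]


def bag_of_words(text):
    """Lowercased word counts; a simple stand-in for a real embedding."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_examples(query, pool, k=2):
    """Return the k pool entries whose riddles are most similar to the query."""
    q = bag_of_words(query)
    ranked = sorted(pool, key=lambda ex: cosine(q, bag_of_words(ex["riddle"])),
                    reverse=True)
    return ranked[:k]


def build_prompt(riddle, options, examples):
    """Assemble a few-shot prompt: retrieved examples first, then the question."""
    parts = [
        f"Riddle: {ex['riddle']}\nReasoning: {ex['reasoning']}\nAnswer: {ex['answer']}\n"
        for ex in examples
    ]
    letters = "ABCDEFGH"
    listed = "\n".join(f"{letters[i]}. {opt}" for i, opt in enumerate(options))
    parts.append(f"Riddle: {riddle}\nOptions:\n{listed}\nReasoning:")
    return "\n".join(parts)
```

A call such as `build_prompt(riddle, options, retrieve_examples(riddle, POOL))` yields a prompt string that could then be sent to any of the evaluated models. Because the examples are chosen per question rather than fixed in advance, the in-context demonstrations change with each riddle, which is the sense of "dynamic" assumed here.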