AstroLLaMA-Chat: Scaling AstroLLaMA with Conversational and Diverse Datasets
CoRR(2024)
摘要
We explore the potential of enhancing LLM performance in astronomy-focused
question-answering through targeted, continual pre-training. By employing a
compact 7B-parameter LLaMA-2 model and focusing exclusively on a curated set of
astronomy corpora – comprising abstracts, introductions, and conclusions – we
achieve notable improvements in specialized topic comprehension. While general
LLMs like GPT-4 excel in broader question-answering scenarios due to superior
reasoning capabilities, our findings suggest that continual pre-training with
limited resources can still enhance model performance on specialized topics.
Additionally, we present an extension of AstroLLaMA: the fine-tuning of the 7B
LLaMA model on a domain-specific conversational dataset, culminating in the
release of the chat-enabled AstroLLaMA for community use. Comprehensive
quantitative benchmarking is currently in progress and will be detailed in an
upcoming full paper. The model, AstroLLaMA-Chat, is now available at
https://huggingface.co/universeTBD, providing the first open-source
conversational AI tool tailored for the astronomy community.
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要