Academic Paper

AstroLLaMA-Chat: Scaling AstroLLaMA with Conversational and Diverse Datasets

Document Type

Working Paper

Author

Perkowski, Ernest; Pan, Rui; Nguyen, Tuan Dung; Ting, Yuan-Sen; Kruk, Sandor; Zhang, Tong; O'Neill, Charlie; Jablonska, Maja; Sun, Zechang; Smith, Michael J.; Liu, Huiling; Schawinski, Kevin; Iyer, Kartheik; Ciucă, Ioana (for UniverseTBD)

Source

Subject

Astrophysics - Instrumentation and Methods for Astrophysics
Astrophysics - Cosmology and Nongalactic Astrophysics
Astrophysics - Astrophysics of Galaxies
Astrophysics - Solar and Stellar Astrophysics
Computer Science - Computation and Language
Computer Science - Machine Learning

Language

Abstract

We explore the potential of enhancing LLM performance in astronomy-focused question-answering through targeted, continual pre-training. By employing a compact 7B-parameter LLaMA-2 model and focusing exclusively on a curated set of astronomy corpora -- comprising abstracts, introductions, and conclusions -- we achieve notable improvements in specialized topic comprehension. While general LLMs like GPT-4 excel in broader question-answering scenarios due to superior reasoning capabilities, our findings suggest that continual pre-training with limited resources can still enhance model performance on specialized topics. Additionally, we present an extension of AstroLLaMA: the fine-tuning of the 7B LLaMA model on a domain-specific conversational dataset, culminating in the release of the chat-enabled AstroLLaMA for community use. Comprehensive quantitative benchmarking is currently in progress and will be detailed in an upcoming full paper. The model, AstroLLaMA-Chat, is now available at https://huggingface.co/universeTBD, providing the first open-source conversational AI tool tailored for the astronomy community.
Comment: 4 pages, 1 figure, model is available at https://huggingface.co/universeTBD, published in RNAAS

Online Access

Open Access (arXiv)
