Academic Paper

AstroLLaMA: Towards Specialized Foundation Models in Astronomy

Document Type

Working Paper

Source

Subject

Astrophysics - Instrumentation and Methods for Astrophysics
Astrophysics - Cosmology and Nongalactic Astrophysics
Astrophysics - Astrophysics of Galaxies
Astrophysics - High Energy Astrophysical Phenomena
Computer Science - Computation and Language
Computer Science - Machine Learning

Language

Abstract

Large language models excel at many human-language tasks but often falter in highly specialized domains such as scholarly astronomy. To bridge this gap, we introduce AstroLLaMA, a 7-billion-parameter model fine-tuned from LLaMA-2 using over 300,000 astronomy abstracts from arXiv. Optimized for traditional causal language modeling, AstroLLaMA achieves a 30% lower perplexity than LLaMA-2, showing marked domain adaptation. Our model produces more insightful and scientifically relevant text completions and embeddings than state-of-the-art foundation models despite having significantly fewer parameters. AstroLLaMA serves as a robust, domain-specific model with broad fine-tuning potential. Its public release aims to spur astronomy-focused research, including automatic paper summarization and conversational agent development.
Comment: 6 pages, 3 figures, submitted to IJCNLP-AACL 2023. Comments are welcome. The model can be found on Hugging Face - https://huggingface.co/universeTBD/astrollama
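
Note on the perplexity figure: perplexity is the exponential of the mean negative log-likelihood per token, so a "30% lower perplexity" is a relative reduction against the base model's value. The following is a minimal sketch of that arithmetic only. The function names and the log-probability values are hypothetical and are not taken from the paper or from the released model.

```python
import math


def perplexity(token_log_probs):
    """Return exp(mean negative log-likelihood) over natural-log token probabilities."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


def relative_reduction(baseline_ppl, adapted_ppl):
    """Fractional drop in perplexity relative to the baseline (0.30 means 30% lower)."""
    return (baseline_ppl - adapted_ppl) / baseline_ppl


if __name__ == "__main__":
    # Illustrative values only (assumption): per-token log-probabilities that a
    # base model and a domain-adapted model might assign to the same text.
    baseline_log_probs = [-2.1, -1.8, -2.4, -1.9]
    adapted_log_probs = [-1.7, -1.5, -2.0, -1.6]

    base = perplexity(baseline_log_probs)
    adapted = perplexity(adapted_log_probs)
    print(f"baseline perplexity: {base:.3f}")
    print(f"adapted perplexity:  {adapted:.3f}")
    print(f"relative reduction:  {relative_reduction(base, adapted):.1%}")
```

In practice the per-token log-probabilities would come from evaluating each model on held-out text; how the authors ran that evaluation is not described in this record.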

Online Access

Open Access (arXiv)
