Academic Paper

Understanding Structure Of LLM using Neural Cluster Knockout
Document Type
Conference
Source
2024 5th International Conference on Intelligent Communication Technologies and Virtual Mobile Networks (ICICV), pp. 253-259, Mar. 2024
Subject
Computing and Processing
Neuroscience
Systematics
Neurons
Neural networks
Lesions
Reliability
Artificial intelligence
Large Language Model (LLM)
Neural Cluster Knockout
Generative Artificial Intelligence (Gen AI)
Increasing Accuracy and Efficiency in Generative Artificial Intelligence
Artificial Intelligence and Neuroscience Intersection
Large Language Model Optimization
Language
Abstract
This research presents a novel approach at the intersection of neuroscience and generative Artificial Intelligence (AI), focusing on the application of neuroscience techniques to neural networks, specifically Large Language Models (LLMs). Central to this study is the concept of ‘neural cluster knockout’ in LLMs, a method inspired by lesion studies in neuroscience, in which clusters of neurons are systematically deactivated to decipher their role within the model. The research underscores the opaque nature of neural networks, particularly LLMs, which are often critiqued for their ‘black box’ operation. By adopting neuroscience principles, particularly lesion studies, this paper aims to illuminate the inner workings of neural networks and enhance our understanding of their functionality. This is crucial in an era increasingly reliant on AI across sectors, where insights from this study could inform the development of more efficient, transparent, and accountable AI systems. Methodologically, the study applied Principal Component Analysis (PCA) and neural cluster knockout through iterative zeroing to the LLaMA large language model. This approach enabled the identification of significant neuron clusters and of their functional impact when deactivated. The results reveal both critical and redundant neurons within LLMs, demonstrating that some clusters are vital for accuracy, while others may impede efficiency or contribute to errors. This research contributes a novel perspective on the intricate architecture of LLMs and lays a foundation for future advancements toward refined, efficient LLMs capable of more accurate and reliable performance.
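The knockout procedure the abstract describes can be illustrated with a minimal sketch. This is not the authors' code and the toy network, cluster-assignment rule, and impact metric below are assumptions for illustration only: activations are collected from a small random network standing in for one LLM layer, neurons are grouped using their loading on the first PCA component (a crude stand-in for the paper's clustering step), and each cluster is zeroed in turn while the change in output is measured.

```python
# Hypothetical sketch of neural cluster knockout via iterative zeroing.
# NOT the authors' implementation; a toy stand-in for one LLM layer.
import numpy as np

rng = np.random.default_rng(0)

W1 = rng.normal(size=(8, 16))   # input -> hidden weights
W2 = rng.normal(size=(16, 4))   # hidden -> output weights

def forward(x, knocked_out=()):
    h = np.maximum(x @ W1, 0.0)        # ReLU hidden activations
    h[:, list(knocked_out)] = 0.0      # "knockout": zero the cluster
    return h @ W2

X = rng.normal(size=(64, 8))           # probe inputs
baseline = forward(X)

# Group neurons by the sign of their loading on the first principal
# component of the hidden activations (a crude 2-cluster assignment).
H = np.maximum(X @ W1, 0.0)
_, _, Vt = np.linalg.svd(H - H.mean(axis=0), full_matrices=False)
clusters = {
    "pc1_positive": np.where(Vt[0] >= 0)[0],
    "pc1_negative": np.where(Vt[0] < 0)[0],
}

# Iterative knockout: deactivate each cluster, score the output change.
impact = {}
for name, idx in clusters.items():
    out = forward(X, knocked_out=idx)
    impact[name] = float(np.mean((out - baseline) ** 2))

# High-impact clusters are "critical"; near-zero impact suggests
# redundancy, mirroring the paper's critical-vs-redundant distinction.
print(sorted(impact, key=impact.get, reverse=True))
```

In a real LLM the same loop would hook a transformer layer's activations rather than a dense toy layer, and the impact score would be a task metric (e.g. accuracy or perplexity) instead of a mean-squared output change.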