Academic Article

A Hybrid Future for AI: The drive for efficiency brings large language models out of the cloud.
Document Type
Article
Source
Communications of the ACM. Oct 2024, Vol. 67 Issue 10, p15-17. 3p.
Subject
*Artificial intelligence
*Cost
*Mathematical optimization
*Peer-to-peer architecture (Computer networks)
Language models
Language
English
ISSN
0001-0782
Abstract
The article explores how the rapid growth of large language models (LLMs) has led to significant increases in computing demands and operational costs, particularly for cloud-based AI services. To mitigate these challenges, researchers and industry leaders are exploring optimization techniques, such as model pruning, quantization, and knowledge distillation, which aim to reduce model size and enhance efficiency without sacrificing accuracy. Additionally, hybrid approaches that distribute workloads between user devices and cloud servers, as well as peer-to-peer computing models, are being investigated to improve performance and reduce the energy and financial costs of running LLMs. The ongoing research reflects a strong industry focus on overcoming the resource constraints and cost spiral associated with AI advancements.
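Of the optimization techniques the abstract names, quantization is perhaps the simplest to illustrate: model weights stored as 32-bit floats are mapped to small integers, shrinking memory use and making on-device inference more practical. The sketch below (not taken from the article; names and values are illustrative) shows symmetric int8 post-training quantization of a weight vector using a single shared scale factor.

```python
# Illustrative sketch of symmetric int8 post-training quantization,
# one of the model-compression techniques mentioned in the abstract.
# All names and values here are hypothetical examples.

def quantize_int8(weights):
    """Map float weights to int8 range [-127, 127] with a shared scale."""
    scale = max(abs(w) for w in weights) / 127.0
    quantized = [round(w / scale) for w in weights]
    return quantized, scale

def dequantize(quantized, scale):
    """Recover approximate float weights from the int8 representation."""
    return [q * scale for q in quantized]

weights = [0.42, -1.27, 0.05, 0.88]
q, scale = quantize_int8(weights)
approx = dequantize(q, scale)
# Each recovered value differs from the original by at most scale / 2,
# while storage per weight drops from 32 bits to 8.
```

Production systems typically use per-channel scales and calibration data rather than a single global scale, but the core trade of precision for footprint is the same.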