Academic Paper

Toward Addressing Training Data Scarcity Challenge in Emerging Radio Access Networks: A Survey and Framework
Document Type
Periodical
Source
IEEE Communications Surveys & Tutorials (IEEE Commun. Surv. Tutorials). 25(3):1954-1990 Jan, 2023
Subject
Communication, Networking and Broadcast Technologies
Signal Processing and Analysis
Training data
Cellular networks
Signal to noise ratio
Interference
Automation
Handover
Artificial intelligence
Scarce data
training data
big data
emerging cellular networks
RAN
machine learning
synthetic data generation
interpolation
simulators
testbeds
Language
ISSN
1553-877X
2373-745X
Abstract
The future of cellular networks is contingent on artificial intelligence (AI)-based automation, particularly for radio access network (RAN) operation, optimization, and troubleshooting. To achieve such zero-touch automation, a myriad of AI-based solutions for modeling and optimizing network behavior are being proposed in the literature. However, to work reliably, AI-based automation requires a deluge of training data. Consequently, the success of the proposed AI solutions is limited by a fundamental challenge faced by the cellular network research community: scarcity of training data. In this paper, we present an extensive review of classic and emerging techniques to address this challenge. We first identify the common data types in the RAN and their known use cases. We then present a taxonomized survey of techniques used in the literature to address training data scarcity for various data types. This is followed by a framework to address training data scarcity. The proposed framework builds on available information and a combination of techniques including interpolation, domain knowledge-based methods, generative adversarial neural networks, transfer learning, autoencoders, few-shot learning, simulators, and testbeds. Potential new techniques to enrich scarce data in cellular networks are also proposed, such as matrix completion theory and domain knowledge-based techniques that leverage different types of network geometries and network parameters. In addition, an overview of state-of-the-art simulators and testbeds is presented to make readers aware of current and emerging platforms for accessing real data in order to overcome the data scarcity challenge.
The extensive survey of techniques for addressing training data scarcity, combined with the proposed framework for selecting a suitable technique for a given type of data, can assist researchers and network operators in choosing appropriate methods to overcome the data scarcity challenge in applying AI to radio access network automation.