Academic Article

A New Multi-Class Rebalancing Framework for Imbalance Medical Data
Document Type
Periodical
Source
IEEE Access. 11:92857-92874, 2023
Subject
Aerospace
Bioengineering
Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
Computing and Processing
Engineered Materials, Dielectrics and Plasmas
Engineering Profession
Fields, Waves and Electromagnetics
General Topics for Engineers
Geoscience
Nuclear Engineering
Photonics and Electrooptics
Power, Energy and Industry Applications
Robotics and Control Systems
Signal Processing and Analysis
Transportation
Medical diagnostic imaging
Feature extraction
Clustering algorithms
Computational efficiency
Additives
Tumors
Training data
Medical information systems
Classification algorithms
Predictive analytics
Imbalanced data
medical data
rebalancing framework
multi-class
classification prediction
Language
ISSN
2169-3536
Abstract
Class imbalance exists in many data domains, posing numerous challenges to the data research community. Medical datasets, in most cases, are imbalanced by nature. When tackling multi-class problems, most researchers have preferred the conventional approach of decomposing them into binary-class problems for a more convenient solution. This approach is not suitable for sensitive and critical domains such as medical data. Classifying medical datasets requires all the classes to retain their form and maintain clinical validity. In this article, we develop a rebalancing framework for the multi-class classification of imbalanced medical data using SCUT (SMOTE and Cluster-based Undersampling Technique) to rebalance the imbalanced class distribution, a feature selection method that combines SHapley Additive exPlanations (SHAP) and Recursive Feature Elimination (RFE), and DES-MI (Dynamic Ensemble Selection for multi-class) for improved multi-class classification performance. Two novelties contribute to the performance of our framework: an improved SCUT that implements two clustering algorithms, and our proposed pool classifier selection for DES-MI. The performance of the proposed framework was compared with other state-of-the-art imbalanced frameworks using eight imbalanced datasets, each with a different degree of imbalance. The experimental results indicate that our proposed framework performed better, with average scores of 81.77%, 73.57%, and 75.87% in terms of Macro Average accuracy, extended G-mean, and Macro Average AUC, respectively. Our framework substantially increases overall performance, owing to its ability to handle the multi-class imbalance problem.
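The abstract reports class-averaged metrics but does not define them. As a minimal sketch, assuming the commonly used definitions (macro-average recall as the unweighted mean of per-class recalls, and extended G-mean as the geometric mean of per-class recalls), the following Python shows why such metrics are preferred over plain accuracy on imbalanced multi-class data. The function names and the toy labels are hypothetical and are not taken from the article, and the article's "Macro Average accuracy" may be computed differently.

```python
from collections import defaultdict
from math import prod


def per_class_recall(y_true, y_pred):
    """Return {label: recall} for every label that appears in y_true."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    return {label: correct[label] / total[label] for label in total}


def macro_average_recall(y_true, y_pred):
    """Unweighted mean of per-class recalls (assumed definition)."""
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)


def extended_g_mean(y_true, y_pred):
    """Geometric mean of per-class recalls (assumed definition)."""
    recalls = per_class_recall(y_true, y_pred)
    return prod(recalls.values()) ** (1 / len(recalls))


# Hypothetical toy data: one majority class and two minority classes.
y_true = ["a"] * 8 + ["b"] * 4 + ["c"] * 2
y_pred = ["a"] * 8 + ["b", "b", "a", "a"] + ["c", "a"]

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(per_class_recall(y_true, y_pred))
print(plain_accuracy)
print(macro_average_recall(y_true, y_pred))
print(extended_g_mean(y_true, y_pred))
```

In this toy case the majority class is classified perfectly while each minority class is only half right, so plain accuracy stays higher than either class-averaged score. The extended G-mean is the strictest of the three, and it drops to zero if any single class is missed entirely.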