Academic Paper

Mixed Machine Learning Approach for Efficient Prediction of Human Heart Disease by Identifying the Numerical and Categorical Features
Document Type
article
Source
Applied Sciences, Vol 12, Iss 15, p 7449 (2022)
Subject
heart disease
mixed machine learning techniques
numerical features
categorical features
RFC
DT
Technology
Engineering (General). Civil engineering (General)
TA1-2040
Biology (General)
QH301-705.5
Physics
QC1-999
Chemistry
QD1-999
Language
English
ISSN
12157449
2076-3417
Abstract
Heart disease threatens public health because of its prevalence and high mortality risk. Predicting cardiac disease early from a few simple physical indicators collected during a routine physical examination remains difficult. Clinically, the signs of heart disease must be read carefully, since accurate forecasts and concrete steps for future diagnosis depend on them. Manual analysis and prediction over a massive volume of data are challenging and time-consuming. In this paper, a unique heart disease prediction model is proposed to predict heart disease correctly and rapidly from a variety of bodily signs. A heart disease prediction algorithm is presented, based on an analysis of the predictive models’ classification performance on combined datasets using the train-test split technique. The proposed technique’s training results are then compared with previous works. For the Cleveland, Switzerland, Hungarian, and Long Beach VA heart disease datasets, accuracy, precision, recall, F1-score, and ROC-AUC are used as the performance indicators. On the combined heart disease datasets, the Random Forest Classifier (RFC) achieves an F1-score of 100%, accuracy of 100%, precision of 100%, recall of 100%, and ROC-AUC of 100%. The Decision Tree (DT) classifier on the pooled heart disease datasets achieves an F1-score of 100%, accuracy of 98.80%, precision of 98%, recall of 99%, and ROC-AUC of 99%; for RFC and the Gradient Boosting Classifier (GBC), the ROC-AUC reaches 100%. The performance of the machine learning algorithms is improved by using five-fold cross-validation. A Stacking CV Classifier is also used to improve on the individual machine learning algorithms by combining two or three techniques. Several reduction methods are also incorporated. The RFC classification algorithm is found to achieve high accuracy, and the developed method is efficient and reliable for predicting heart disease.
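The performance indicators and the five-fold split named in the abstract can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' implementation: the function names `classification_metrics` and `k_fold_indices` are hypothetical, the labels are assumed to be binary (1 = disease present), and the folds are contiguous and unshuffled, whereas a real pipeline would typically shuffle or stratify them.

```python
from typing import Dict, List, Sequence, Tuple


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for binary labels (1 = disease).

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Confusion-matrix counts.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def k_fold_indices(n_samples: int, k: int = 5) -> List[Tuple[List[int], List[int]]]:
    """Return k (train_indices, test_indices) pairs over range(n_samples).

    Assumption: contiguous, unshuffled folds; the first n_samples % k folds
    receive one extra sample so that every index is tested exactly once.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")

    folds = []
    start = 0
    for i in range(k):
        size = n_samples // k + (1 if i < n_samples % k else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds


if __name__ == "__main__":
    # Toy labels invented for illustration; they are not data from the paper.
    y_true = [1, 1, 1, 0, 0, 0, 1, 0]
    y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
    print(classification_metrics(y_true, y_pred))

    for train_idx, test_idx in k_fold_indices(10, k=5):
        print("train:", train_idx, "test:", test_idx)
```

In a cross-validated evaluation, a classifier would be fitted on each `train` index set and scored on the matching `test` set, with the per-fold metrics then averaged. Because F1 is the harmonic mean of precision and recall, it always lies between those two values.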