Academic Paper

Data-driven Feature Selection for Long Longitudinal Breadth and High Dimensional Dataset : Empirical Studies of Metabolic Syndrome Prediction
Document Type
Conference
Source
Proceedings of the 2020 12th International Conference on Machine Learning and Computing, pp. 208-212
Subject
Machine learning
dimensionality reduction
early prediction
fatty liver disease
feature selection
hypertension
Language
English
Abstract
A wide range of research has been published on early disease prediction from small datasets and/or on reducing computational execution time. However, no convincing evidence has been shown that conventional wrapper-based feature selection can process large-scale, high-dimensional datasets efficiently. In this study, we design a wrapper-based feature selection method for a large-scale dataset with high dimensionality and long longitudinal breadth, and apply it to predicting two types of metabolic syndrome diseases. Specifically, depending on the components, the method applies dimensionality reduction and/or feature selection to optimize feature subsets. The selected features are then checked against statistics provided by clinical experts to examine their significance and accuracy. Accordingly, the method provides a sufficient means for early diagnosis of metabolic diseases and achieves efficiency by notably reducing computational time.
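Note on terminology
For readers unfamiliar with the term, wrapper-based feature selection scores candidate feature subsets by training and evaluating a model on each one. The sketch below illustrates the general idea only and is not the method from the paper: it assumes a greedy forward search, a nearest-centroid classifier, and a small synthetic dataset, and every function and variable name in it is hypothetical.

```python
import random


def nearest_centroid_accuracy(X_train, y_train, X_val, y_val, features):
    """Fit a nearest-centroid classifier on the given feature indices
    and return its accuracy on the validation rows."""
    centroids = {}
    for label in sorted(set(y_train)):
        rows = [x for x, y in zip(X_train, y_train) if y == label]
        centroids[label] = [sum(r[f] for r in rows) / len(rows) for f in features]

    correct = 0
    for x, y in zip(X_val, y_val):
        # Squared Euclidean distance to each class centroid, restricted to `features`.
        distances = {
            label: sum((x[f] - c) ** 2 for f, c in zip(features, centre))
            for label, centre in centroids.items()
        }
        predicted = min(sorted(distances), key=distances.get)
        correct += predicted == y
    return correct / len(y_val)


def forward_wrapper_selection(X_train, y_train, X_val, y_val, max_features):
    """Greedy forward search: repeatedly add the single feature that most
    improves validation accuracy, and stop when nothing improves it."""
    selected, best_score = [], 0.0
    remaining = list(range(len(X_train[0])))
    while remaining and len(selected) < max_features:
        scored = [
            (nearest_centroid_accuracy(X_train, y_train, X_val, y_val, selected + [f]), f)
            for f in remaining
        ]
        # Highest score wins; ties go to the lowest feature index.
        score, feature = max(scored, key=lambda t: (t[0], -t[1]))
        if score <= best_score:
            break
        selected.append(feature)
        remaining.remove(feature)
        best_score = score
    return selected, best_score


def make_synthetic_data(n_rows, n_features, rng):
    """Synthetic stand-in data (an assumption): feature 0 separates the two
    classes, and all other features are pure Gaussian noise."""
    X, y = [], []
    for i in range(n_rows):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        row[0] += 6.0 * label
        X.append(row)
        y.append(label)
    return X, y


rng = random.Random(0)
X_train, y_train = make_synthetic_data(200, 8, rng)
X_val, y_val = make_synthetic_data(100, 8, rng)
selected, score = forward_wrapper_selection(X_train, y_train, X_val, y_val, max_features=3)
print("selected feature indices:", selected)
print("validation accuracy:", score)
```

Each search step retrains the model once per remaining feature, so the cost grows quickly with the number of features. This is the efficiency concern the abstract raises for high-dimensional data, and one reason to combine a wrapper with dimensionality reduction.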