Academic Paper

Stable and Accurate Feature Selection from Microarray Data with Ensembled Fast Correlation Based Filter
Document Type
Conference
Source
2020 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2996-2998, Dec. 2020
Subject
Bioengineering
Computing and Processing
Signal Processing and Analysis
Feature extraction
Stability criteria
Correlation
Training
Biology
Information filters
Cancer
microarray data
feature selection
ensemble learning
stability
classification
Language
Abstract
Feature selection plays an important role in analyzing high-dimensional, low-sample-size gene expression profiles, both for achieving high disease classification performance and for understanding the underlying biological mechanisms. Beyond classification performance, the stability of the selected features is another important factor in evaluating a feature selector, since stable selection results increase confidence that the selected features are true biomarkers worth further biological validation. In this study, we propose a novel feature selection method under the ensemble learning framework. Specifically, we use the Fast Correlation Based Filter as the base feature selector to analyze subsamples of microarray data. We then present several aggregation methods for combining the resulting feature subsets. Finally, two stability measures are used to quantify the robustness of feature selectors to data variations. Our comparative empirical study on publicly available datasets shows that the proposed methods outperform their competitors in both stability scores and classification accuracy.
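
The pipeline described in the abstract (subsample the data, run a base selector on each subsample, aggregate the resulting feature subsets, then score stability) can be illustrated with a minimal sketch. Everything specific here is an assumption and not taken from the paper: the base selector is a simple top-k ranking by absolute Pearson correlation standing in for the Fast Correlation Based Filter, the aggregation is selection-frequency voting, and stability is the mean pairwise Jaccard index. The abstract does not say which aggregation methods or stability measures the authors use, and the function names and parameters (`base_selector`, `ensemble_select`, `mean_jaccard`, `n_rounds`, `frac`) are hypothetical.

```python
import random
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def base_selector(X, y, k):
    """Stand-in for FCBF (assumption): keep the k features most correlated with y."""
    n_features = len(X[0])
    scores = [abs(pearson([row[j] for row in X], y)) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return set(ranked[:k])


def ensemble_select(X, y, k, n_rounds=20, frac=0.8, seed=0):
    """Run the base selector on random subsamples and aggregate by selection frequency."""
    rng = random.Random(seed)
    n = len(X)
    m = max(2, int(frac * n))
    subsets = []
    for _ in range(n_rounds):
        idx = rng.sample(range(n), m)  # subsample rows without replacement
        subsets.append(base_selector([X[i] for i in idx], [y[i] for i in idx], k))
    counts = {}
    for subset in subsets:
        for j in subset:
            counts[j] = counts.get(j, 0) + 1
    # Most frequently selected features first; ties broken by feature index.
    final = sorted(counts, key=lambda j: (-counts[j], j))[:k]
    return final, subsets


def mean_jaccard(subsets):
    """Mean pairwise Jaccard index of the per-subsample feature sets (1.0 = identical)."""
    pairs = list(combinations(subsets, 2))
    return sum(len(a & b) / len(a | b) for a, b in pairs) / len(pairs)
```

Calling `ensemble_select(X, y, k)` on a list-of-rows matrix `X` with labels `y` returns the aggregated feature indices together with the per-subsample subsets, and the latter can be passed to `mean_jaccard` to get a stability score between 0 and 1. Frequency voting is chosen here only because it is simple to follow; other aggregation rules, such as rank averaging, would slot into the same structure.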