Academic Paper

Identifying Relevant Features for a Multi-factorial Disorder with Constraint-Based Subspace Clustering
Document Type
Conference
Source
2016 IEEE 29th International Symposium on Computer-Based Medical Systems (CBMS), pp. 207-212, Jun. 2016
Subject
Communication, Networking and Broadcast Technologies
Computing and Processing
Robotics and Control Systems
Signal Processing and Analysis
Clustering algorithms
Merging
Prediction algorithms
Cleaning
Electronic mail
Data mining
Marine vehicles
medical data mining
patient similarity
feature selection
classification
epidemiological studies
hepatic steatosis
Language
ISSN
2372-9198
Abstract
The identification of predictive features associated with distinct medical outcomes is a key requirement for meaningful clinical decision support. Usually, such features are discovered from sets of labeled examples by analyzing how much information each feature carries with respect to the target variable. However, obtaining large sets of labeled examples may not be feasible, and considering only the label could even dilute characteristics unique to distinct subgroups. In such cases, instead of considering the value of the target variable, expert knowledge on the similarity between examples could be utilized. In this work, we propose a new algorithm for the "Discovery of Relevant Example-constrained Subspaces" (DRESS), which uses constraints on the similarity between examples to discover feature sets that describe a target concept. DRESS exploits the density of clusters and the distance behavior between constrained examples to evaluate the quality of a feature set without requiring explicit information about the target variable. We evaluate DRESS against classical feature selection methods on cohort participants for the disorder "hepatic steatosis", and report our findings on classifier performance and on the features identified as important.
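The idea of judging a feature set by the distance behavior of constrained examples can be illustrated with a minimal sketch. This is not the DRESS quality function, which the abstract does not specify; the function name `constraint_score`, the must-link/cannot-link pair representation, the min-max scaling, and the scoring formula are all assumptions made for illustration, and the cluster-density component mentioned in the abstract is omitted.

```python
import numpy as np


def constraint_score(X, features, must_link, cannot_link):
    """Hypothetical score for a feature subset (NOT the DRESS quality function).

    Higher is better: pairs an expert marked as dissimilar (cannot-link)
    should be far apart, and pairs marked as similar (must-link) should be
    close, in the subspace spanned by `features`.
    """
    sub = np.asarray(X, dtype=float)[:, list(features)]

    # Min-max scale each selected feature so no feature dominates the distance.
    lo = sub.min(axis=0)
    span = sub.max(axis=0) - lo
    span[span == 0] = 1.0  # constant features: avoid division by zero
    sub = (sub - lo) / span

    def mean_dist(pairs):
        if not pairs:
            return 0.0
        i, j = np.array(pairs).T
        dists = np.linalg.norm(sub[i] - sub[j], axis=1)
        # Divide by sqrt(d) so subspaces of different size stay comparable.
        return float(dists.mean() / np.sqrt(sub.shape[1]))

    return mean_dist(cannot_link) - mean_dist(must_link)


# Toy data (made up): feature 0 agrees with the constraints, feature 1 does not.
X = np.array([
    [0.0, 5.0],
    [0.1, 1.0],
    [1.0, 4.0],
    [0.9, 0.0],
])
must_link = [(0, 1), (2, 3)]    # pairs judged similar
cannot_link = [(0, 2), (1, 3)]  # pairs judged dissimilar

score_f0 = constraint_score(X, [0], must_link, cannot_link)
score_f1 = constraint_score(X, [1], must_link, cannot_link)
```

On this toy data, the subspace containing only feature 0 scores higher than the one containing only feature 1, because feature 0 keeps the must-link pairs close and the cannot-link pairs apart. No target labels are used at any point, which is the property the abstract emphasizes.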