Academic Paper
Identifying Relevant Features for a Multi-factorial Disorder with Constraint-Based Subspace Clustering
Document Type
Conference
Source
2016 IEEE 29th International Symposium on Computer-Based Medical Systems (CBMS), pp. 207-212, Jun 2016
ISSN
2372-9198
Abstract
The identification of predictive features associated with distinct medical outcomes is a key requirement for meaningful clinical decision support. Usually, their discovery is based on sets of labeled examples and an analysis of the information the features carry with respect to the target variable. However, obtaining large sets of labeled examples may not be feasible, and considering the label alone could even dilute characteristics unique to distinct subgroups. In such cases, instead of considering the value of the target variable, expert knowledge on the similarity between examples could be utilized. In this work, we propose a new algorithm for the "Discovery of Relevant Example-constrained Subspaces" (DRESS), which uses constraints on the similarity between examples to discover feature sets that describe a target concept. DRESS exploits the density of clusters and the distance behavior between constrained examples to evaluate the quality of a feature set without requiring explicit information about the target variable. We evaluate DRESS against classical feature selection methods on cohort participants for the disorder "hepatic steatosis" and report our findings on classifier performance and identified important features.
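The abstract does not spell out how DRESS scores a feature set, so the following is only a toy illustration of the general idea of rating a subspace from similarity constraints instead of labels. It is not the DRESS algorithm. It assumes that constraints are given as pairs of similar ("must-link") and dissimilar ("cannot-link") examples, it ignores cluster density entirely, and all function names and data are hypothetical.

```python
from itertools import combinations
from math import sqrt


def subspace_distance(a, b, features):
    """Euclidean distance between rows a and b restricted to `features`,
    normalized by the number of features so subspaces of different
    sizes are comparable."""
    return sqrt(sum((a[f] - b[f]) ** 2 for f in features) / len(features))


def constraint_score(X, features, must_link, cannot_link):
    """Hypothetical quality measure: mean distance between dissimilar
    pairs minus mean distance between similar pairs, both measured in
    the subspace. Higher is better. No target variable is used."""
    ml = [subspace_distance(X[i], X[j], features) for i, j in must_link]
    cl = [subspace_distance(X[i], X[j], features) for i, j in cannot_link]
    return sum(cl) / len(cl) - sum(ml) / len(ml)


def best_subspace(X, must_link, cannot_link, max_size=2):
    """Exhaustively score every feature subset up to `max_size`.
    Only practical for a handful of features."""
    n_features = len(X[0])
    candidates = (
        subset
        for size in range(1, max_size + 1)
        for subset in combinations(range(n_features), size)
    )
    return max(
        candidates,
        key=lambda s: constraint_score(X, s, must_link, cannot_link),
    )


# Made-up toy data: feature 0 separates the two groups, feature 1 is noise.
X = [
    [0.0, 0.9],
    [0.1, 0.1],
    [1.0, 0.8],
    [0.9, 0.2],
]
must_link = [(0, 1), (2, 3)]      # pairs an expert considers similar
cannot_link = [(0, 2), (1, 3)]    # pairs an expert considers dissimilar

best = best_subspace(X, must_link, cannot_link)
print(best)
```

On this toy data the subspace containing only feature 0 scores highest, because similar pairs lie close together and dissimilar pairs lie far apart along that feature. A real method would also need a search strategy that scales beyond exhaustive enumeration.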