학술논문

New algorithms for multi-class cancer diagnosis using tumor gene expression signatures
Document Type
Academic Journal
Source
Bioinformatics. Sep 01, 2003 19(14):1800-1807
Subject
Language
English
ISSN
1367-4803
Abstract
MOTIVATION: The increasing use of DNA microarray-based tumor gene expression profiles for cancer diagnosis requires mathematical methods with high accuracy for solving clustering, feature selection and classification problems of gene expression data. RESULTS: New algorithms are developed for solving clustering, feature selection and classification problems of gene expression data. The clustering algorithm is based on optimization techniques and allows the calculation of clusters step-by-step. This approach allows us to find as many clusters as a data set contains with respect to some tolerance. Feature selection is crucial for a gene expression database. Our feature selection algorithm is based on calculating overlaps of different genes. The database used, contains over 16 000 genes and this number is considerably reduced by feature selection. We propose a classification algorithm where each tissue sample is considered as the center of a cluster which is a ball. The results of numerical experiments confirm that the classification algorithm in combination with the feature selection algorithm perform slightly better than the published results for multi-class classifiers based on support vector machines for this data set. AVAILABILITY: Available on request from the authors. CONTACT