Academic Paper

MLDEG: A Machine Learning Approach to Identify Differentially Expressed Genes Using Network Property and Network Propagation
Document Type
Periodical
Source
IEEE/ACM Transactions on Computational Biology and Bioinformatics (IEEE/ACM Trans. Comput. Biol. and Bioinf.), 19(4):2356-2364, Aug. 2022
Subject
Bioengineering
Computing and Processing
Gene expression
Feature extraction
Tools
Machine learning
Correlation
Training data
Proteins
Differentially expressed gene
machine learning
network property
network propagation
Language
ISSN
1545-5963
1557-9964
2374-0043
Abstract
Motivation: Identifying differentially expressed genes (DEGs) in transcriptome data is an important task. However, the performance of existing DEG methods varies significantly across datasets measured under different conditions, and no single statistical or machine learning model for DEG detection performs consistently well on datasets of different traits. In addition, setting a cutoff value for the significance of differential expression is one of the confounding factors in determining DEGs. Results: We address these problems by developing an ensemble model that refines the heterogeneous and inconsistent results of the existing methods by taking into account network information such as network propagation and network property. DEG candidates that the existing tools predict with weak evidence are re-classified by our proposed ensemble model for the transcriptome data. Tested on 10 RNA-seq datasets downloaded from the Gene Expression Omnibus (GEO), our method ranked first in detecting ground truth (GT) genes in eight datasets and found almost all GT genes in six datasets. In contrast, the performance of all existing methods varied significantly across the 10 datasets. By design, our method can naturally accommodate new DEG methods. Availability: The source code of our method is available at https://github.com/jihmoon/MLDEG.
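Note on network propagation: the abstract names network propagation as one source of network information but does not say which algorithm is used. As a generic illustration only, and not the authors' implementation, the sketch below shows random walk with restart, one common form of network propagation, on a hypothetical four-gene path network. The function name, the restart probability, and the toy network are all assumptions made for this example.

```python
import numpy as np


def random_walk_with_restart(adjacency, seed_scores, restart_prob=0.5,
                             tol=1e-10, max_iter=1000):
    """Spread seed scores over a network by random walk with restart.

    Generic illustration; NOT taken from the MLDEG source code.
    """
    adjacency = np.asarray(adjacency, dtype=float)
    # Column-normalize so each column holds transition probabilities.
    col_sums = adjacency.sum(axis=0)
    col_sums[col_sums == 0] = 1.0  # leave isolated nodes as all-zero columns
    transition = adjacency / col_sums

    p0 = np.asarray(seed_scores, dtype=float)
    if p0.sum() == 0:
        raise ValueError("at least one seed score must be non-zero")
    p0 = p0 / p0.sum()

    p = p0.copy()
    for _ in range(max_iter):
        # Walk one step with prob. (1 - r), jump back to the seeds with prob. r.
        p_next = (1 - restart_prob) * (transition @ p) + restart_prob * p0
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p


# Hypothetical toy network: a path A - B - C - D, with gene A as the only seed
# (for example, a gene that existing tools call as a DEG with strong evidence).
adjacency = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])
seeds = np.array([1.0, 0.0, 0.0, 0.0])

scores = random_walk_with_restart(adjacency, seeds)
for gene, score in zip("ABCD", scores):
    print(f"{gene}: {score:.4f}")
```

In this toy setting the propagated score falls off with distance from the seed gene, which is the intuition behind using propagated scores as a network-derived feature for weakly supported candidates.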