Academic Article

Can We Convert Genotype Sequences Into Images for Cases/Controls Classification?
Document Type
article
Source
Frontiers in Bioinformatics, Vol 2 (2022)
Subject
genotype-phenotype prediction
genetics
bioinformatics
applied machine learning
image classification
Computer applications to medicine. Medical informatics
R858-859.7
Language
English
ISSN
2673-7647
Abstract
Converting genotype sequences into images offers several advantages: it supports visualization of genotype data, classification, and comparison of genotype sequences. This study converted genotype sequences into images, applied a two-dimensional convolutional neural network (2DCNN) for case/control classification, and compared the results with those of a one-dimensional convolutional neural network (1DCNN). Surprisingly, averaged over multiple runs, the 2DCNN reached an accuracy of 0.86 and the 1DCNN reached 0.89, a difference of 0.03, which suggests that a 2DCNN can also work on genotype sequences. Moreover, the results of the 2DCNN varied less across runs than those of the 1DCNN, indicating greater stability. The purpose of this study is to encourage the research community to explore encoding schemes for genotype data, and machine learning algorithms that become applicable once the representation of the genotype data is changed, for case/control classification.
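
The abstract does not specify how a genotype sequence is laid out as an image, so the following is only a minimal sketch of one plausible encoding, not the authors' method. It assumes additive genotype coding (0, 1, or 2 copies of the minor allele), a row-major layout into the smallest square grid that fits the sequence, zero padding for unused cells, and a scaling factor of 127 to map genotypes to grayscale intensities. The function name `genotype_to_image` and the `pad_value` parameter are hypothetical.

```python
import math


def genotype_to_image(genotypes, pad_value=0):
    """Reshape a 1-D genotype vector into a square 2-D grid of pixel values.

    Assumptions (not taken from the article): genotypes are coded 0/1/2,
    pixels are filled row by row, and unused cells are padded.
    """
    n = len(genotypes)
    if n == 0:
        raise ValueError("genotype sequence is empty")
    if any(g not in (0, 1, 2) for g in genotypes):
        raise ValueError("genotypes must be coded as 0, 1, or 2")

    # Smallest square side length whose area holds all n genotypes.
    side = math.isqrt(n)
    if side * side < n:
        side += 1

    # Map 0/1/2 to grayscale intensities 0/127/254 (assumed scaling).
    pixels = [g * 127 for g in genotypes]
    # Pad the tail so the sequence fills the square exactly.
    pixels += [pad_value] * (side * side - n)

    # Cut the flat pixel list into rows of equal length.
    return [pixels[r * side:(r + 1) * side] for r in range(side)]


# Example: one sample with ten variants becomes a 4 x 4 grid.
image = genotype_to_image([0, 1, 2, 1, 0, 2, 2, 1, 0, 1])
for row in image:
    print(row)
```

A grid like this, one per sample, could then be stacked into a batch and passed to any 2D convolutional classifier, whereas the 1DCNN baseline would consume the unreshaped sequence directly.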