Academic Paper

ProFAB—open protein functional annotation benchmark.
Document Type
Article
Source
Briefings in Bioinformatics. Mar 2023, Vol. 24, Issue 2, pp. 1-13. 13 pp.
Subject
*MACHINE learning
*BIOLOGICAL databases
*AMINO acid sequence
*PROTEINS
*ANNOTATIONS
Language
English
ISSN
1467-5463
Abstract
As the number of protein sequences increases in biological databases, computational methods are required to provide accurate functional annotation with high coverage. Although several machine learning methods have been proposed for this purpose, two main issues remain: (i) the construction of reliable positive and negative training and validation datasets, and (ii) the fair evaluation of method performance under predefined experimental settings. To address these issues, we have developed ProFAB: Open Protein Functional Annotation Benchmark, a platform that provides an infrastructure for the fair comparison of protein function prediction methods. ProFAB provides filtered and preprocessed protein annotation datasets and enables the training and evaluation of function prediction methods via several options. We believe that ProFAB will be useful for both computational and experimental researchers by enabling the use of ready-to-use datasets and machine learning algorithms for protein function prediction based on Gene Ontology terms and Enzyme Commission numbers. ProFAB is available at https://github.com/kansil/ProFAB and https://profab.kansil.org. [ABSTRACT FROM AUTHOR]