Academic article

A large-scale evaluation of computational protein function prediction
Document Type
article
Author
Radivojac, Predrag; Clark, Wyatt T; Oron, Tal Ronnen; Schnoes, Alexandra M; Wittkop, Tobias; Sokolov, Artem; Graim, Kiley; Funk, Christopher; Verspoor, Karin; Ben-Hur, Asa; Pandey, Gaurav; Yunes, Jeffrey M; Talwalkar, Ameet S; Repo, Susanna; Souza, Michael L; Piovesan, Damiano; Casadio, Rita; Wang, Zheng; Cheng, Jianlin; Fang, Hai; Gough, Julian; Koskinen, Patrik; Törönen, Petri; Nokso-Koivisto, Jussi; Holm, Liisa; Cozzetto, Domenico; Buchan, Daniel WA; Bryson, Kevin; Jones, David T; Limaye, Bhakti; Inamdar, Harshal; Datta, Avik; Manjari, Sunitha K; Joshi, Rajendra; Chitale, Meghana; Kihara, Daisuke; Lisewski, Andreas M; Erdin, Serkan; Venner, Eric; Lichtarge, Olivier; Rentzsch, Robert; Yang, Haixuan; Romero, Alfonso E; Bhat, Prajwal; Paccanaro, Alberto; Hamp, Tobias; Kaßner, Rebecca; Seemayer, Stefan; Vicedo, Esmeralda; Schaefer, Christian; Achten, Dominik; Auer, Florian; Boehm, Ariane; Braun, Tatjana; Hecht, Maximilian; Heron, Mark; Hönigschmid, Peter; Hopf, Thomas A; Kaufmann, Stefanie; Kiening, Michael; Krompass, Denis; Landerer, Cedric; Mahlich, Yannick; Roos, Manfred; Björne, Jari; Salakoski, Tapio; Wong, Andrew; Shatkay, Hagit; Gatzmann, Fanny; Sommer, Ingolf; Wass, Mark N; Sternberg, Michael JE; Škunca, Nives; Supek, Fran; Bošnjak, Matko; Panov, Panče; Džeroski, Sašo; Šmuc, Tomislav; Kourmpetis, Yiannis AI; van Dijk, Aalt DJ; Braak, Cajo JF ter; Zhou, Yuanpeng; Gong, Qingtian; Dong, Xinran; Tian, Weidong; Falda, Marco; Fontana, Paolo; Lavezzo, Enrico; Di Camillo, Barbara; Toppo, Stefano; Lan, Liang; Djuric, Nemanja; Guo, Yuhong; Vucetic, Slobodan; Bairoch, Amos; Linial, Michal; Babbitt, Patricia C; Brenner, Steven E; Orengo, Christine; Rost, Burkhard
Source
Nature Methods. 10(3)
Subject
Generic health relevance
Algorithms
Animals
Computational Biology
Databases, Protein
Exoribonucleases
Forecasting
Humans
Molecular Biology
Molecular Sequence Annotation
Proteins
Species Specificity
Biological Sciences
Technology
Medical and Health Sciences
Developmental Biology
Language
Abstract
Automated annotation of protein function is challenging. As the number of sequenced genomes rapidly grows, the overwhelming majority of protein products can only be annotated computationally. If computational predictions are to be relied upon, it is crucial that the accuracy of these methods be assessed. Here we report the results from the first large-scale community-based critical assessment of protein function annotation (CAFA) experiment. Fifty-four methods representing the state of the art for protein function prediction were evaluated on a target set of 866 proteins from 11 organisms. Two findings stand out: (i) today's best protein function prediction algorithms substantially outperform widely used first-generation methods, with large gains on all types of targets; and (ii) although the top methods perform well enough to guide experiments, there is considerable need for improvement of currently available tools.