Academic article

Critical Assessment of Metagenome Interpretation: the second round of challenges
Document Type
article
Author
Meyer, Fernando; Fritz, Adrian; Deng, Zhi-Luo; Koslicki, David; Lesker, Till Robin; Gurevich, Alexey; Robertson, Gary; Alser, Mohammed; Antipov, Dmitry; Beghini, Francesco; Bertrand, Denis; Brito, Jaqueline J; Brown, C Titus; Buchmann, Jan; Buluç, Aydin; Chen, Bo; Chikhi, Rayan; Clausen, Philip TLC; Cristian, Alexandru; Dabrowski, Piotr Wojciech; Darling, Aaron E; Egan, Rob; Eskin, Eleazar; Georganas, Evangelos; Goltsman, Eugene; Gray, Melissa A; Hansen, Lars Hestbjerg; Hofmeyr, Steven; Huang, Pingqin; Irber, Luiz; Jia, Huijue; Jørgensen, Tue Sparholt; Kieser, Silas D; Klemetsen, Terje; Kola, Axel; Kolmogorov, Mikhail; Korobeynikov, Anton; Kwan, Jason; LaPierre, Nathan; Lemaitre, Claire; Li, Chenhao; Limasset, Antoine; Malcher-Miranda, Fabio; Mangul, Serghei; Marcelino, Vanessa R; Marchet, Camille; Marijon, Pierre; Meleshko, Dmitry; Mende, Daniel R; Milanese, Alessio; Nagarajan, Niranjan; Nissen, Jakob; Nurk, Sergey; Oliker, Leonid; Paoli, Lucas; Peterlongo, Pierre; Piro, Vitor C; Porter, Jacob S; Rasmussen, Simon; Rees, Evan R; Reinert, Knut; Renard, Bernhard; Robertsen, Espen Mikal; Rosen, Gail L; Ruscheweyh, Hans-Joachim; Sarwal, Varuni; Segata, Nicola; Seiler, Enrico; Shi, Lizhen; Sun, Fengzhu; Sunagawa, Shinichi; Sørensen, Søren Johannes; Thomas, Ashleigh; Tong, Chengxuan; Trajkovski, Mirko; Tremblay, Julien; Uritskiy, Gherman; Vicedomini, Riccardo; Wang, Zhengyang; Wang, Ziye; Wang, Zhong; Warren, Andrew; Willassen, Nils Peder; Yelick, Katherine; You, Ronghui; Zeller, Georg; Zhao, Zhengqiao; Zhu, Shanfeng; Zhu, Jie; Garrido-Oter, Ruben; Gastmeier, Petra; Hacquard, Stephane; Häußler, Susanne; Khaledi, Ariane; Maechler, Friederike; Mesny, Fantin; Radutoiu, Simona; Schulze-Lefert, Paul; Smit, Nathiana; Strowig, Till
Source
Nature Methods. 19(4)
Subject
Biological Sciences
Bioinformatics and Computational Biology
Infection
Archaea
Metagenome
Metagenomics
Reproducibility of Results
Sequence Analysis
DNA
Software
Technology
Medical and Health Sciences
Developmental Biology
Biological sciences
Language
Abstract
Evaluating metagenomic software is key for optimizing metagenome interpretation and is the focus of the Initiative for the Critical Assessment of Metagenome Interpretation (CAMI). The CAMI II challenge engaged the community to assess methods on realistic and complex datasets with long- and short-read sequences, created computationally from around 1,700 new and known genomes, as well as 600 new plasmids and viruses. Here we analyze 5,002 results by 76 program versions. Substantial improvements were seen in assembly, some due to long-read data. Related strains remained challenging for assembly and for genome recovery through binning, and assembly quality was a further challenge for binning. Profilers markedly matured, with taxon profilers and binners excelling at higher bacterial ranks but underperforming for viruses and Archaea. Clinical pathogen detection results revealed a need to improve reproducibility. Runtime and memory usage analyses identified efficient programs, including some that were also top performers on other metrics. The results identify challenges and guide researchers in selecting methods for analyses.