Academic article

Integrating gene annotation with orthology inference at scale
Document Type
article
Author
Kirilenko, Bogdan M; Munegowda, Chetan; Osipova, Ekaterina; Jebb, David; Sharma, Virag; Blumer, Moritz; Morales, Ariadna E; Ahmed, Alexis-Walid; Kontopoulos, Dimitrios-Georgios; Hilgers, Leon; Lindblad-Toh, Kerstin; Karlsson, Elinor K; Hiller, Michael; Andrews, Gregory; Armstrong, Joel C; Bianchi, Matteo; Birren, Bruce W; Bredemeyer, Kevin R; Breit, Ana M; Christmas, Matthew J; Clawson, Hiram; Damas, Joana; Di Palma, Federica; Diekhans, Mark; Dong, Michael X; Eizirik, Eduardo; Fan, Kaili; Fanter, Cornelia; Foley, Nicole M; Forsberg-Nilsson, Karin; Garcia, Carlos J; Gatesy, John; Gazal, Steven; Genereux, Diane P; Goodman, Linda; Grimshaw, Jenna; Halsey, Michaela K; Harris, Andrew J; Hickey, Glenn; Hindle, Allyson G; Hubley, Robert M; Hughes, Graham M; Johnson, Jeremy; Juan, David; Kaplow, Irene M; Keough, Kathleen C; Kirilenko, Bogdan; Koepfli, Klaus-Peter; Korstian, Jennifer M; Kowalczyk, Amanda; Kozyrev, Sergey V; Lawler, Alyssa J; Lawless, Colleen; Lehmann, Thomas; Levesque, Danielle L; Lewin, Harris A; Li, Xue; Lind, Abigail; Mackay-Smith, Ava; Marinescu, Voichita D; Marques-Bonet, Tomas; Mason, Victor C; Meadows, Jennifer RS; Meyer, Wynn K; Moore, Jill E; Moreira, Lucas R; Moreno-Santillan, Diana D; Morrill, Kathleen M; Muntané, Gerard; Murphy, William J; Navarro, Arcadi; Nweeia, Martin; Ortmann, Sylvia; Osmanski, Austin; Paten, Benedict; Paulat, Nicole S; Pfenning, Andreas R; Phan, BaDoi N; Pollard, Katherine S; Pratt, Henry E; Ray, David A; Reilly, Steven K; Rosen, Jeb R; Ruf, Irina; Ryan, Louise; Ryder, Oliver A; Sabeti, Pardis C; Schäffer, Daniel E; Serres, Aitor; Shapiro, Beth; Smit, Arian FA; Springer, Mark; Srinivasan, Chaitanya; Steiner, Cynthia; Storer, Jessica M; Sullivan, Kevin AM; Sullivan, Patrick F
Source
Science. 380(6643)
Subject
Biological Sciences
Bioinformatics and Computational Biology
Genetics
Biotechnology
Human Genome
Generic health relevance
Animals
Female
Mice
Eutheria
Genome
Genomics
Molecular Sequence Annotation
Birds
Zoonomia Consortium
General Science & Technology
Language
Abstract
Annotating coding genes and inferring orthologs are two classical challenges in genomics and evolutionary biology that have traditionally been approached separately, limiting scalability. We present TOGA (Tool to infer Orthologs from Genome Alignments), a method that integrates structural gene annotation and orthology inference. TOGA implements a different paradigm to infer orthologous loci, improves ortholog detection and annotation of conserved genes compared with state-of-the-art methods, and handles even highly fragmented assemblies. TOGA scales to hundreds of genomes, which we demonstrate by applying it to 488 placental mammal and 501 bird assemblies, creating the largest comparative gene resources so far. Additionally, TOGA detects gene losses, enables selection screens, and automatically provides a superior measure of mammalian genome quality. TOGA is a powerful and scalable method to annotate and compare genes in the genomic era.