Academic Paper

Relating enhancer genetic variation across mammals to complex phenotypes using machine learning
Document Type
article
Author
Kaplow, Irene M; Lawler, Alyssa J; Schäffer, Daniel E; Srinivasan, Chaitanya; Sestili, Heather H; Wirthlin, Morgan E; Phan, BaDoi N; Prasad, Kavya; Brown, Ashley R; Zhang, Xiaomeng; Foley, Kathleen; Genereux, Diane P; Karlsson, Elinor K; Lindblad-Toh, Kerstin; Meyer, Wynn K; Pfenning, Andreas R; Andrews, Gregory; Armstrong, Joel C; Bianchi, Matteo; Birren, Bruce W; Bredemeyer, Kevin R; Breit, Ana M; Christmas, Matthew J; Clawson, Hiram; Damas, Joana; Di Palma, Federica; Diekhans, Mark; Dong, Michael X; Eizirik, Eduardo; Fan, Kaili; Fanter, Cornelia; Foley, Nicole M; Forsberg-Nilsson, Karin; Garcia, Carlos J; Gatesy, John; Gazal, Steven; Goodman, Linda; Grimshaw, Jenna; Halsey, Michaela K; Harris, Andrew J; Hickey, Glenn; Hiller, Michael; Hindle, Allyson G; Hubley, Robert M; Hughes, Graham M; Johnson, Jeremy; Juan, David; Keough, Kathleen C; Kirilenko, Bogdan; Koepfli, Klaus-Peter; Korstian, Jennifer M; Kowalczyk, Amanda; Kozyrev, Sergey V; Lawless, Colleen; Lehmann, Thomas; Levesque, Danielle L; Lewin, Harris A; Li, Xue; Lind, Abigail; Mackay-Smith, Ava; Marinescu, Voichita D; Marques-Bonet, Tomas; Mason, Victor C; Meadows, Jennifer RS; Moore, Jill E; Moreira, Lucas R; Moreno-Santillan, Diana D; Morrill, Kathleen M; Muntané, Gerard; Murphy, William J; Navarro, Arcadi; Nweeia, Martin; Ortmann, Sylvia; Osmanski, Austin; Paten, Benedict; Paulat, Nicole S; Pollard, Katherine S; Pratt, Henry E; Ray, David A; Reilly, Steven K; Rosen, Jeb R; Ruf, Irina; Ryan, Louise; Ryder, Oliver A; Sabeti, Pardis C; Serres, Aitor; Shapiro, Beth; Smit, Arian FA; Springer, Mark; Steiner, Cynthia
Source
Science. 380(6643)
Subject
Biological Sciences
Genetics
Pediatric
Human Genome
Rare Diseases
Congenital Structural Anomalies
Underpinning research
1.1 Normal biological development and functioning
Neurological
Mental health
Animals
Enhancer Elements
Genetic
Genetic Variation
Machine Learning
Mammals
Phenotype
Zoonomia Consortium
General Science & Technology
Language
Abstract
Protein-coding differences between species often fail to explain phenotypic diversity, suggesting the involvement of genomic elements that regulate gene expression such as enhancers. Identifying associations between enhancers and phenotypes is challenging because enhancer activity can be tissue-dependent and functionally conserved despite low sequence conservation. We developed the Tissue-Aware Conservation Inference Toolkit (TACIT) to associate candidate enhancers with species' phenotypes using predictions from machine learning models trained on specific tissues. Applying TACIT to associate motor cortex and parvalbumin-positive interneuron enhancers with neurological phenotypes revealed dozens of enhancer-phenotype associations, including brain size-associated enhancers that interact with genes implicated in microcephaly or macrocephaly. TACIT provides a foundation for identifying enhancers associated with the evolution of any convergently evolved phenotype in any large group of species with aligned genomes.