Academic Article
The EN-TEx resource of multi-tissue personal epigenomes & variant-impact models.
Document Type
article
Author
Rozowsky, Joel; Gao, Jiahao; Borsari, Beatrice; Yang, Yucheng; Galeev, Timur; Gürsoy, Gamze; Epstein, Charles; Xiong, Kun; Xu, Jinrui; Li, Tianxiao; Liu, Jason; Yu, Keyang; Berthel, Ana; Chen, Zhanlin; Navarro, Fabio; Sun, Maxwell; Wright, James; Chang, Justin; Cameron, Christopher; Shoresh, Noam; Gaskell, Elizabeth; Drenkow, Jorg; Adrian, Jessika; Aganezov, Sergey; Aguet, François; Balderrama-Gutierrez, Gabriela; Banskota, Samridhi; Corona, Guillermo; Chee, Sora; Chhetri, Surya; Cortez Martins, Gabriel; Danyko, Cassidy; Davis, Carrie; Farid, Daniel; Farrell, Nina; Gabdank, Idan; Gofin, Yoel; Gorkin, David; Gu, Mengting; Hecht, Vivian; Hitz, Benjamin; Issner, Robbyn; Jiang, Yunzhe; Kirsche, Melanie; Kong, Xiangmeng; Lam, Bonita; Li, Shantao; Li, Bian; Li, Xiqi; Lin, Khine; Luo, Ruibang; Mackiewicz, Mark; Meng, Ran; Moore, Jill; Mudge, Jonathan; Nelson, Nicholas; Nusbaum, Chad; Popov, Ioann; Pratt, Henry; Qiu, Yunjiang; Ramakrishnan, Srividya; Raymond, Joe; Salichos, Leonidas; Scavelli, Alexandra; Schreiber, Jacob; Sedlazeck, Fritz; See, Lei; Sherman, Rachel; Shi, Xu; Shi, Minyi; Sloan, Cricket; Strattan, J; Tan, Zhen; Tanaka, Forrest; Vlasova, Anna; Wang, Jun; Werner, Jonathan; Williams, Brian; Xu, Min; Yan, Chengfei; Yu, Lu; Zaleski, Christopher; Zhang, Jing; Ardlie, Kristin; Cherry, J; Mendenhall, Eric; Noble, William; Weng, Zhiping; Levine, Morgan; Dobin, Alexander; Wold, Barbara; Mortazavi, Ali; Ren, Bing; Gillis, Jesse; Myers, Richard; Choudhary, Jyoti; Milosavljevic, Aleksandar; Schatz, Michael; Bernstein, Bradley; Guigó, Roderic
Source
Cell. 186(7)
Abstract
Understanding how genetic variants impact molecular phenotypes is a key goal of functional genomics, currently hindered by reliance on a single haploid reference genome. Here, we present the EN-TEx resource of 1,635 open-access datasets from four donors (∼30 tissues × ∼15 assays). The datasets are mapped to matched, diploid genomes with long-read phasing and structural variants, instantiating a catalog of >1 million allele-specific loci. These loci exhibit coordinated activity along haplotypes and are less conserved than corresponding, non-allele-specific ones. Surprisingly, a deep-learning transformer model can predict the allele-specific activity based only on local nucleotide-sequence context, highlighting the importance of transcription-factor-binding motifs particularly sensitive to variants. Furthermore, combining EN-TEx with existing genome annotations reveals strong associations between allele-specific and GWAS loci. It also enables models for transferring known eQTLs to difficult-to-profile tissues (e.g., from skin to heart). Overall, EN-TEx provides rich data and generalizable models for more accurate personal functional genomics.