Academic Journal Article

TADA: taxonomy-aware dataset aggregator.

Document Type

Article

Author

Hägglund, Emil; Andersson, Siv G E; Guy, Lionel

Source

Bioinformatics. Dec 2023, Vol. 39, Issue 12, pp. 1-4. 4 pages.

Subject

*GENOMICS
*BIODIVERSITY
*GENOMES
*PHYLOGENY
*WORKFLOW
*INTERNET servers

Language

ISSN

1367-4803

Abstract

Summary: The profusion of sequenced genomes across the bacterial and archaeal domains offers unprecedented possibilities for phylogenetic and comparative genomic analyses. In general, phylogenetic reconstruction improves with more data. However, including all available data is (i) not computationally tractable and (ii) prone to bias, because genomes are very unevenly distributed across biological diversity. Thus, in most cases, subsampling taxa is necessary to build a phylogeny. Currently, though, no available software performs this task conveniently. Here we present TADA, a taxonomy-aware dataset selection workflow that allows sampling across user-defined portions of prokaryotic diversity with variable granularity, while setting constraints on genome quality and balance between branches.

Availability and implementation: TADA is implemented as a Snakemake workflow and is freely available at https://github.com/emilhaegglund/TADA. [ABSTRACT FROM AUTHOR]
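
The sampling idea in the abstract (filter on genome quality, then draw a balanced number of genomes per taxon at a chosen rank) can be illustrated with a minimal Python sketch. This is not TADA's code or interface: the function name `subsample_by_taxon`, the record fields, the thresholds, and the example genomes are all hypothetical, chosen only to show the general technique.

```python
import random
from collections import defaultdict

# Hypothetical genome records; accessions and quality values are made up.
GENOMES = [
    {"accession": "genome_1", "taxonomy": {"phylum": "Pseudomonadota"},
     "completeness": 99.1, "contamination": 0.4},
    {"accession": "genome_2", "taxonomy": {"phylum": "Pseudomonadota"},
     "completeness": 97.5, "contamination": 1.2},
    {"accession": "genome_3", "taxonomy": {"phylum": "Pseudomonadota"},
     "completeness": 62.0, "contamination": 0.9},   # fails completeness
    {"accession": "genome_4", "taxonomy": {"phylum": "Bacillota"},
     "completeness": 95.0, "contamination": 2.0},
    {"accession": "genome_5", "taxonomy": {"phylum": "Euryarchaeota"},
     "completeness": 98.0, "contamination": 12.0},  # fails contamination
]


def subsample_by_taxon(genomes, rank, per_taxon,
                       min_completeness=90.0, max_contamination=5.0, seed=0):
    """Return up to `per_taxon` quality-filtered genomes for each taxon at `rank`.

    Thresholds are illustrative defaults, not values taken from TADA.
    """
    rng = random.Random(seed)  # seeded so repeated runs give the same sample

    # Step 1: apply the quality constraints and group survivors by taxon.
    groups = defaultdict(list)
    for genome in genomes:
        if (genome["completeness"] >= min_completeness
                and genome["contamination"] <= max_contamination):
            groups[genome["taxonomy"][rank]].append(genome)

    # Step 2: draw the same maximum number from every taxon so that
    # heavily sequenced lineages do not dominate the dataset.
    selected = []
    for taxon in sorted(groups):
        members = groups[taxon]
        selected.extend(rng.sample(members, min(per_taxon, len(members))))
    return selected


if __name__ == "__main__":
    for genome in subsample_by_taxon(GENOMES, rank="phylum", per_taxon=2):
        print(genome["accession"], genome["taxonomy"]["phylum"])
```

Changing `rank` (for example to a finer level, if the records carried one) is one way to vary the granularity of the sampling, while `per_taxon` controls the balance between branches. Taxa with no genome passing the quality filter are simply absent from the result.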
