학술논문

Snagger: A user-friendly program for incorporating additional information for tagSNP selection.
Document Type
Article
Source
BMC Bioinformatics. 2008, Vol. 9, Special section p1-13. 13p. 4 Color Photographs, 1 Diagram, 1 Chart, 1 Graph.
Subject
*OPEN source software
*COMPUTER software
*GENETIC polymorphisms
*GENOMICS
*ALGORITHMS
Language
ISSN
1471-2105
Abstract
Background: There has been considerable effort focused on developing efficient programs for tagging single-nucleotide polymorphisms (SNPs). Many of these programs do not account for potential reduced genomic coverage resulting from genotyping failures nor do they preferentially select SNPs based on functionality, which may be more likely to be biologically important. Results: We have developed a user-friendly and efficient software program, Snagger, as an extension to the existing open-source software, Haploview, which uses pairwise r² linkage disequilibrium between single nucleotide polymorphisms (SNPs) to select tagSNPs. Snagger distinguishes itself from existing SNP selection algorithms, including Tagger, by providing user options that allow for: (1) prioritization of tagSNPs based on certain characteristics, including platform-specific design scores, functionality (i.e., coding status), and chromosomal position, (2) efficient selection of SNPs across multiple populations, (3) selection of tagSNPs outside defined genomic regions to improve coverage and genotyping success, and (4) picking of surrogate tagSNPs that serve as backups for tagSNPs whose failure would result in a significant loss of data. Using HapMap genotype data from ten ENCODE regions and design scores for the Illumina platform, we show similar coverage and design score distribution and fewer total tagSNPs selected by Snagger compared to the web server Tagger. Conclusion: Snagger improves upon current available tagSNP software packages by providing a means for researchers to select tagSNPs that reliably capture genetic variation across multiple populations while accounting for significant genotyping failure risk and prioritizing on SNP-specific characteristics. [ABSTRACT FROM AUTHOR]