
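Probabilistic approach to the reliability analysis of data generated in multiplex sequencing runs on the ABI SOLiD platform (original title: Abordagem probabilística para análise de confiabilidade de dados gerados em sequenciamentos multiplex na plataforma ABI SOLiD)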
Document Type
Working Paper
Quantitative Biology - Genomics
Computer Science - Computational Engineering, Finance, and Science
Next-generation sequencers such as the Illumina and SOLiD platforms generate large amounts of data, commonly more than 10 gigabytes of text files. The SOLiD platform in particular allows multiple samples to be sequenced in a single run, called a multiplex run, through a tagging system called Barcode. This feature requires a computational step that separates the data by sample, because the sequencer delivers a mixture of all samples in a single output. This separation must be reliable, since misassigned reads can compromise all subsequent analysis. In this context, we identified the need for a probabilistic model capable of assigning a degree of confidence to the tagging system used in multiplex sequencing. The results confirmed the adequacy of the resulting model, which can, among other things, guide data filtering and the evaluation of the sequencing protocol used.
Comment: 8 pages, 4 figures, 2 tables. Published in Portuguese in the proceedings (Anais) of the XLIII Simpósio Brasileiro de Pesquisa Operacional (SBPO 2011), 2011. URL: http://www.din.uem.br/sbpo/sbpo2011/pdf/87903.pdf