Comparative analysis of algorithms for whole-genome assembly of pyrosequencing data

Fontana, Paolo;
2012-01-01

Abstract

Next-generation sequencing technologies have fostered an unprecedented proliferation of high-throughput sequencing projects and a concomitant development of novel algorithms for the assembly of short reads. In this context, an important issue is the need for a careful assessment of the accuracy of the assembly process. Here, we review the efficiency of a panel of assemblers, specifically designed to handle data from the GS FLX 454 platform, on three bacterial data sets with different characteristics in terms of read coverage and repeat content. Our aim is to investigate their strengths and weaknesses in the reconstruction of the reference genomes. In our benchmarking, we assess assemblers’ performance, quantifying and characterizing assembly gaps and errors, and evaluating their ability to resolve complex genomic regions containing repeats. The final goal of this analysis is to highlight the pros and cons of each method, in order to provide the end user with general criteria for choosing the appropriate assembly strategy, depending on the specific needs. A further aspect we have explored is the relationship between the coverage of a sequencing project and the quality of the obtained results. The final outcome suggests that, for a good tradeoff between costs and results, the planned genome coverage of an experiment should not exceed 20–30×.
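To make the coverage figure concrete, the following is a minimal sketch of how average coverage is commonly estimated when planning a run: total sequenced bases divided by genome length. The function names, the 4.6 Mb genome size and the 400 bp mean read length are illustrative assumptions, not values taken from the paper.

```python
import math


def expected_coverage(num_reads: int, mean_read_length: float, genome_size: int) -> float:
    """Average depth of coverage: total sequenced bases / genome length."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return num_reads * mean_read_length / genome_size


def reads_for_coverage(target_coverage: float, mean_read_length: float, genome_size: int) -> int:
    """Smallest number of reads whose total length reaches the target coverage."""
    if mean_read_length <= 0:
        raise ValueError("mean_read_length must be positive")
    return math.ceil(target_coverage * genome_size / mean_read_length)


# Hypothetical planning example (assumed values, not from the paper):
# a 4.6 Mb bacterial genome sequenced with reads averaging 400 bp.
GENOME_SIZE = 4_600_000
MEAN_READ_LENGTH = 400

for target in (20, 30):
    n = reads_for_coverage(target, MEAN_READ_LENGTH, GENOME_SIZE)
    print(f"{target}x coverage needs about {n} reads")
```

This is only a planning estimate: it gives the mean depth and says nothing about how evenly reads are distributed or how repeats affect the assembly.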
Assembly algorithm assessment
Bacterial genome
454 pyrosequencing
Coverage
Scientific sector BIO/11 - Molecular Biology
2012
Finotello, F.; Lavezzo, E.; Fontana, P.; Peruzzo, D.; Albiero, A.; Barzon, L.; Falda, M.; Di Camillo, B.; Toppo, S. (2012). Comparative analysis of algorithms for whole-genome assembly of pyrosequencing data. BRIEFINGS IN BIOINFORMATICS, 13 (3): 269-280. doi: 10.1093/bib/bbr063 handle: http://hdl.handle.net/10449/20741
Use this identifier to cite or link to this document: https://hdl.handle.net/10449/20741