Albanese, D.; Fontana, P.; De Filippo, C.; Cavalieri, D.; Donati, C. (2015). MICCA: a complete and accurate software for taxonomic profiling of metagenomic data. SCIENTIFIC REPORTS, 5 (9743): 1-7. doi: 10.1038/srep09743 handle: http://hdl.handle.net/10449/25230

MICCA: a complete and accurate software for taxonomic profiling of metagenomic data

Albanese, Davide; Fontana, Paolo; De Filippo, Carlotta; Cavalieri, Duccio; Donati, Claudio
2015-01-01

Abstract

The introduction of high-throughput sequencing technologies has triggered an increase in the number of studies in which the microbiota of environmental and human samples is characterized through the sequencing of selected marker genes. While experimental protocols have undergone a process of standardization that makes them accessible to a large community of scientists, standard and robust data analysis pipelines are still lacking. Here we introduce MICCA, a software pipeline for the processing of amplicon metagenomic datasets that efficiently combines quality filtering, clustering of Operational Taxonomic Units (OTUs), taxonomy assignment and phylogenetic tree inference. MICCA provides accurate results while reaching a good compromise between modularity and usability. Moreover, we introduce a de novo clustering algorithm specifically designed for the inference of OTUs. Tests on real and synthetic datasets show that, thanks to the optimized read filtering process and to the new clustering algorithm, MICCA provides estimates of the number of OTUs and of other common ecological indices that are more accurate and robust than those of currently available pipelines. Analysis of public metagenomic datasets shows that the higher consistency of results improves our understanding of the structure of environmental and human-associated microbial communities. MICCA is an open source project.
Data mining
Software
Sector ING-INF/06 - Electronic and Information Bioengineering
2015
Files in this record:
File: 2015 SR Albanese et al.pdf
Access: open access
License: Creative Commons
Size: 731.02 kB
Format: Adobe PDF

This article is published under a Creative Commons license.

Use this identifier to cite or link to this document: https://hdl.handle.net/10449/25230
Citations
  • PMC: not available
  • Scopus: 210
  • ISI: 193