Motivation: Several open-source tools have recently been developed to identify Single Nucleotide Polymorphisms (SNPs) in whole-genome data, the most popular being SAMtools and GATK. SNP predictors commonly output a VCF file, which lists candidate SNPs together with information such as SNP call quality, depth of coverage and many other parameters. Still, the accuracy of this SNP list is unsatisfactory because of a high rate of false-positive polymorphism predictions. Results: The VCF parameters are used to train a Support Vector Machine (SVM) that classifies the VCF SNP list into two groups, true SNPs and false positives, with a prediction accuracy much higher than that of the initial predictor itself.
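The idea of training an SVM on VCF parameters can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the kernel, the feature set or the training data, so the sketch assumes a linear SVM trained by subgradient descent on the hinge loss, uses only QUAL plus the standard INFO keys DP and MQ as features, and relies on a hypothetical toy set of labelled VCF lines. All function names are illustrative.

```python
def vcf_features(line, info_keys=("DP", "MQ")):
    """Turn one VCF data line into [QUAL, INFO[DP], INFO[MQ]] (feature choice is an assumption)."""
    cols = line.rstrip("\n").split("\t")  # CHROM POS ID REF ALT QUAL FILTER INFO ...
    qual = 0.0 if cols[5] == "." else float(cols[5])
    info = dict(f.split("=", 1) for f in cols[7].split(";") if "=" in f)
    return [qual] + [float(info.get(k, 0.0)) for k in info_keys]

def fit_scaler(rows):
    """Per-column mean and population standard deviation of the training rows."""
    n = len(rows)
    means = [sum(c) / n for c in zip(*rows)]
    stds = [(sum((v - m) ** 2 for v in c) / n) ** 0.5 or 1.0
            for c, m in zip(zip(*rows), means)]
    return means, stds

def scale(row, scaler):
    """Standardize one feature row with a fitted scaler."""
    means, stds = scaler
    return [(v - m) / s for v, m, s in zip(row, means, stds)]

def train_linear_svm(X, y, lr=0.1, lam=0.01, epochs=50):
    """Linear SVM: subgradient descent on L2-regularized hinge loss; labels are +1 / -1."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            w = [wj * (1.0 - lr * lam) for wj in w]  # L2 shrinkage
            if margin < 1.0:  # sample violates the margin: take a hinge-loss step
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b

def predict(model, x):
    """Return +1 (true SNP) or -1 (false positive)."""
    w, b = model
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b > 0 else -1

# Hypothetical labelled calls (+1 = validated SNP, -1 = false positive); not real data.
TOY = [
    ("chr1\t101\t.\tA\tG\t200\tPASS\tDP=30;MQ=60", 1),
    ("chr1\t202\t.\tC\tT\t180\tPASS\tDP=25;MQ=58", 1),
    ("chr1\t303\t.\tG\tA\t220\tPASS\tDP=35;MQ=60", 1),
    ("chr1\t404\t.\tT\tC\t20\t.\tDP=5;MQ=30", -1),
    ("chr1\t505\t.\tA\tC\t35\t.\tDP=8;MQ=25", -1),
    ("chr1\t606\t.\tG\tT\t15\t.\tDP=4;MQ=35", -1),
]

rows = [vcf_features(line) for line, _ in TOY]
labels = [label for _, label in TOY]
scaler = fit_scaler(rows)
model = train_linear_svm([scale(r, scaler) for r in rows], labels)

candidate = "chr2\t900\t.\tT\tC\t18\t.\tDP=5;MQ=28"  # hypothetical new call
print(predict(model, scale(vcf_features(candidate), scaler)))
```

In practice the labels would come from experimentally validated SNPs, and an established SVM library with a tuned kernel would likely replace the hand-written trainer. The sketch only shows how VCF fields become a feature vector for a binary classifier.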

Leonardelli, L.; Cestaro, A.; Livi, C.M.; This, P.; Romieu, C.; Blanzieri, E.; Moser, C. (2013). VerySNP: VCF features to train SVM in crop SNP detection. In: 21st Annual International Conference on Intelligent Systems for Molecular Biology, 12th European Conference on Computational Biology, Berlin, July 19th - 23rd. url: http://f1000.com/posters/browse/summary/1093922 handle: http://hdl.handle.net/10449/22327

VerySNP: VCF features to train SVM in crop SNP detection

Leonardelli, Lorena; Cestaro, Alessandro; Moser, Claudio
2013-01-01

SNP detection
SVM
VCF
SAMtools
GATK
Crops

This article is published under a Creative Commons license.

Use this identifier to cite or link to this document: https://hdl.handle.net/10449/22327