Motivation: Several open-source tools have recently been developed to identify single nucleotide polymorphisms (SNPs) in whole-genome data, the most popular being Samtools and GATK. SNP predictors commonly output a VCF file, which lists candidate SNPs together with information such as SNP call quality, depth of coverage and many other parameters. However, the accuracy of this SNP list is unsatisfactory because of a high rate of false-positive polymorphism predictions. Results: The VCF parameters are used to train a Support Vector Machine (SVM) that classifies the VCF SNP list into two groups, true SNPs and false positives, with a prediction accuracy much higher than that of the original predictor itself.
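As a rough illustration of how VCF parameters could become classifier inputs, the following minimal sketch turns VCF data lines into numeric feature vectors. The choice of fields (QUAL plus the INFO keys DP and MQ) and all function names are assumptions made for this example; the abstract does not state which parameters the authors used.

```python
# Hypothetical sketch: the feature set (QUAL, DP, MQ) and the function names
# are illustrative assumptions, not the authors' actual pipeline.

def parse_info(info):
    """Parse a VCF INFO column such as 'DP=14;MQ=60;INDEL' into a dict."""
    fields = {}
    for item in info.split(";"):
        if not item or item == ".":
            continue  # '.' marks an empty INFO column
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True  # flags carry no value
    return fields


def to_float(value, default=0.0):
    """Convert a VCF value to float; missing or non-numeric values give default."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return default


def vcf_features(line, info_keys=("DP", "MQ")):
    """Return (chrom, pos, [QUAL, INFO values...]) for one VCF data line."""
    cols = line.rstrip("\n").split("\t")
    chrom, pos, qual = cols[0], int(cols[1]), cols[5]
    info = parse_info(cols[7])
    features = [to_float(qual)]
    for key in info_keys:
        value = info.get(key)
        if isinstance(value, str):
            value = value.split(",")[0]  # keep the first of multi-valued entries
        features.append(to_float(value))
    return chrom, pos, features


def read_vcf(lines, info_keys=("DP", "MQ")):
    """Extract features from every data line, skipping '#' header lines."""
    return [
        vcf_features(line, info_keys)
        for line in lines
        if line.strip() and not line.startswith("#")
    ]
```

Feature vectors of this kind, paired with true/false labels from a validated SNP set, could then be passed to any SVM implementation, for example scikit-learn's `sklearn.svm.SVC`.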
Leonardelli, L.; Cestaro, A.; Livi, C.M.; This, P.; Romieu, C.; Blanzieri, E.; Moser, C. (2013). VerySNP: VCF features to train SVM in crop SNP detection. In: 21st Annual International Conference on Intelligent Systems for Molecular Biology, 12th European Conference on Computational Biology, Berlin, July 19-23. url: http://f1000.com/posters/browse/summary/1093922 handle: http://hdl.handle.net/10449/22327
VerySNP: VCF features to train SVM in crop SNP detection
Leonardelli, Lorena; Cestaro, Alessandro; Moser, Claudio
2013-01-01
This article is published under a Creative Commons License