CINECA IRIS Institutional Research Information System

The development and the validation of innovative approaches for biomarker selection are of paramount importance in many -omics technologies. Unfortunately, the actual testing of new methods on real data is difficult, because in real data sets, one can never be sure about the “true” biomarkers. In this paper, we present a publicly available metabolomic ultra performance liquid chromatography–mass spectrometry spike-in data set for apples. The data set consists of 10 control samples and three spiked sets of the same size, where naturally occurring compounds are added in different concentrations. In this sense, the data set can serve as a test bed to assess the performance of new algorithms and compare them with previously published results. We illustrate some of the possibilities provided by this spike-in data set by comparing the performance of two popular biomarker-selection methods, the univariate t-test and the multivariate variable importance in projection. To promote a widespread use of the data, raw data files as well as preprocessed peak lists are made available.
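As a rough illustration of why known spiked compounds make a selection method testable, the sketch below ranks features by a t statistic and counts how many spiked features are recovered. It runs on synthetic Gaussian data, not on the apple data set, and it is not the authors' analysis. The function names, the feature counts, the shift size, and the choice of the Welch variant of the t-test are all assumptions made for the example; the variable importance in projection method is not shown.

```python
import random
import statistics


def welch_t(a, b):
    """Welch's t statistic for two independent samples (assumed variant)."""
    var_a = statistics.variance(a)
    var_b = statistics.variance(b)
    se = (var_a / len(a) + var_b / len(b)) ** 0.5
    return (statistics.mean(b) - statistics.mean(a)) / se


def simulate(n_features=200, n_spiked=10, n_samples=10, shift=2.0, seed=0):
    """Build synthetic control and spiked groups (hypothetical sizes).

    Each feature is a list of n_samples intensities. Only the first
    n_spiked features are shifted in the spiked group, so the set of
    "true" biomarkers is known by construction.
    """
    rng = random.Random(seed)
    control = [
        [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
        for _ in range(n_features)
    ]
    spiked = [
        [rng.gauss(shift if j < n_spiked else 0.0, 1.0) for _ in range(n_samples)]
        for j in range(n_features)
    ]
    truth = set(range(n_spiked))
    return control, spiked, truth


def select_top_k(control, spiked, k):
    """Rank features by |t| and return the indices of the k largest."""
    scores = [abs(welch_t(c, s)) for c, s in zip(control, spiked)]
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


def recall(selected, truth):
    """Fraction of the known spiked features found among the selected ones."""
    return len(set(selected) & truth) / len(truth)


if __name__ == "__main__":
    control, spiked, truth = simulate()
    selected = select_top_k(control, spiked, k=len(truth))
    print(f"recovered {recall(selected, truth):.0%} of the spiked features")
```

Because the spiked features are known in advance, the same `recall` score could be computed for any other selection method and the results compared directly.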

Franceschi, P.; Masuero, D.; Vrhovsek, U.; Mattivi, F.; Wehrens, H.R.M.J. (2012). A benchmark spike-in data set for biomarker identification in metabolomics. JOURNAL OF CHEMOMETRICS, 26 (1): 16-24. doi: 10.1002/cem.1420 handle: http://hdl.handle.net/10449/20730

A benchmark spike-in data set for biomarker identification in metabolomics

Franceschi, Pietro; Masuero, Domenico; Vrhovsek, Urska; Mattivi, Fulvio; Wehrens, Herman Ronald Maria Johan

2012-01-01

Keywords

Metabolomics; Biomarker Selection; Biostatistics; Spike-in data; Benchmark

MIUR subjects (valid until 24/06/2024)

Sector CHIM/01 - Analytical Chemistry

Date of issue

2012

Appears in types:

1.01 Journal article

Files in this item:

File	License	Size	Format
2012 JoC Franceschi et al.pdf (not available)	All rights reserved	1.68 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10449/20730
