Wehrens, H.R.M.J.; Franceschi, P. (2012). Thresholding for biomarker selection in multivariate data using Higher Criticism. MOLECULAR BIOSYSTEMS, 8 (9): 2339-2346. doi: 10.1039/C2MB25121C handle: http://hdl.handle.net/10449/21161

Thresholding for biomarker selection in multivariate data using Higher Criticism

Wehrens, Herman Ronald Maria Johan; Franceschi, Pietro
2012-01-01

Abstract

Biomarker selection is an important topic in the omics sciences, where holistic measurement methods routinely generate results for many variables simultaneously. Very often, only a small fraction of these variables is truly associated with the phenomena of interest. Selecting and identifying these biomarkers is essential for understanding the complex biological processes under study. Finding biomarkers, however, is a difficult task. Even if a relative order can be established, e.g., on the basis of p values, it is usually hard to determine where to stop including candidates in the final set. Higher Criticism (HC) is an approach for finding data-dependent cutoff values when comparing two distinct groups of samples. Here, we extend its use to multivariate data, providing a principled compromise between not selecting too many variables and catching as many true positives as possible. The results show a marked improvement in biomarker selection compared to the standard settings available for some methods. Interestingly, HC thresholds can differ considerably from what has been suggested in the literature before, again showing that it is not possible to use the same cutoff value for all data sets. The data-specific cutoff values provided by HC also open the way to fairer comparisons between biomarker selection methods, not biased by unlucky or suboptimal threshold choices.
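To make the idea of a data-dependent cutoff concrete, the following is a minimal sketch of HC thresholding applied to a plain list of p values. It is not the authors' implementation and does not cover the multivariate extension described in the abstract. The function name `hc_threshold`, the parameter `alpha0` (the fraction of smallest p values that is searched, set here to 0.1) and the particular form of the HC score (one commonly used variant) are assumptions made for illustration.

```python
import math


def hc_threshold(pvalues, alpha0=0.1):
    """Pick a p-value cutoff by maximising a Higher Criticism score.

    Returns (cutoff, selected), where selected holds the indices of the
    variables whose p values rank at or below the maximising position.
    Illustrative sketch only; name and defaults are hypothetical.
    """
    n = len(pvalues)
    if n < 2:
        raise ValueError("need at least two p values")
    if not 0.0 < alpha0 < 1.0:
        raise ValueError("alpha0 must lie strictly between 0 and 1")

    # Variable indices ordered from smallest to largest p value.
    order = sorted(range(n), key=lambda j: pvalues[j])

    # Search only the smallest alpha0 fraction of p values (assumed
    # convention); stay below n so the score denominator is never zero.
    i_max = min(n - 1, max(1, int(alpha0 * n)))

    best_i, best_hc = 1, -math.inf
    for i in range(1, i_max + 1):
        frac = i / n                 # observed fraction of p values <= p_(i)
        p = pvalues[order[i - 1]]    # i-th smallest p value, i.e. the
                                     # fraction expected under a uniform null
        hc = math.sqrt(n) * (frac - p) / math.sqrt(frac * (1.0 - frac))
        if hc > best_hc:
            best_i, best_hc = i, hc

    cutoff = pvalues[order[best_i - 1]]
    selected = sorted(order[:best_i])
    return cutoff, selected
```

Calling `hc_threshold(pvals)` on a list of per-variable p values returns the cutoff together with the indices of the variables at or below it. Because the cutoff is the position where the observed fraction of small p values most exceeds what a uniform null would give, it changes from one data set to the next rather than being fixed in advance.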
Keywords: VIP values