We present Network Expansion by Subsetting and Ranking Aggregation (NESRA), a method that produces a list of candidate genes for the expansion of a network, and its application to gene regulatory networks. Our group has recently developed gene@home [3], a BOINC project [1] that searches for candidate genes for the expansion of a gene regulatory network using gene expression data. The project adopts intensive variable-subsetting strategies, enabled by the computational power provided by the volunteers who join the project by means of the BOINC client, and exploits the PC algorithm to discover putative causal relationships within each subset of variables. The PC algorithm, whose name derives from the initials of its authors [7], and PC* [2] are algorithms that discover causal relationships among variables. In particular, PC is based on the systematic testing for conditional independence of variables given subsets of other variables; it was comprehensively presented and evaluated by Kalisch and colleagues [4], who also proposed it for gene network reconstruction [5]. NESRA is an algorithm that runs as a postprocessor of the gene@home project and consists of three steps: 1) a ranking procedure systematically subsets the variables, runs the PC algorithm and ranks the genes; the subsetting is iterated several times and a ranked list of candidates is produced by counting the number of times a relationship is found; 2) several ranking steps are executed with different values of the dimension of the subsets and with different numbers of iterations, producing several ranked lists; 3) the ranked lists are aggregated by using a state-of-the-art ranking aggregator. Here we show that a single ranking step is enough to outperform PC and PC*, although with some dependency on the parameters. Moreover, we show that the output of the ranking aggregation is better than the average performance of the single ranking steps.
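The three steps described above can be illustrated with a minimal Python sketch. It is not the gene@home or NESRA implementation. The function names, the random partitioning of candidates, and the inclusion of the network genes in every subset are assumptions made for illustration. A simple correlation threshold stands in for the PC algorithm, and a Borda count stands in for the state-of-the-art ranking aggregator, which the abstract does not name.

```python
import math
import random
from collections import Counter
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y) if norm_x and norm_y else 0.0


def related_pairs(genes, expr, threshold):
    """ASSUMPTION: stand-in for the PC algorithm, not PC itself.

    Reports gene pairs whose absolute correlation reaches the threshold.
    """
    return [(g, h) for g, h in combinations(genes, 2)
            if abs(pearson(expr[g], expr[h])) >= threshold]


def ranking_step(expr, network, subset_size, iterations, rng, threshold=0.9):
    """One ranking step: subset the candidates, look for relationships in
    each subset, and rank candidates by how often they are linked to a
    gene of the known network (hypothetical layout of the step)."""
    network = set(network)
    candidates = sorted(g for g in expr if g not in network)
    counts = Counter({g: 0 for g in candidates})
    for _ in range(iterations):
        rng.shuffle(candidates)  # a new random partition at each iteration
        for start in range(0, len(candidates), subset_size):
            subset = candidates[start:start + subset_size] + sorted(network)
            for g, h in related_pairs(subset, expr, threshold):
                # Count only candidate-to-network relationships.
                if (g in network) != (h in network):
                    counts[h if g in network else g] += 1
    # Highest count first; ties broken alphabetically for determinism.
    return sorted(counts, key=lambda g: (-counts[g], g))


def borda_aggregate(rankings):
    """ASSUMPTION: Borda count as a simple example of a ranking aggregator."""
    scores = Counter()
    for ranking in rankings:
        for position, gene in enumerate(ranking):
            scores[gene] += len(ranking) - position
    return sorted(scores, key=lambda g: (-scores[g], g))


if __name__ == "__main__":
    # Toy expression data (made up): one network gene and three candidates.
    expr = {
        "N1": [1, 2, 3, 4, 5],
        "A": [2, 4, 6, 8, 10],
        "B": [5, 1, 4, 2, 3],
        "C": [1, 2, 3, 4, 6],
    }
    rng = random.Random(0)
    # Several ranking steps with different subset sizes and iteration counts.
    rankings = [ranking_step(expr, {"N1"}, size, iters, rng)
                for size, iters in [(1, 5), (2, 3), (3, 2)]]
    print(borda_aggregate(rankings))
```

Counting how often a relationship is found across many subsets is what turns the yes/no output of a relationship finder into a ranking. Aggregating the lists obtained with different subset sizes and iteration counts reduces the dependence on any single parameter choice.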
Evaluations are carried out by means of the gene@home project on Arabidopsis thaliana and include a comparison against ARACNE [6] (Table 1).

Table 1: A. thaliana, expansion of the Flower Organ Specification gene regulatory network. Precision of NESRA and ARACNE (default parameters) for different values k of the length of the gene list.

Method   k=5   k=10  k=20  k=55
NESRA    0.90  0.80  0.60  0.42
ARACNE   0.2   0.3   0.35  0.45
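The precision values in Table 1 depend on the length k of the gene list. A minimal sketch of such a measure follows, assuming the usual definition of precision at k (the fraction of the top k genes that are considered correct). The abstract does not state how correct genes were determined, so the `relevant` set and the gene names below are hypothetical.

```python
def precision_at_k(ranked_genes, relevant, k):
    """Fraction of the first k genes of a ranked list that are relevant.

    ASSUMPTION: `relevant` is a set of genes accepted as true expansions;
    how that set is built is not specified in the abstract.
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    hits = sum(1 for gene in ranked_genes[:k] if gene in relevant)
    return hits / k


if __name__ == "__main__":
    ranked = ["g1", "g2", "g3", "g4", "g5"]  # hypothetical ranked list
    relevant = {"g1", "g3", "g4"}            # hypothetical validated genes
    for k in (2, 5):
        print(k, precision_at_k(ranked, relevant, k))
```

Dividing by k rather than by the number of genes actually returned penalizes lists shorter than k, which keeps values comparable across methods for a given list length.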

Erculiani, L.; Galante, F.; Gallo, C.; Asnicar, F.; Masera, L.; Morettin, P.; Sella, N.; Tolio, T.; Malacarne, G.; Engelen, K.A.; Argentini, A.; Cavecchia, V.; Moser, C.; Blanzieri, E. (2015). Discovering candidates for gene network expansion by variable subsetting and ranking aggregation. In: Network Biology Community - ISMB meeting (NetBio SIG 2015), Dublin, Ireland, July 10-14, 2015. url: http://www.iscb.org/ismbeccb2015 handle: http://hdl.handle.net/10449/25427

Discovering candidates for gene network expansion by variable subsetting and ranking aggregation

Malacarne, Giulia; Engelen, Kristof Arthur; Moser, Claudio
2015-01-01

Variable subsetting
Ranking
Candidate genes
Gene Network Expansion
Bioinformatics