CINECA IRIS Institutional Research Information System

Practical approaches for mining frequent patterns in molecular datasets

Naulaerts, S.; Moens, S.; Meysman, P.; Engelen, Kristof Arthur; Vanden Berghe, W.; Goethals, B.; Laukens, K.; Meysman, P.

2016-01-01

Abstract

Pattern detection is an inherent task in the analysis and interpretation of complex and continuously accumulating biological data. Numerous itemset mining algorithms have been developed in the last decade to efficiently detect specific pattern classes in data. Although many of these have proven their value for addressing bioinformatics problems, several factors still keep promising algorithms from gaining popularity in the life science community. Many of these issues stem from the low user-friendliness of these tools and the complexity of their output, which is often large, static, and consequently hard to interpret. Here, we apply three software implementations to common bioinformatics problems and illustrate some of the advantages and disadvantages of each, as well as inherent pitfalls of biological data mining. Frequent itemset mining exists in many different flavors, and users should choose their software based on their research question, their programming proficiency, and the added value of extra features.

Keywords

Mycobacterium tuberculosis
Frequent itemset mining
Gene expression
Protein domain structure
Protein–protein interaction

MIUR subjects (valid until 24/06/2024)

Sector MAT/06 - Probability and Mathematical Statistics

Date of issue

2016

Citation

Naulaerts, S.; Moens, S.; Meysman, P.; Engelen, K.A.; Vanden Berghe, W.; Goethals, B.; Laukens, K.; Meysman, P. (2016). Practical approaches for mining frequent patterns in molecular datasets. BIOINFORMATICS AND BIOLOGY INSIGHTS, 10: 37-47. doi: 10.4137/BBI.S38419 handle: http://hdl.handle.net/10449/25553

Appears in types:

1.01 Journal article

Files in this item:

File	Access	License	Size	Format
bbi-10-2016-037.pdf	Open access	Creative Commons	5.57 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10449/25553
