Incorporation of biological knowledge into distance for clustering genes
- Authors
- Grzegorz M. Boratyn; Susmita Datta; Somnath Datta
- Source
- Bioinformation
- Subject
- knowledge
0303 health sciences
Computer science
Process (engineering)
Experimental data
020206 networking & telecommunications
02 engineering and technology
General Medicine
computer.software_genre
03 medical and health sciences
ComputingMethodologies_PATTERNRECOGNITION
Prediction Model
expression
0202 electrical engineering, electronic engineering, information engineering
Data mining
distance
genes
Cluster analysis
computer
Gene
clustering
030304 developmental biology
- ISSN
- 0973-2063
0973-8894
In this paper we propose a data-based algorithm that combines existing biological knowledge (e.g., functional annotations of genes) with experimental data (gene expression profiles) to create an overall dissimilarity measure that can be used with any clustering algorithm that accepts a general dissimilarity matrix. We explore this idea on two publicly available gene expression data sets with functional annotations, comparing the results against clusterings that use only the experimental data. Although more elaborate evaluations might be called for, the present paper makes a strong case for utilizing existing biological information in the clustering process.
Availability: A supplement is available at www.somnathdatta.org/Supp/Bioinformation/appendix.pdf.
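The idea of an overall dissimilarity built from both sources can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not say how the two sources are combined, so the fixed mixing parameter `weight`, the Jaccard distance on annotation sets, the correlation-based expression distance, the toy gene names, and the small average-linkage routine are all assumptions chosen for illustration. The sketch only shows that a combined matrix can be handed to a clustering procedure that needs nothing but pairwise dissimilarities.

```python
from itertools import combinations
from math import sqrt


def correlation_distance(x, y):
    """Return 1 - Pearson correlation of two non-constant expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return 1.0 - cov / (sx * sy)


def jaccard_distance(s, t):
    """Return 1 - |s & t| / |s | t| for two sets of annotation terms."""
    union = s | t
    if not union:
        return 1.0  # assumption: two unannotated genes count as maximally dissimilar
    return 1.0 - len(s & t) / len(union)


def combined_dissimilarity(profiles, annotations, weight):
    """Mix knowledge-based and expression-based distances (hypothetical fixed weight)."""
    genes = sorted(profiles)
    d = {}
    for g, h in combinations(genes, 2):
        # Halve the correlation distance so both components lie in [0, 1].
        d_expr = correlation_distance(profiles[g], profiles[h]) / 2.0
        d_know = jaccard_distance(annotations[g], annotations[h])
        d[frozenset((g, h))] = weight * d_know + (1.0 - weight) * d_expr
    return genes, d


def average_linkage(genes, d, k):
    """Agglomerative clustering into k groups using only pairwise dissimilarities."""
    clusters = [[g] for g in genes]

    def link(a, b):
        return sum(d[frozenset((g, h))] for g in a for h in b) / (len(a) * len(b))

    while len(clusters) > k:
        # Merge the pair of clusters with the smallest average dissimilarity.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda p: link(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Toy data (made up for illustration; not from the paper's data sets).
profiles = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.5],
    "g3": [4.0, 3.0, 2.0, 1.0],
    "g4": [8.0, 6.5, 4.0, 2.0],
}
annotations = {
    "g1": {"term_a", "term_b"},
    "g2": {"term_a", "term_b"},
    "g3": {"term_c"},
    "g4": {"term_c", "term_d"},
}

genes, d = combined_dissimilarity(profiles, annotations, weight=0.5)
print(average_linkage(genes, d, k=2))
```

Setting `weight=0.0` in this sketch reduces to clustering on expression data alone, which corresponds to the baseline the abstract compares against; how the balance between the two sources is chosen from the data is exactly the part this sketch leaves out.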