Volume 12, Issue 6
  • ISSN: 1574-8936
  • E-ISSN: 2212-392X

Abstract

Background: Cluster analysis of gene expression microarray data is of increasing interest in bioinformatics. One reason for this is the need for molecular-based refinement of broadly defined biological classes, with implications for cancer diagnosis, prognosis and treatment. Many algorithms have been developed for this problem. Objective: However, microarray data frequently include outliers, and how to treat the effects of these outliers in the subsequent cluster analysis remains unclear. Method: In this paper, we present a large-scale analysis of seven different agglomerative hierarchical clustering methods and five proximity measures applied to 33 cancer gene expression datasets. As a case study, we used two types of experimental datasets, Affymetrix and cDNA, to which different percentages of outliers were artificially added. Results: We found that the Ward method gives the highest corrected Rand index value with the Spearman proximity measure, both when the datasets contain outliers and when they do not. Conclusion: This study shows that the Ward method is more robust than the other clustering methods examined for gene expression data analysis.
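The corrected Rand index mentioned in the Results compares a clustering against known class labels while correcting for chance agreement. The following is a minimal sketch, assuming the commonly used Hubert and Arabie form of the index; the function name `adjusted_rand_index` and the toy labels are illustrative and do not come from the paper.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Chance-corrected Rand index (assumed Hubert and Arabie form)."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")

    n = len(labels_true)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        # Fewer than two samples: the partitions trivially agree.
        return 1.0

    # Contingency counts: samples sharing each (true, predicted) label pair.
    cell_counts = Counter(zip(labels_true, labels_pred))
    true_counts = Counter(labels_true)
    pred_counts = Counter(labels_pred)

    # Number of sample pairs placed together, per cell and per marginal.
    sum_cells = sum(comb(c, 2) for c in cell_counts.values())
    sum_true = sum(comb(c, 2) for c in true_counts.values())
    sum_pred = sum(comb(c, 2) for c in pred_counts.values())

    expected = sum_true * sum_pred / total_pairs  # agreement expected by chance
    max_index = (sum_true + sum_pred) / 2         # upper bound on agreement
    if max_index == expected:
        # Degenerate case, e.g. both partitions form a single cluster.
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


# Hypothetical toy labels: two known classes versus two clusterings.
known = [0, 0, 1, 1]
same_up_to_renaming = [1, 1, 0, 0]
unrelated = [0, 1, 0, 1]
ari_same = adjusted_rand_index(known, same_up_to_renaming)
ari_unrelated = adjusted_rand_index(known, unrelated)
```

A value of 1 indicates that the clustering matches the known classes up to a renaming of labels, while values near 0 or below indicate agreement no better than chance.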

/content/journals/cbio/10.2174/1574893611666160610103926
2017-12-01
2025-09-03