Identification of Robust Clustering Methods in Gene Expression Data Analysis

Md. B. Hossen; Md. Siraj-Ud-Doulah

doi:10.2174/1574893611666160610103926

ISSN: 1574-8936
E-ISSN: 2212-392X

Source: Current Bioinformatics, Volume 12, Issue 6, Dec 2017, p. 558 - 562
DOI: https://doi.org/10.2174/1574893611666160610103926
Available online: 01 Dec 2017

Abstract

Background: Cluster analysis of gene expression microarray data is of increasing interest in bioinformatics. One reason is the need for molecular-based refinement of broadly defined biological classes, with implications for cancer diagnosis, prognosis and treatment, and many algorithms have been developed for this problem. Objective: However, microarray data frequently include outliers, and the question is how to treat the effects of these outliers in the subsequent clustering analysis. Method: In this paper, we present a large-scale analysis of seven agglomerative hierarchical clustering methods and five proximity measures applied to 33 cancer gene expression datasets. As a case study, we used two experimental datasets, Affymetrix and cDNA, to which different percentages of outliers were artificially added. Results: We found that the Ward method gives the highest corrected Rand index value with the Spearman proximity measure, both when the datasets contain outliers and when they do not. Conclusion: This study shows that the Ward method is more robust than the other clustering methods examined for gene expression data analysis.
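The abstract scores each clustering by its corrected Rand index but does not state the formula. The sketch below assumes the usual Hubert-Arabie definition (also known as the adjusted Rand index), which compares a clustering against known class labels and corrects for chance agreement. The function name `corrected_rand_index` and the toy labels are illustrative assumptions, not taken from the paper.

```python
from collections import Counter
from math import comb


def corrected_rand_index(labels_true, labels_pred):
    """Corrected (adjusted) Rand index between two partitions.

    Assumes the Hubert-Arabie definition; the paper's exact variant
    is not given in the abstract.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")

    total_pairs = comb(len(labels_true), 2)
    if total_pairs == 0:
        # Fewer than two samples: the partitions trivially agree.
        return 1.0

    # Contingency counts: samples per (class, cluster) pair and per margin.
    cell_counts = Counter(zip(labels_true, labels_pred))
    row_counts = Counter(labels_true)
    col_counts = Counter(labels_pred)

    # Number of sample pairs placed together in each cell / row / column.
    sum_cells = sum(comb(c, 2) for c in cell_counts.values())
    sum_rows = sum(comb(c, 2) for c in row_counts.values())
    sum_cols = sum(comb(c, 2) for c in col_counts.values())

    expected = sum_rows * sum_cols / total_pairs  # agreement expected by chance
    max_index = (sum_rows + sum_cols) / 2         # upper bound on agreement
    if max_index == expected:
        # Degenerate case, e.g. both partitions are one single cluster.
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


if __name__ == "__main__":
    known_classes = [0, 0, 1, 1]
    # Same grouping under different label names scores 1.0.
    print(corrected_rand_index(known_classes, [1, 1, 0, 0]))
    # Splitting one class into two clusters lowers the score to 4/7.
    print(corrected_rand_index(known_classes, [0, 0, 1, 2]))
```

A value of 1 means the clustering reproduces the known classes exactly, values near 0 mean agreement no better than chance, and negative values mean agreement worse than chance.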


Article Type: Research Article

Keyword(s): Agglomerative hierarchical clustering; corrected Rand index; microarray gene expression data; outlier; proximity measures
