Semi-Parametric Model-Based Clustering for DNA Microarray Data

Title: Semi-Parametric Model-Based Clustering for DNA Microarray Data
Publication Type: Conference Papers
Year of Publication: 2006
Authors: Han B, Davis LS
Conference Name: Pattern Recognition, 2006. ICPR 2006. 18th International Conference on
Date Published: 2006
Keywords: DNA microarray data; Gaussian kernel; Gaussian mixtures; curvature fitting; data representation; expectation maximization; gene expression data; maximum likelihood; mean-shift procedure; semiparametric model-based clustering; DNA; Gaussian processes; biology computing; genetics; pattern clustering
Abstract

Various clustering methods have been proposed for the analysis of gene expression data, but conventional clustering algorithms have several critical limitations, chiefly the need to set parameters such as the number of clusters and the initial cluster centers. In this paper, we propose a semi-parametric model-based clustering algorithm in which the underlying model is a mixture of Gaussians. Each gene expression data point builds a Gaussian kernel, and the uncertainty of microarray data is naturally integrated into the data representation. Our algorithm provides a principled method to automatically determine the parameters (the number of components in the mixture, and the mean, covariance, and weight of each Gaussian) by the mean-shift procedure (Comaniciu and Meer, 1999) and curvature fitting. After the initialization, the expectation maximization (EM) algorithm is employed for clustering to achieve maximum likelihood (ML). The performance of our algorithm is compared with the standard EM algorithm using real data as well as synthetic data.
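
The pipeline outlined in the abstract (mode finding by mean shift to initialize a Gaussian mixture, followed by EM refinement) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single fixed isotropic bandwidth in place of per-gene uncertainty kernels, and it estimates each initial covariance from the points that converge to a mode rather than by curvature fitting. The function names (mean_shift_modes, init_from_modes, em) and all parameters are hypothetical.

```python
import numpy as np

def mean_shift_modes(X, bandwidth, n_iter=200, tol=1e-6):
    """Move every point uphill on a Gaussian kernel density estimate,
    then merge converged points that lie within one bandwidth (a heuristic)."""
    X = np.asarray(X, dtype=float)
    Y = X.copy()
    for _ in range(n_iter):
        # Squared distances between current positions Y and the data X.
        d2 = ((Y[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
        W = np.exp(-0.5 * d2 / bandwidth ** 2)
        Y_new = (W @ X) / W.sum(axis=1, keepdims=True)
        shift = np.abs(Y_new - Y).max()
        Y = Y_new
        if shift < tol:
            break
    modes, labels = [], np.empty(len(X), dtype=int)
    for i, y in enumerate(Y):
        for k, m in enumerate(modes):
            if np.linalg.norm(y - m) < bandwidth:
                labels[i] = k
                break
        else:  # no existing mode is close enough: start a new one
            modes.append(y)
            labels[i] = len(modes) - 1
    return np.array(modes), labels

def init_from_modes(X, modes, labels, reg=1e-6):
    """Initial weights, means and covariances from the mean-shift result.
    Assumption: covariance is the scatter around the mode, not a curvature fit."""
    K, d = modes.shape
    weights = np.bincount(labels, minlength=K) / len(X)
    covs = np.empty((K, d, d))
    for k in range(K):
        diff = X[labels == k] - modes[k]
        covs[k] = diff.T @ diff / len(diff) + reg * np.eye(d)
    return weights, modes.copy(), covs

def log_gauss(X, mean, cov):
    """Log density of a multivariate Gaussian at each row of X."""
    d = X.shape[1]
    L = np.linalg.cholesky(cov)
    z = np.linalg.solve(L, (X - mean).T)
    log_det = 2.0 * np.log(np.diag(L)).sum()
    return -0.5 * (d * np.log(2 * np.pi) + log_det + (z ** 2).sum(axis=0))

def em(X, weights, means, covs, n_iter=100, reg=1e-6):
    """Standard EM for a Gaussian mixture, started from the given parameters."""
    n, d = X.shape
    K = len(weights)
    for _ in range(n_iter):
        # E-step: responsibilities, computed in log space for stability.
        log_r = np.stack([np.log(weights[k]) + log_gauss(X, means[k], covs[k])
                          for k in range(K)], axis=1)
        log_r -= log_r.max(axis=1, keepdims=True)
        R = np.exp(log_r)
        R /= R.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means and covariances.
        Nk = R.sum(axis=0)
        weights = Nk / n
        means = (R.T @ X) / Nk[:, None]
        new_covs = []
        for k in range(K):
            diff = X - means[k]
            new_covs.append((R[:, k, None] * diff).T @ diff / Nk[k] + reg * np.eye(d))
        covs = np.stack(new_covs)
    return weights, means, covs, R.argmax(axis=1)

# Usage on synthetic data: two well-separated 2-D clusters.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, (50, 2)), rng.normal(10.0, 0.5, (50, 2))])
modes, labels = mean_shift_modes(X, bandwidth=2.0)
weights, means, covs, assignment = em(X, *init_from_modes(X, modes, labels))
```

In this sketch the number of mixture components is never passed in; it falls out of the number of distinct modes found by mean shift, which is the point the abstract makes about avoiding a hand-set cluster count. The bandwidth still has to be chosen, and the result is sensitive to it.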

DOI: 10.1109/ICPR.2006.1044