A two-way semilinear model for normalization and analysis of cDNA microarray data

Jian Huang, Deli Wang, Cun Hui Zhang

Research output: Journal article publication › Review article › Academic research › peer-review

31 Citations (Scopus)

Abstract

A basic question in analyzing cDNA microarray data is normalization, the purpose of which is to remove systematic bias in the observed expression values by establishing a normalization curve across the whole dynamic range. A proper normalization procedure ensures that the normalized intensity ratios provide meaningful measures of relative expression levels. We propose a two-way semilinear model (TW-SLM) for normalization and analysis of microarray data. This method does not make the usual assumptions underlying some of the existing methods. For example, it does not assume that the percentage of differentially expressed genes is small or that there is symmetry in the expression levels of up-regulated and down-regulated genes, as required in the lowess normalization method. The TW-SLM also naturally incorporates uncertainty due to normalization into significance analysis of microarrays. We use a semiparametric approach based on polynomial splines in the TW-SLM to estimate the normalization curves and the normalized expression values. We study the theoretical properties of the proposed estimator in the TW-SLM, including the finite-sample distributional properties of the estimated gene effects and the rate of convergence of the estimated normalization curves when the number of genes under study is large. We also conduct simulation studies to evaluate the TW-SLM method and illustrate the proposed method using a published microarray dataset.
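The abstract does not state the model equation, so the following is only a minimal sketch under assumptions. It assumes the TW-SLM takes the form y[i, j] = phi_i(x[i, j]) + beta[j] + noise, where i indexes arrays, j indexes genes, y is the log intensity ratio, x is the average log intensity, phi_i is the array-specific normalization curve, and beta[j] is the gene effect. The curves are approximated with a truncated-power cubic spline basis and the fit uses simple backfitting with NumPy. Backfitting is chosen here for brevity and is not necessarily the authors' algorithm. The names `spline_basis` and `fit_tw_slm`, the knot placement, and the centering constraint on beta are all hypothetical choices made for illustration.

```python
import numpy as np


def spline_basis(u, knots, degree=3):
    """Truncated-power polynomial spline basis, including an intercept column."""
    cols = [np.ones_like(u)]
    cols += [u ** d for d in range(1, degree + 1)]
    cols += [np.clip(u - k, 0.0, None) ** degree for k in knots]
    return np.column_stack(cols)


def fit_tw_slm(x, y, n_knots=4, n_iter=50):
    """Backfitting sketch for the assumed model y[i, j] = phi_i(x[i, j]) + beta[j] + noise.

    x, y: arrays of shape (n_arrays, n_genes).
    Returns (beta, fitted), where fitted[i, j] estimates phi_i(x[i, j]).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n_arrays, n_genes = y.shape

    # Build one spline design matrix per array (assumption: knots at quantiles).
    probs = np.linspace(0.0, 1.0, n_knots + 2)[1:-1]
    bases = []
    for i in range(n_arrays):
        # Rescale intensities to [0, 1] to keep the basis well conditioned.
        u = (x[i] - x[i].min()) / (x[i].max() - x[i].min())
        bases.append(spline_basis(u, np.quantile(u, probs)))

    beta = np.zeros(n_genes)
    fitted = np.zeros_like(y)
    for _ in range(n_iter):
        # Step 1: given beta, fit each array's curve to y[i] - beta by least squares.
        for i in range(n_arrays):
            coef, *_ = np.linalg.lstsq(bases[i], y[i] - beta, rcond=None)
            fitted[i] = bases[i] @ coef
        # Step 2: given the curves, update each gene effect as the mean
        # normalized value across arrays.
        beta = (y - fitted).mean(axis=0)
        # Center beta so it is not confounded with the curve intercepts
        # (an identifiability choice assumed for this sketch).
        beta -= beta.mean()
    return beta, fitted
```

Because the gene effects and the curves are estimated jointly, the sketch does not require most genes to be unchanged or up- and down-regulation to be balanced, which is the property the abstract contrasts with lowess normalization. Calling `fit_tw_slm(x, y)` on two arrays of shape (arrays, genes) returns the centered gene effects and the fitted curve values.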

Original language: English
Pages (from-to): 814-829
Number of pages: 16
Journal: Journal of the American Statistical Association
Volume: 100
Issue number: 471
DOIs
Publication status: Published - 1 Sep 2005
Externally published: Yes

Keywords

  • Analysis of variance
  • Differentially expressed gene
  • High-dimensional data
  • Microarray
  • Noise level
  • Semiparametric regression
  • Spline
  • Variance estimation

ASJC Scopus subject areas

  • Statistics and Probability
  • Statistics, Probability and Uncertainty
