Abstract
Gene expression profiling has been extensively conducted in cancer research. Analyzing multiple independent cancer gene expression datasets may provide additional information and complement single-dataset analysis. In this study, we conduct multi-dataset analysis to evaluate the similarity of cancer-associated genes identified from different datasets. The first objective is to briefly review statistical methods that can be used for such evaluation, covering both marginal and joint analysis methods. The second objective is to apply those methods to 26 Gene Expression Omnibus (GEO) datasets on five types of cancer. Our analysis suggests that, for the same cancer, marker identification results may vary significantly across datasets, and different datasets share few common genes. Datasets on different cancers also share few common genes. The shared genetic basis of datasets on the same or different cancers, which has been suggested in the literature, is not observed in the analysis of GEO data.
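As a rough illustration of what evaluating the similarity of identified genes can mean in practice, the overlap between the marker lists from two datasets can be summarized with a Jaccard index (shared genes divided by all genes appearing in either list). This is only a minimal sketch and not necessarily one of the measures reviewed in the study; the function name `jaccard`, the gene symbols, and the two lists are hypothetical placeholders.

```python
def jaccard(markers_a, markers_b):
    """Jaccard index between two collections of gene identifiers.

    Illustrative helper only (an assumption, not taken from the study).
    Returns a value in [0, 1]; 0.0 when both collections are empty.
    """
    a, b = set(markers_a), set(markers_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical marker lists, as might be identified from two datasets.
dataset_1 = ["TP53", "BRCA1", "MYC", "EGFR"]
dataset_2 = ["TP53", "KRAS", "PTEN", "EGFR"]

# Two shared genes out of six distinct genes in total.
score = jaccard(dataset_1, dataset_2)
print(f"Jaccard overlap: {score:.3f}")
```

A value near 0 corresponds to the situation reported in the abstract, where datasets share few common genes; a value near 1 would indicate nearly identical marker lists.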
Original language | English |
---|---|
Pages (from-to) | 671-684 |
Number of pages | 14 |
Journal | Briefings in Bioinformatics |
Volume | 15 |
Issue number | 5 |
Publication status | Published - 1 Jan 2013 |
Externally published | Yes |
Keywords
- Cancer gene expression study
- GEO
- Marker identification
- Similarity
ASJC Scopus subject areas
- Information Systems
- Molecular Biology