Motivation: Although NHL (non-Hodgkin's lymphoma) is the fifth leading cause of cancer incidence and mortality in the USA, it remains poorly understood and is largely incurable. Biomedical studies have shown that genomic variations, measured with SNPs (single nucleotide polymorphisms) in genes, may have independent predictive power for disease-free survival in NHL patients beyond clinical measurements.
Results: We apply the CTGDR (clustering threshold gradient directed regularization) method to genetic association studies using SNPs, analyze data from an association study of NHL, and identify prognosis signatures for diffuse large B cell lymphoma (DLBCL) and follicular lymphoma (FL), the two most common subtypes of NHL. With the CTGDR method, we are able to account for the joint effects of multiple genes/SNPs, whereas most existing studies are single-marker based. In addition, we are able to account for the 'gene and SNP-within-gene' hierarchical structure and identify not only predictive genes but also predictive SNPs within the identified genes. In contrast, existing studies are limited to either gene or SNP identification, but not both. We propose using resampling methods to evaluate the predictive power and reproducibility of the identified genes and SNPs. A simulation study and data analysis suggest satisfactory performance of the CTGDR method.
ASJC Scopus subject areas
- Statistics and Probability
- Molecular Biology
- Computer Science Applications
- Computational Theory and Mathematics
- Computational Mathematics