Molecular Classification of Lung Cancer*: A Cross-Platform Comparison of Gene Expression Data Sets

Giovanni Parmigiani, PhD; Elizabeth Garrett, PhD; Ramaswamy Anbazhagan, MD, PhD; Edward Gabrielson, PhD

*From the Johns Hopkins University School of Medicine, Baltimore, MD.

Correspondence to: Edward Gabrielson, PhD, Department of Pathology and Oncology, Johns Hopkins School of Medicine, 418 N Bond St, Suite 301, Baltimore, MD 21231-1001

Chest. 2004;125(5_suppl):103S. doi:10.1378/chest.125.5_suppl.103S

Several recent studies have sought to refine the classification of lung cancer through gene expression profiling, using complementary DNA and oligonucleotide microarray platforms. To begin cross-validating and integrating the results of these studies, we developed statistical approaches that allow overall assessment of profile similarities, as well as comparison of individual genes for association with outcomes. Focusing on three lung cancer-profiling projects, we first compared the data from these studies for consistency of the coexpression relationships among pairs of genes. Within each study, we computed all possible pairwise correlations between genes, and we then computed the correlation of the resulting values across studies. Taking these relationships as a measure of general consistency across projects, we found that the correlation coefficients for pairs of studies ranged from 0.33 to 0.54. While this represents considerable variability across studies, the distribution of gene-pair correlations clearly indicates that subsets of “consistent” genes have reasonably similar coordinate expression patterns across studies.
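The coexpression comparison described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `coexpression_agreement`, the simulated data, and the 0.5 cutoff for calling a gene "consistent" are all assumptions made for illustration, and the sketch assumes the two expression matrices have already been matched so that row i refers to the same gene in both studies.

```python
import numpy as np

def coexpression_agreement(expr_a, expr_b):
    """Compare gene-gene coexpression between two studies.

    expr_a, expr_b: arrays of shape (genes, samples) whose rows are matched
    to the same genes. Sample counts may differ. Every gene is assumed to
    vary across samples (a constant row would give NaN correlations).
    """
    corr_a = np.corrcoef(expr_a)  # gene-by-gene correlations within study A
    corr_b = np.corrcoef(expr_b)  # gene-by-gene correlations within study B
    n_genes = corr_a.shape[0]

    # Overall agreement: correlate the gene-pair correlations across studies,
    # using each unordered pair once (upper triangle, diagonal excluded).
    upper = np.triu_indices(n_genes, k=1)
    overall = np.corrcoef(corr_a[upper], corr_b[upper])[0, 1]

    # Per-gene agreement: for gene g, correlate its correlations with all
    # other genes in study A against the same quantities in study B.
    per_gene = np.empty(n_genes)
    for g in range(n_genes):
        others = np.arange(n_genes) != g
        per_gene[g] = np.corrcoef(corr_a[g, others], corr_b[g, others])[0, 1]
    return overall, per_gene

# Simulated example: two "studies" sharing one latent expression program.
rng = np.random.default_rng(0)
n_genes = 50
loadings = rng.normal(size=(n_genes, 1))
study_a = loadings @ rng.normal(size=(1, 40)) + rng.normal(size=(n_genes, 40))
study_b = loadings @ rng.normal(size=(1, 60)) + rng.normal(size=(n_genes, 60))

overall, per_gene = coexpression_agreement(study_a, study_b)
consistent = per_gene > 0.5  # hypothetical cutoff, not taken from the abstract
```

The per-gene score is one plausible way to single out genes whose coexpression pattern is reproduced across studies; the abstract does not state which criterion was actually used.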

We then compared studies for associations of specific genes with outcomes, focusing first on the gene expression patterns reported by two of the projects to be associated with the differentiation of squamous cell carcinoma from adenocarcinoma. The correlation coefficient for this comparison using all expressed genes was 0.85, but when only the consistent genes identified by the analysis of coordinate gene expression were used, the overall unexplained variation between the studies was reduced by approximately 50%. Similarly, we compared all three data sets for relationships of gene expression to patient survival. Using all expressed genes in the three-way comparison, correlation coefficients of standardized Cox log hazard ratios ranged from 0.22 to 0.40. Using only the consistent genes, unexplained variability was reduced by approximately 25%. This systematic approach to comparing data allowed us to assess the consistency of results across different gene expression profiling projects and to recognize subsets of genes that could be reasonably compared for validation purposes.
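One way to read the "unexplained variation" figures is as 1 - r², where r is the between-study correlation of a gene-level statistic. The abstract does not define the quantity, so this reading is an assumption, and the function names below are hypothetical. The sketch compares that quantity for all genes and for a chosen subset:

```python
import numpy as np

def unexplained_variation(stat_a, stat_b, keep=None):
    """Return 1 - r^2 for gene-level statistics from two studies.

    stat_a, stat_b: 1-D arrays of per-gene statistics (for example, a
    squamous-vs-adenocarcinoma test statistic or a standardized Cox log
    hazard ratio), matched gene by gene.
    keep: optional boolean mask selecting a subset of genes.
    Assumption: "unexplained variation" is taken to mean 1 - r^2.
    """
    stat_a = np.asarray(stat_a, dtype=float)
    stat_b = np.asarray(stat_b, dtype=float)
    if keep is not None:
        stat_a, stat_b = stat_a[keep], stat_b[keep]
    r = np.corrcoef(stat_a, stat_b)[0, 1]
    return 1.0 - r ** 2

def relative_reduction(unexplained_all, unexplained_subset):
    """Fractional drop in unexplained variation when restricting to a subset."""
    return (unexplained_all - unexplained_subset) / unexplained_all
```

Under this assumption, a correlation of 0.85 leaves 1 - 0.85² ≈ 0.28 of the variation unexplained, and halving that would correspond to a correlation of roughly 0.93 among the consistent genes.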
