Accounting for EGFR Mutations in Epidemiologic Analyses of Non-Small Cell Lung Cancers: Examples Based on the International Lung Cancer Consortium Data.

Publication Title

Cancer epidemiology, biomarkers & prevention : a publication of the American Association for Cancer Research, cosponsored by the American Society of Preventive Oncology


Carcinoma, Non-Small-Cell Lung; ErbB Receptors; Humans; Lung Neoplasms; Mutation; Survival Analysis; washington; swedish cancer


BACKGROUND: Somatic EGFR mutations define a subset of non-small cell lung cancers (NSCLC) and have clinical impact on NSCLC risk and outcome. However, EGFR-mutation-status is often missing in epidemiologic datasets. We developed and tested pragmatic approaches to account for EGFR-mutation-status based on variables commonly included in epidemiologic datasets and evaluated the clinical utility of these approaches.

METHODS: Through analysis of the International Lung Cancer Consortium (ILCCO) epidemiologic datasets, we developed a regression model for EGFR-status. We then applied two approaches to ILCCO survival analyses that did and did not account for EGFR-status: a clinical-restriction approach using the optimal cut-point, and an epidemiologic multiple-imputation approach.

RESULTS: Of 35,356 ILCCO patients with NSCLC, EGFR-mutation-status was available in 4,231 patients. A model regressing known EGFR-mutation-status on clinical and demographic variables achieved a concordance index of 0.75 (95% CI, 0.74-0.77) in the training and 0.77 (95% CI, 0.74-0.79) in the testing dataset. At an optimal probability-score cut-point of 0.335, sensitivity was 69% and specificity was 72.5% for determining EGFR-wildtype status. In both restriction-based and imputation-based regression analyses of the role of BMI in overall survival of patients with NSCLC, results were similar between the overall and EGFR-mutation-negative cohort analyses of patients of all ancestries. However, our approach identified some differences: EGFR-mutated Asian patients did not incur a survival benefit from being obese, whereas EGFR-wildtype Asian patients did.
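
For readers unfamiliar with cut-point selection, the following is a minimal illustrative sketch, not the authors' implementation. It assumes the probability-score estimates the chance of an EGFR mutation, that patients scoring below the cut-point are classified as EGFR-wildtype, and that "optimal" means maximizing Youden's J (sensitivity + specificity - 1); the abstract does not state the criterion used. The function names and the toy data are hypothetical.

```python
def wildtype_sens_spec(scores, mutated, cutoff):
    """Sensitivity and specificity for calling EGFR-wildtype when score < cutoff.

    Assumption: a "positive" call means predicted wildtype.
    """
    pairs = list(zip(scores, mutated))
    # True wildtype patients correctly / incorrectly called
    tp = sum(1 for s, m in pairs if not m and s < cutoff)
    fn = sum(1 for s, m in pairs if not m and s >= cutoff)
    # True mutated patients correctly / incorrectly called
    tn = sum(1 for s, m in pairs if m and s >= cutoff)
    fp = sum(1 for s, m in pairs if m and s < cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def best_cutoff(scores, mutated):
    """Pick the observed score that maximizes Youden's J (assumed criterion)."""
    candidates = sorted(set(scores))
    return max(
        candidates,
        key=lambda c: sum(wildtype_sens_spec(scores, mutated, c)) - 1,
    )


# Toy data only; not ILCCO values.
toy_scores = [0.10, 0.20, 0.30, 0.55, 0.40, 0.60, 0.80]
toy_mutated = [False, False, False, False, True, True, True]

cutoff = best_cutoff(toy_scores, toy_mutated)
sensitivity, specificity = wildtype_sens_spec(toy_scores, toy_mutated, cutoff)
```

Under these assumptions, a restriction-based analysis would keep only patients whose score falls below the chosen cut-point and treat them as the presumed EGFR-wildtype cohort.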

CONCLUSIONS: We introduce a pragmatic method to evaluate the potential impact of EGFR-status on epidemiological analyses of NSCLC.

IMPACT: The proposed method is generalizable to the common situation in which EGFR-status data are missing.

Pulmonary Medicine