Neighborhood matters: Predicting lung cancer screening adherence with explainable AI.

Publication Title

Lung cancer (Amsterdam, Netherlands)

Document Type

Article

Publication Date

1-1-2026

Keywords

california; plcmmc; burbank; artificial intelligence; diversity

Abstract

Background: This study builds a predictive model for lung cancer screening (LCS) adherence using social determinants of health (SDOH) data in high-risk populations. By identifying key factors influencing non-adherence, we seek to improve risk stratification for individuals less likely to complete annual LCS follow-up scans within 15 months.

Methods: We recruited 188 minoritized individuals meeting high-risk smoking pack-year criteria who underwent their first low-dose computed tomography (LDCT) scan between 2017 and 2021 at four clinical centers in Los Angeles County. Participants completed an IRB-approved survey assessing demographics, tobacco use, social needs, discrimination, and lung cancer risk perception. Residential address at the time of the first LDCT was geocoded to match with neighborhood-level SDOH metrics. The data were split into training (N = 145) and testing (N = 43) cohorts by whether individuals received their initial LDCT by June 30, 2021. Electronic medical records were checked for LDCT follow-up within 15 months of initial LCS. Those who underwent the subsequent LDCT within 15 months of the initial LCS were considered adherent. We trained an XGBoost classifier with hyperparameter tuning and performed SHapley Additive exPlanations (SHAP) analysis to interpret model predictions.
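
The adherence definition and the date-based split described in the Methods can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code. The helper names (`add_months`, `is_adherent`, `assign_cohort`) are hypothetical, "15 months" is interpreted here as 15 calendar months, and it is an assumption that participants scanned on or before June 30, 2021 form the training cohort, since the abstract does not state the direction of the split.

```python
import calendar
from datetime import date
from typing import Optional

SPLIT_CUTOFF = date(2021, 6, 30)   # cutoff date given in the abstract
FOLLOW_UP_MONTHS = 15              # adherence window given in the abstract


def add_months(d: date, months: int) -> date:
    """Shift a date forward by whole calendar months, clamping the day
    to the last day of the target month (e.g. Nov 30 + 15 -> Feb 28)."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def is_adherent(first_scan: date, follow_up_scan: Optional[date]) -> bool:
    """True if a follow-up LDCT exists within the 15-month window.
    Assumption: the window is inclusive of its final day."""
    if follow_up_scan is None:
        return False
    deadline = add_months(first_scan, FOLLOW_UP_MONTHS)
    return first_scan < follow_up_scan <= deadline


def assign_cohort(first_scan: date) -> str:
    """Assumption: earlier scans train the model, later scans test it."""
    return "train" if first_scan <= SPLIT_CUTOFF else "test"


# Hypothetical participants, for illustration only (not study data).
example = [
    (date(2020, 1, 15), date(2021, 4, 15)),  # follow-up on the last allowed day
    (date(2020, 1, 15), date(2021, 4, 16)),  # one day late
    (date(2021, 7, 1), None),                # no follow-up on record
]
labels = [(assign_cohort(first), is_adherent(first, follow)) for first, follow in example]
```

Splitting by date rather than at random means the held-out cohort consists of later participants, which is closer to how such a model would be applied to new patients.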

Results: The cohort included 69 (37 %) Asian/Pacific Islander, 53 (28 %) Black/African American, and 49 (26 %) Hispanic/Latino participants. The LCS non-adherence rate was 66 %. The XGBoost classifier achieved an AUROC of 0.81 and AUPRC of 0.90, with prediction performance of accuracy = 0.79, recall = 0.78, specificity = 0.81, positive predictive value = 0.88, and negative predictive value = 0.68. SHAP analysis indicated that neighborhood-level SDOH factors, such as school proficiency and poverty levels, were more predictive of non-adherence than individual-level factors like smoking status.
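
For readers less familiar with the threshold-based metrics reported above, the sketch below shows how they derive from a 2x2 confusion matrix, assuming non-adherence is treated as the positive class. The abstract does not report a confusion matrix. The counts used here are hypothetical: they are one set of counts for N = 43 that happens to round to the reported values, and they should not be read as the study's actual results. The function name `classification_metrics` is likewise illustrative.

```python
def classification_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Compute standard binary classification metrics from confusion counts.
    Positive class is assumed to be 'non-adherent'."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "recall": tp / (tp + fn),        # sensitivity
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),           # positive predictive value (precision)
        "npv": tn / (tn + fn),           # negative predictive value
    }


# Hypothetical counts (NOT from the paper), chosen to sum to the test size of 43.
counts = {"tp": 21, "fn": 6, "tn": 13, "fp": 3}
metrics = classification_metrics(**counts)

for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With these illustrative counts the five values are 34/43, 21/27, 13/16, 21/24, and 13/19, which round to 0.79, 0.78, 0.81, 0.88, and 0.68. AUROC and AUPRC cannot be derived this way because they depend on the full ranking of predicted probabilities rather than a single threshold.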

Conclusions: This machine learning approach accurately predicted LCS non-adherence using individual- and neighborhood-level SDOH factors. These findings emphasize the relevance of community-level characteristics in informing LCS adherence interventions and may support the development of regionally tailored strategies to improve adherence in high-risk populations.

Area of Special Interest

Cancer

Specialty/Research Institute

Oncology

Specialty/Research Institute

Pulmonary Medicine

DOI

10.1016/j.lungcan.2025.108882
