Neighborhood matters: Predicting lung cancer screening adherence with explainable AI.
Publication Title
Lung cancer (Amsterdam, Netherlands)
Document Type
Article
Publication Date
1-1-2026
Keywords
California; PLCMMC; Burbank; artificial intelligence; diversity
Abstract
Background: This study builds a predictive model for lung cancer screening (LCS) adherence using social determinants of health (SDOH) data in high-risk populations. By identifying key factors influencing non-adherence, we seek to improve risk stratification for individuals less likely to complete annual LCS follow-up scans within 15 months.
Methods: We recruited 188 minoritized individuals meeting high-risk smoking pack-year criteria who underwent their first low-dose computed tomography (LDCT) scan between 2017 and 2021 at four clinical centers in Los Angeles County. Participants completed an IRB-approved survey assessing demographics, tobacco use, social needs, discrimination, and lung cancer risk perception. Residential address at time of first LDCT was geocoded to match with neighborhood-level SDOH metrics. The data were split into training (N = 145) and testing (N = 43) cohorts by whether individuals received their initial LDCT by June 30, 2021. Electronic medical records were checked for LDCT follow-up within 15 months of initial LCS. Those who underwent the subsequent LDCT within 15 months of the initial LCS were considered adherent. We trained an XGBoost classifier with hyperparameter tuning and performed SHapley Additive exPlanations (SHAP) analysis to interpret model predictions.
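The adherence definition and the date-based cohort split described in the Methods can be sketched as follows. This is a minimal illustration, not the study's code. It assumes that "15 months" means calendar months, that the window boundary is inclusive, and that scans on or before June 30, 2021 form the training cohort (the abstract gives the cutoff but not which side is which). The helper names and the sample records are hypothetical. Model fitting and SHAP analysis are not shown.

```python
import calendar
from datetime import date

ADHERENCE_WINDOW_MONTHS = 15        # window stated in the abstract
SPLIT_CUTOFF = date(2021, 6, 30)    # split date stated in the abstract


def add_months(d, months):
    """Shift a date forward by whole calendar months, clamping the day
    to the last day of the target month (e.g. Mar 31 + 3 -> Jun 30)."""
    total = d.month - 1 + months
    year = d.year + total // 12
    month = total % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def is_adherent(first_scan, follow_up):
    """True if a follow-up LDCT occurred within 15 months of the first scan.
    Assumption: the deadline day itself counts as adherent."""
    if follow_up is None:
        return False
    deadline = add_months(first_scan, ADHERENCE_WINDOW_MONTHS)
    return first_scan < follow_up <= deadline


def assign_cohort(first_scan):
    """Assumption: initial scans on or before the cutoff go to training."""
    return "train" if first_scan <= SPLIT_CUTOFF else "test"


# Hypothetical records for illustration only; not study data.
records = [
    {"id": "A", "first_scan": date(2019, 3, 31), "follow_up": date(2020, 6, 30)},
    {"id": "B", "first_scan": date(2020, 1, 15), "follow_up": date(2021, 6, 1)},
    {"id": "C", "first_scan": date(2021, 8, 2), "follow_up": None},
]

labeled = [
    {
        "id": r["id"],
        "cohort": assign_cohort(r["first_scan"]),
        "adherent": is_adherent(r["first_scan"], r["follow_up"]),
    }
    for r in records
]
```

Splitting by date rather than at random means the held-out cohort consists of later patients, which is closer to how a deployed model would encounter new cases.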
Results: The cohort included 69 (37 %) Asian/Pacific Islander, 53 (28 %) Black/African American, and 49 (26 %) Hispanic/Latino participants. The LCS non-adherence rate was 66 %. The XGBoost classifier achieved an AUROC of 0.81 and AUPRC of 0.90, with prediction performance of accuracy = 0.79, recall = 0.78, specificity = 0.81, positive predictive value = 0.88, and negative predictive value = 0.68. SHAP analysis indicated that neighborhood-level SDOH factors, such as school proficiency and poverty levels, were more predictive of non-adherence than individual-level factors like smoking status.
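For readers less familiar with the reported threshold metrics, the sketch below shows how each one follows from a confusion matrix. It assumes non-adherence is the positive class. The counts are hypothetical: the abstract does not report a confusion matrix, and these are simply one set that sums to the testing cohort size (N = 43) and rounds to the reported values.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard threshold metrics computed from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "recall": tp / (tp + fn),          # sensitivity
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),             # positive predictive value
        "npv": tn / (tn + fn),             # negative predictive value
    }


# Hypothetical counts, NOT taken from the paper; positive = non-adherent.
counts = {"tp": 21, "fp": 3, "tn": 13, "fn": 6}

metrics = classification_metrics(**counts)
rounded = {name: round(value, 2) for name, value in metrics.items()}
```

With these illustrative counts, the high positive predictive value alongside a lower negative predictive value reflects a testing cohort in which non-adherent cases outnumber adherent ones.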
Conclusions: This machine learning approach accurately predicted LCS non-adherence using individual- and neighborhood-level SDOH factors. These findings emphasize the relevance of community-level characteristics in informing LCS adherence interventions and may support the development of regionally tailored strategies to improve adherence in high-risk populations.
Area of Special Interest
Cancer
Specialty/Research Institute
Oncology
Specialty/Research Institute
Pulmonary Medicine
DOI
10.1016/j.lungcan.2025.108882