Session Information
Date: Sunday, November 12, 2023
Title: (0176–0195) Healthcare Disparities in Rheumatology Poster I: Lupus
Session Type: Poster Session A
Session Time: 9:00AM-11:00AM
Background/Purpose: Social determinants of health (SDoH) such as financial insecurity contribute to disparities in rheumatic disease care and outcomes but are not routinely included in structured electronic health record (EHR) data (e.g., ICD-10 billing codes). SDoH described in clinical notes are not readily extractable and therefore cannot be easily incorporated into research studies. We leveraged natural language processing (NLP) to extract terms related to financial insecurity and used machine learning models to develop and validate an algorithm to identify individuals with this critical SDoH.
Methods: We randomly selected 600 patients from 20,395 with rheumatic or musculoskeletal conditions enrolled in an integrated care management program (iCMP) between 1/1/12 and 10/18/21. iCMP provides care for medically and psychosocially complex patients. The study team (social epidemiologists, pediatric and adult rheumatologists, bioinformaticians) defined the construct “financial insecurity” using the nominal group technique. Reviewers (MTC, SU, CHF) operationalized this definition with manual EHR reviews to establish the gold standard. Individuals were classified as having definite, possible, or no financial insecurity in separate training and validation cohorts. We constructed a context-driven lexicon containing terms for financial insecurity using data from PubMed, the Unified Medical Language System, and previous EHR reviews (Table 1). All available notes were then processed using NLP with the context-driven lexicon. We developed models using logistic regression, LASSO regression, and random forest, trained on the EHR review-based classification of financial insecurity (definite alone, or definite and possible combined), and determined the performance metrics for each model.
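As an illustration of the lexicon-based note processing described in the Methods, the following is a minimal sketch of counting lexicon term mentions per patient. It is not the study's NLP pipeline: the lexicon entries, the function name `count_lexicon_terms`, and the example notes are hypothetical, and the actual lexicon (Table 1) is not reproduced here.

```python
import re
from collections import Counter

# Hypothetical lexicon entries for illustration only; the study's
# actual context-driven lexicon (Table 1) is not shown in this abstract.
LEXICON = ["financial insecurity", "cannot afford", "unable to pay", "food stamps"]


def count_lexicon_terms(notes, lexicon=LEXICON):
    """Count case-insensitive, whole-phrase mentions of each lexicon term
    across all free-text notes for one patient."""
    patterns = {
        term: re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        for term in lexicon
    }
    counts = Counter()
    for note in notes:
        for term, pattern in patterns.items():
            counts[term] += len(pattern.findall(note))
    return counts


# Hypothetical notes for a single patient.
example_notes = [
    "Patient cannot afford medication. Cannot afford copay.",
    "Receives food stamps.",
]
example_counts = count_lexicon_terms(example_notes)
```

Per-term counts of this kind could serve as features for the classification models. Plain phrase matching does not handle negation or context (e.g., "denies being unable to pay"), which a context-driven NLP approach would need to address.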
Results: Among 600 selected patients, we excluded 62 who lacked notes, a clear rheumatologic diagnosis, or confirmation of iCMP enrollment, leaving 538 for analysis. We processed 245,142 notes from the training (N=366) and validation (N=172) cohorts. Financial insecurity was present among 100 individuals (27%) in the training cohort and 63 (37%) in the validation cohort (Table 2). Models classifying the presence of financial insecurity performed similarly across all three algorithms (logistic regression, LASSO, random forest), with logistic regression models achieving the overall highest positive predictive value (PPV) of 0.98 (Table 3). The logistic regression models had specificities ranging from 0.94-0.98, sensitivities ranging from 0.27-0.54, and PPVs of 0.89-0.91. LASSO regression models had specificities ranging from 0.98-0.99, sensitivities of 0.20-0.29, and PPVs of 0.90-0.95. The random forest models had specificities ranging from 0.96-0.98, sensitivities of 0.29-0.48, and PPVs of 0.90-0.94.
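For readers less familiar with the performance metrics reported above, the following is a minimal sketch of how sensitivity, specificity, and PPV are computed from gold-standard labels and model predictions. The function name `classification_metrics` and the example labels are hypothetical and are not study data.

```python
def classification_metrics(y_true, y_pred):
    """Compute sensitivity, specificity, and PPV for binary labels
    (1 = financial insecurity present, 0 = absent)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives

    def ratio(numerator, denominator):
        # Return NaN when a metric is undefined (empty denominator).
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # share of true cases detected
        "specificity": ratio(tn, tn + fp),  # share of non-cases correctly excluded
        "ppv": ratio(tp, tp + fp),          # share of flagged patients who are true cases
    }


# Hypothetical labels: 4 true cases, 6 non-cases; the model flags 2 of the 4 cases.
gold = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
metrics = classification_metrics(gold, pred)
```

In this toy example the model misses half of the true cases (sensitivity 0.5) but raises no false alarms (specificity and PPV of 1.0), the same qualitative pattern as the high-specificity, lower-sensitivity models reported in the Results.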
Conclusion: Using a context-driven general lexicon for financial insecurity, NLP enabled the development of algorithms to classify individuals with terms or phrases indicative of financial insecurity in free-text EHR notes. These models with high positive predictive values could be leveraged to identify patients with this SDoH for future health equity interventions.
To cite this abstract in AMA style:
Chandler M, Cai T, Santacroce L, Ulysse S, Liao K, Feldman C. Classifying Individuals with Rheumatic Conditions as Financially Insecure Using Electronic Health Record Data and Natural Language Processing: Algorithm Derivation and Validation [abstract]. Arthritis Rheumatol. 2023; 75 (suppl 9). https://acrabstracts.org/abstract/classifying-individuals-with-rheumatic-conditions-as-financially-insecure-using-electronic-health-record-data-and-natural-language-processing-algorithm-derivation-and-validation/. Accessed .
ACR Meeting Abstracts - https://acrabstracts.org/abstract/classifying-individuals-with-rheumatic-conditions-as-financially-insecure-using-electronic-health-record-data-and-natural-language-processing-algorithm-derivation-and-validation/