Session Information
Session Type: Poster Session A
Session Time: 9:00AM-11:00AM
Background/Purpose: Administrative claims databases represent important settings for studying large populations with juvenile idiopathic arthritis (JIA), but prior efforts to validate diagnostic algorithms for JIA using administrative data have been limited to potentially non-generalizable settings (e.g., specific clinics or health care systems). We aimed to develop and validate algorithms for new diagnoses of JIA in a large claims database using rule-based and machine learning-based approaches.
Methods: We performed a cross-sectional validation study using US commercial health plan data (2013-2020). We identified children diagnosed with JIA (ICD-9-CM: 696.0, 714, 720; ICD-10-CM: L40.5, M05, M06, M08, M45) before age 18 following ≥12 months of baseline continuous enrollment without JIA diagnosis or immunosuppression. JIA diagnoses were based on 3 previously validated definitions: 1) rheumatologist’s diagnosis plus orders for ≥2 specific laboratory tests; 2) ≥2 outpatient diagnoses 8-52 weeks apart; or 3) 1 inpatient diagnosis. Charts from a random subset of subjects meeting each definition were abstracted and independently adjudicated by clinical experts; discrepancies were resolved by a third expert or, where necessary, consensus. Incident JIA was defined as definite or probable JIA diagnosed in the prior 4 months. Using data from 1 year before through 1 year after first JIA diagnosis, we then created candidate predictor variables from demographics, diagnoses, medications, procedures, and specialty of clinicians diagnosing JIA. After applying a simulation-based balancing method (Synthetic Minority Oversampling Technique, SMOTE), we selected optimal logistic regression regularization hyperparameters using 10-fold cross-validation. Model variables were used to score observations, and sensitivity, specificity, and positive predictive value (PPV) [95% confidence interval (CI)] were assessed at different thresholds of predicted JIA probability.
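The class-balancing step named above (SMOTE) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `smote`, its parameters, and the toy feature rows are hypothetical, and the sketch covers only the core idea of creating synthetic minority-class rows by interpolating between a minority row and one of its nearest minority neighbours.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create synthetic minority-class rows (illustrative SMOTE sketch).

    Each synthetic row lies on the line segment between a randomly chosen
    minority row and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority rows")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority rows by Euclidean distance to the base row.
        others = [row for j, row in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda row: math.dist(base, row))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: three minority-class rows with two numeric features.
minority_rows = [[1.0, 0.0], [2.0, 1.0], [3.0, 0.0]]
new_rows = smote(minority_rows, n_synthetic=4, k=2, seed=42)
```

In a workflow like the one described, oversampling of this kind would normally be applied only to the training folds during cross-validation, so that synthetic rows do not leak into the held-out fold; the abstract does not state how this was handled.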
Results: Of 182 eligible charts reviewed (92 ICD-9-based, 90 ICD-10-based), 133 had definite/probable JIA (ICD-9 64%, ICD-10 82%). Of these JIA diagnoses, 90 were incident (ICD-9 90%, ICD-10 50%). Rule-based algorithms had limited PPV for incident JIA (ICD-9 58%, ICD-10 41%) (Table). Machine learning-based algorithms discriminated well between incident and prevalent JIA (ICD-9 AUC 0.97, ICD-10 AUC 0.88) and between incident JIA and unlikely JIA (ICD-9 AUC 0.99, ICD-10 AUC 0.94) (Figure 1). Specific predicted probability thresholds yielded excellent test characteristics for differentiating incident JIA from unlikely JIA (ICD-9: sensitivity 95%, specificity 96%, PPV 96% [95% CI 96-100%]; ICD-10: sensitivity 81%, specificity 92%, PPV 91% [95% CI 84-97%]) (Figure 2).
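The threshold-based test characteristics reported above can be reproduced in principle with a short calculation. The sketch below is illustrative only: the function names, the toy labels and probabilities, and the use of a Wilson score interval are assumptions, since the abstract does not state which confidence interval method was used.

```python
import math


def threshold_metrics(y_true, y_prob, threshold):
    """Sensitivity, specificity, and PPV when predicting positive at p >= threshold."""
    tp = fp = tn = fn = 0
    for truth, p in zip(y_true, y_prob):
        predicted = p >= threshold
        if predicted and truth:
            tp += 1
        elif predicted and not truth:
            fp += 1
        elif not predicted and truth:
            fn += 1
        else:
            tn += 1

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "tp": tp,
        "fp": fp,
    }


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (assumed method, about 95% for z=1.96)."""
    if n == 0:
        return (float("nan"), float("nan"))
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return (max(0.0, centre - half), min(1.0, centre + half))


# Hypothetical toy data: 1 = incident JIA, 0 = unlikely JIA.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_prob = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2, 0.1, 0.05]
metrics = threshold_metrics(y_true, y_prob, threshold=0.5)
ppv_ci = wilson_interval(metrics["tp"], metrics["tp"] + metrics["fp"])
```

Repeating the calculation over a grid of thresholds shows the trade-off between sensitivity and PPV that underlies the choice of a specific cut-off.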
Conclusion: Machine learning-based diagnostic algorithms for incident JIA enhanced traditional rule-based algorithms in identifying new diagnoses of JIA using ICD-9 and ICD-10 codes within a large US claims database. External validation of these models is warranted, but these algorithms will facilitate use of administrative data to study JIA diagnosis, management, and outcomes in large populations.
To cite this abstract in AMA style:
Hoffman P, Parlett L, Beachler D, Reiff D, McGuire S, Pothraj S, Moorthy L, Salvant C, Koffman D, Rege S, Huang C, Iozzio M, Schott K, Haynes K, Davidow A, Crystal S, Gerhard T, Strom B, Rose C, Horton D. Validation of Claims-based Algorithms for Newly Diagnosed Juvenile Idiopathic Arthritis Using Machine Learning Methods [abstract]. Arthritis Rheumatol. 2023; 75 (suppl 9). https://acrabstracts.org/abstract/validation-of-claims-based-algorithms-for-newly-diagnosed-juvenile-idiopathic-arthritis-using-machine-learning-methods/. Accessed .
ACR Meeting Abstracts - https://acrabstracts.org/abstract/validation-of-claims-based-algorithms-for-newly-diagnosed-juvenile-idiopathic-arthritis-using-machine-learning-methods/