Session Information
Session Type: Poster Session A
Session Time: 8:30AM-10:30AM
Background/Purpose: Electronic Health Records (EHRs) store nearly all clinical data in one central location, providing increased accessibility, accuracy, and security. At our institution, the Synthetic Derivative (SD) is a de-identified EHR containing over 3.2 million records that links to a DNA biorepository, BioVU. The objective of this study was to develop an algorithm that identifies a cohort of juvenile idiopathic arthritis (JIA) patients in the EHR with high sensitivity and specificity. Within this cohort, we describe the characteristics of the identified JIA patients.
Methods: We developed our cohort by searching the SD for International Classification of Diseases, Ninth Revision (ICD-9) and Tenth Revision, Clinical Modification (ICD-10-CM) codes and for keywords. A priori, ICD-9 and ICD-10-CM codes and keywords clinically relevant to juvenile arthritis were selected (Table 1). Keywords were identified by a survey of pediatric rheumatologists at a single site. Selected keywords appeared in at least 75% of survey responses and returned fewer than 3,000 cases in individual searches of the SD. Uveitis returned >3,000 cases in the SD but was retained because it appeared repeatedly in confirmed JIA charts. We then combined the ICD codes with the keywords to form candidate algorithms. Algorithms varied the required ICD code count and combined keywords with “and” or “or” functions. A training set of 200 random charts was drawn from a search requiring ≥1 count of the ICD-9 and ICD-10-CM JIA codes. Case status was determined by a pediatric rheumatologist, who required a rheumatology clinic note documenting a JIA diagnosis before age 20. Positive predictive values (PPVs), sensitivities, and F-scores were calculated for each algorithm. The F-score is the harmonic mean of the PPV and sensitivity, F = 2 × PPV × sensitivity / (PPV + sensitivity), and is frequently used in bioinformatics because it accounts for both measures.
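The metric calculations described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and the chart counts in the usage lines are hypothetical and are not results from this study.

```python
def ppv(true_pos: int, false_pos: int) -> float:
    """Positive predictive value: confirmed cases among charts the algorithm flags."""
    flagged = true_pos + false_pos
    return true_pos / flagged if flagged else 0.0


def sensitivity(true_pos: int, false_neg: int) -> float:
    """Sensitivity: flagged cases among all confirmed cases in the reviewed set."""
    cases = true_pos + false_neg
    return true_pos / cases if cases else 0.0


def f_score(ppv_value: float, sens_value: float) -> float:
    """Harmonic mean of PPV and sensitivity."""
    total = ppv_value + sens_value
    return 2 * ppv_value * sens_value / total if total else 0.0


# Hypothetical counts for illustration only (not taken from the abstract).
tp, fp, fn = 90, 10, 20
example_ppv = ppv(tp, fp)
example_sens = sensitivity(tp, fn)
example_f = f_score(example_ppv, example_sens)
print(f"PPV={example_ppv:.2f} sensitivity={example_sens:.2f} F={example_f:.2f}")
```

Because the harmonic mean is pulled toward the lower of its two inputs, an algorithm cannot reach a high F-score by maximizing PPV at the expense of sensitivity, or the reverse.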
Results: We analyzed 21 algorithms and ranked them by F-score. Our highest performing algorithm required ≥4 ICD-9 or ICD-10-CM code counts and the presence of any of the selected keywords. It identified 1,514 patients and produced an F-score of 0.87. Other high performing algorithms used similar search methods (Table 2). Demographic data and age at first diagnosis in the EHR were analyzed for the cohort identified by the highest performing algorithm. Our JIA population was 72% female and 81% Caucasian. Approximately 20% of the JIA population had an EHR diagnosis in the first 3 years of life, and 84% had an EHR diagnosis before 16 years of age. The ICD code that appeared most frequently in our cohort was the ICD-9 code “Polyarticular juvenile rheumatoid arthritis, chronic or unspecified” (714.30).
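The form of the highest performing rule (a minimum ICD code count combined with a keyword match) can be sketched as below. This is an assumption-laden illustration: the `PatientRecord` structure, the `matches_algorithm` function, the substring matching, and the keyword list are all hypothetical. The abstract does not describe how the SD search was implemented, and only "uveitis" is named as a keyword in the text; the full list is in Table 1.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class PatientRecord:
    """Hypothetical record: count of JIA ICD-9/ICD-10-CM codes plus free text."""
    icd_code_count: int
    note_text: str


def matches_algorithm(
    record: PatientRecord,
    keywords: Sequence[str],
    min_code_count: int = 4,
    mode: str = "or",
) -> bool:
    """Return True if the record meets the code-count threshold and keyword rule.

    mode="or" requires any keyword to be present; mode="and" requires all.
    """
    if mode not in ("or", "and"):
        raise ValueError("mode must be 'or' or 'and'")
    if not keywords:
        raise ValueError("at least one keyword is required")
    text = record.note_text.lower()
    hits = [kw.lower() in text for kw in keywords]
    keyword_ok = any(hits) if mode == "or" else all(hits)
    return record.icd_code_count >= min_code_count and keyword_ok


# Placeholder keywords: "uveitis" appears in the abstract, the second is assumed.
example_keywords = ["uveitis", "juvenile idiopathic arthritis"]
example = PatientRecord(icd_code_count=5, note_text="History of uveitis.")
print(matches_algorithm(example, example_keywords))
```

Varying `min_code_count` and `mode` corresponds to the way the candidate algorithms differed in required ICD code counts and in combining keywords with "and" or "or".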
Conclusion: We have developed algorithms for accurately identifying JIA patients in the EHR. Combining ICD-9 codes, ICD-10-CM codes, and keywords produced a more sensitive and specific cohort than using ICD-9 or ICD-10-CM codes alone. Requiring multiple instances of the ICD-9 or ICD-10-CM codes also improved algorithm performance. Our identified cohort reveals demographic and diagnosis patterns among the JIA patients in our region. Assembling an EHR-based JIA cohort will enable longitudinal, de-identified chart review and linkage to the BioVU DNA repository for future studies.
To cite this abstract in AMA style:
Peterson H, Barnado A, Patrick A. Developing Electronic Health Record Algorithms That Accurately Identify Patients with Juvenile Idiopathic Arthritis [abstract]. Arthritis Rheumatol. 2021; 73 (suppl 9). https://acrabstracts.org/abstract/developing-electronic-health-record-algorithms-that-accurately-identify-patients-with-juvenile-idiopathic-arthritis/. Accessed .
ACR Meeting Abstracts - https://acrabstracts.org/abstract/developing-electronic-health-record-algorithms-that-accurately-identify-patients-with-juvenile-idiopathic-arthritis/