Session Information
Session Type: ACR Poster Session A
Session Time: 9:00AM-11:00AM
Background/Purpose:
Electronic medical records (EMRs) are increasingly used for clinical research, where the phenotypes of interest are typically defined by algorithms. Almost a decade ago, a rheumatoid arthritis (RA) phenotype algorithm was developed using machine learning approaches applied to codified data and to narrative data extracted from the EMR with natural language processing (NLP). The objective of this study was to evaluate the temporal portability of this algorithm following the introduction of International Classification of Diseases, 10th Revision (ICD10) codes and a new EMR system (Epic) at our institution.
Methods:
We studied subjects from the EMRs of 2 large academic centers with ≥1 ICD9 RA code (714.x) or ICD10 RA code (M05.x, M06.x) and ≥2 clinical notes to create a database of all potential patients with RA (“RA Mart”, n = 52,728). A random sample of 100 subjects was selected from the RA Mart, and each subject was classified as RA yes/no by medical record review to create the validation set. We first calculated the performance characteristics of using ≥2 ICD9 RA codes, or ≥2 ICD9 or ICD10 RA codes, to define RA, compared with RA classified from chart review. We then applied a previously published logistic regression algorithm for RA that uses ICD9 codes and data extracted by NLP from the data fields specified in 2010; for example, this model would not include treatments approved after 2010. We then applied a modified algorithm that incorporated ICD10 codes and additional medications into the existing variable fields, e.g., the number of RA ICD9 codes became the number of ICD9 or ICD10 RA codes. We compared the performance characteristics of the original 2010 RA algorithm with those of the modified algorithm, using the originally published positive predictive value (PPV) as a benchmark.
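The code-count rules and their comparison against chart review can be sketched roughly as follows. This is a minimal illustration, not the study's implementation: the function names, the regular expressions, and the toy records are all hypothetical, and it assumes that every occurrence of a qualifying code in a subject's record counts toward the ≥2 threshold.

```python
import re
from typing import Dict, Iterable, Optional, Sequence

# Code families named in the Methods: ICD9 714.x, ICD10 M05.x and M06.x.
# The exact patterns are an assumption about how codes are written in the data.
ICD9_RA = re.compile(r"^714(\.\d+)?$")
ICD10_RA = re.compile(r"^M0[56](\.\w+)?$")


def count_ra_codes(codes: Iterable[str], include_icd10: bool = True) -> int:
    """Count RA code occurrences; optionally ignore ICD10 codes."""
    n = 0
    for code in codes:
        c = code.strip().upper()
        if ICD9_RA.match(c) or (include_icd10 and ICD10_RA.match(c)):
            n += 1
    return n


def rule_positive(codes: Iterable[str], include_icd10: bool = True,
                  min_codes: int = 2) -> bool:
    """Classify a subject as RA if they have at least `min_codes` RA codes."""
    return count_ra_codes(codes, include_icd10) >= min_codes


def _ratio(num: int, den: int) -> Optional[float]:
    # Return None instead of dividing by zero when a denominator is empty.
    return num / den if den else None


def performance(predicted: Sequence[bool],
                chart_review: Sequence[bool]) -> Dict[str, Optional[float]]:
    """Sensitivity, specificity, and PPV against the chart-review label."""
    pairs = list(zip(predicted, chart_review))
    tp = sum(1 for p, t in pairs if p and t)
    fp = sum(1 for p, t in pairs if p and not t)
    fn = sum(1 for p, t in pairs if not p and t)
    tn = sum(1 for p, t in pairs if not p and not t)
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
    }


if __name__ == "__main__":
    # Hypothetical toy records: (codes in the EMR, RA by chart review).
    subjects = [
        (["714.0", "714.0", "401.9"], True),
        (["M05.79", "M06.9"], True),
        (["714.0"], True),
        (["714.9", "M06.9"], False),
        (["M06.9"], False),
    ]
    truth = [label for _, label in subjects]
    for include_icd10 in (False, True):
        preds = [rule_positive(codes, include_icd10) for codes, _ in subjects]
        print("ICD9 or ICD10" if include_icd10 else "ICD9 only",
              performance(preds, truth))
```

The `include_icd10` switch mirrors the change described for the modified algorithm, where a count of ICD9 RA codes becomes a count of ICD9 or ICD10 RA codes within the same variable field.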
Results:
In the validation set, 41% of subjects were classified as RA. Among those with RA, mean age was 68 years, 76% were female, and 59% were RF or anti-CCP positive; 7% of subjects had only ICD10 codes and no ICD9 codes. The PPV for classifying RA using ≥2 ICD9 codes was 50%, and the PPV using ≥2 ICD9 or ICD10 codes was 52% (Table). Using the exact data fields specified in the 2010 algorithm, we achieved a PPV of 93%. When the data fields were updated with new types of data (ICD10 codes and new treatments), the PPV remained at 93%. In comparison, the published PPV of the algorithm was 94%, with a sensitivity of 63%.
Conclusion:
We observed that an existing RA algorithm trained using machine learning approaches on EMR data was temporally robust despite the introduction of new medical information, which was incorporated into the algorithm's data fields. At this time, including ICD10 codes had a minimal impact on classification. The existing RA algorithm continued to perform significantly better at classifying RA than ICD9 or ICD10 codes alone.
Table. Performance characteristics of the published algorithm and the modified algorithm for identifying individuals with RA, compared with codified data alone (n = 100).
| | ≥2 ICD9 RA codes | ≥2 ICD9 or ICD10 RA codes | Published algorithm | Modified algorithm |
|---|---|---|---|---|
| Sensitivity | 0.80 | 0.93 | 0.68 | 0.66 |
| Specificity | 0.44 | 0.41 | 0.97 | 0.97 |
| PPV | 0.50 | 0.52 | 0.93 | 0.93 |
RA: rheumatoid arthritis
PPV: positive predictive value
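As a consistency check on the Table, PPV can be recomputed from sensitivity, specificity, and the share of validation subjects with RA on chart review (41%) using Bayes' rule. The sketch below is illustrative only: the function name is hypothetical, the inputs are the rounded values reported in the abstract, and the implied PPVs are therefore approximate rather than a reproduction of the authors' calculation.

```python
def ppv_from_rates(sensitivity: float, specificity: float,
                   prevalence: float) -> float:
    """PPV = sens*prev / (sens*prev + (1 - spec)*(1 - prev))."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Share of the validation set classified as RA by chart review (from Results).
PREVALENCE = 0.41

# (sensitivity, specificity, reported PPV), copied from the Table.
TABLE = {
    ">=2 ICD9 RA codes": (0.80, 0.44, 0.50),
    ">=2 ICD9 or ICD10 RA codes": (0.93, 0.41, 0.52),
    "Published algorithm": (0.68, 0.97, 0.93),
    "Modified algorithm": (0.66, 0.97, 0.93),
}

if __name__ == "__main__":
    for name, (sens, spec, reported) in TABLE.items():
        implied = ppv_from_rates(sens, spec, PREVALENCE)
        print(f"{name}: reported PPV {reported:.2f}, implied PPV {implied:.3f}")
```

With these rounded inputs, the implied PPVs land within about 0.01 of the reported values, which is the size of discrepancy expected from rounding alone. The same formula shows why the code-count rules have a low PPV despite high sensitivity: at specificity near 0.4, false positives among the 59% of subjects without RA roughly equal the true positives.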
To cite this abstract in AMA style:
Huang S, Huang J, Cai T, Dahal KP, Cagan A, Stratton J, Cai T, Liao KP. Impact of International Classification of Diseases 10th Revision Codes and Updated Medical Information on an Existing Rheumatoid Arthritis Phenotype Algorithm Using Electronic Medical Data [abstract]. Arthritis Rheumatol. 2018; 70 (suppl 9). https://acrabstracts.org/abstract/impact-of-international-classification-of-diseases-10th-revision-codes-and-updated-medical-information-on-an-existing-rheumatoid-arthritis-phenotype-algorithm-using-electronic-medical-data/. Accessed .
2018 ACR/ARHP Annual Meeting
ACR Meeting Abstracts - https://acrabstracts.org/abstract/impact-of-international-classification-of-diseases-10th-revision-codes-and-updated-medical-information-on-an-existing-rheumatoid-arthritis-phenotype-algorithm-using-electronic-medical-data/