Session Information
Session Type: Abstract Submissions (ACR)
Background/Purpose:
Gout flares are the most common manifestation of gout, but they are poorly documented by diagnosis codes and therefore not adequately captured in retrospective database analyses. A specific code for acute gouty flare (ICD9 274.01) became available in 2009 but has yet to be widely adopted. We implemented a computer-based method to automatically identify gout flares using Natural Language Processing (NLP) and Machine Learning (ML). NLP aims to interpret human language and makes the information in free text usable. ML trains a computer system on data so that it can then make decisions on new data.
Methods:
A retrospective review of records from the Kaiser Permanente Southern California Region was conducted for the period 1/1/2007 to 12/31/2010. Patients > 18 years of age with a diagnosis of gout (ICD9 274.xx) who were on urate-lowering therapy were identified, and 599,317 notes for 16,519 patients were retrieved. A training dataset of 1,264 notes was created from 100 randomly selected patients; a rheumatologist reviewed these notes and classified each one as gout flare or not. A further 1,192 notes from another 100 randomly selected patients were independently reviewed by two rheumatologists to create the gold standard. The NLP algorithm used a list of key words and phrases to capture different aspects of gout flares. The NLP results served as features for the ML system, which helped to achieve better specificity without significant loss in sensitivity. Both the NLP and ML algorithms were developed using the training dataset and then applied to all of the notes. Gout flares were also identified from claims data as proposed in published literature, and these results were likewise compared with the gold standard.
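The idea of turning keyword matches into features for a downstream classifier can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not publish its keyword list or name its ML model, so the `KEYWORD_GROUPS` patterns and the `extract_features` helper are hypothetical.

```python
import re

# Hypothetical keyword groups. The abstract does not publish its actual
# list of key words and phrases, so these patterns are assumptions.
KEYWORD_GROUPS = {
    "flare_term": [r"\bgout(?:y)? (?:flare|attack)", r"\bacute gout", r"\bpodagra\b"],
    "symptom": [r"\bswelling\b", r"\bswollen\b", r"\berythema\b", r"\bsevere pain\b"],
    "negation": [r"\bno (?:gout )?(?:flare|attack)s?\b", r"\bdenies\b"],
}

# Compile once so every note is scanned with the same case-insensitive patterns.
COMPILED = {
    name: [re.compile(pattern, re.IGNORECASE) for pattern in patterns]
    for name, patterns in KEYWORD_GROUPS.items()
}


def extract_features(note):
    """Count keyword matches per group for one clinical note."""
    return {
        name: sum(1 for pattern in patterns for _ in pattern.finditer(note))
        for name, patterns in COMPILED.items()
    }


flare_note = "Patient reports acute gout flare in left toe with swelling and severe pain."
stable_note = "Gout is well controlled; denies pain, no flares since last visit."

print(extract_features(flare_note))
print(extract_features(stable_note))
```

Count vectors like these could be passed to any standard classifier. Keeping negation cues as a separate feature is one plausible way a learned model could reject notes that mention a flare only to rule it out, which would fit the reported gain in specificity.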
Results:
Out of the 599,317 notes, the NLP system identified 49,415 as gout flare notes. The ML system further classified these into 18,869 positive and 30,546 negative cases. Flares occurring within 30 days of each other were merged. Among the 16,519 patients, the NLP+ML system identified 1,402 patients with >= 3 flares, 5,954 with 1-2 flares, and 9,163 with no flare. Our method identified significantly more flare cases (18,869 vs. 7,861) and patients (7,356 vs. 5,458) than the method using claims data.
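The 30-day merging step and the per-patient flare categories can be illustrated with a short sketch. The abstract does not say whether the window is measured from the first or the most recent note in an episode; the code below assumes the latter (chained merging), and the function names and example dates are hypothetical.

```python
from datetime import date, timedelta


def merge_flares(flare_dates, window_days=30):
    """Collapse flare-positive note dates into distinct flare episodes.

    Assumption: a note opens a new episode only when it falls more than
    `window_days` after the previous flare-positive note.
    """
    episodes = []
    previous = None
    for current in sorted(flare_dates):
        if previous is None or current - previous > timedelta(days=window_days):
            episodes.append(current)  # first note of a new episode
        previous = current
    return episodes


def flare_category(n_episodes):
    """Map an episode count to the three patient groups in the abstract."""
    if n_episodes >= 3:
        return ">= 3 flares"
    if n_episodes >= 1:
        return "1-2 flares"
    return "no flare"


# Made-up note dates for one patient, not study data.
note_dates = [date(2009, 1, 5), date(2009, 1, 20), date(2009, 2, 15), date(2009, 6, 1)]
episodes = merge_flares(note_dates)
print(len(episodes), flare_category(len(episodes)))
```

The reported patient counts are internally consistent: 1,402 + 5,954 = 7,356 patients with at least one flare, and adding the 9,163 patients with no flare gives the full cohort of 16,519.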
On the task of identifying flares in individual notes, our method achieved a sensitivity of 84.8% and a specificity of 92.2%. On the task of identifying patients who had flares, it achieved a sensitivity of 98.5% and a specificity of 96.4%. On the task of identifying patients who had 3 or more flares, it achieved a sensitivity of 93.5% and a specificity of 84.6%. The NLP+ML method was consistently better than the claims data approach, and it even outperformed the two rheumatologist reviewers on the task of identifying patients with gout flares.
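For reference, sensitivity and specificity are computed from predicted labels and gold-standard labels as shown in this minimal sketch. The label lists are toy data invented for illustration, not the study's results, and `sensitivity_specificity` is a hypothetical helper name.

```python
def sensitivity_specificity(predicted, gold):
    """Return (sensitivity, specificity) for binary labels (1 = flare, 0 = no flare)."""
    tp = fn = tn = fp = 0
    for p, g in zip(predicted, gold):
        if g == 1 and p == 1:
            tp += 1  # flare correctly identified
        elif g == 1 and p == 0:
            fn += 1  # flare missed
        elif g == 0 and p == 0:
            tn += 1  # non-flare correctly rejected
        else:
            fp += 1  # non-flare wrongly flagged
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("gold standard must contain both positive and negative cases")
    return tp / (tp + fn), tn / (tn + fp)


# Toy labels, invented for illustration only.
gold = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

sens, spec = sensitivity_specificity(pred, gold)
print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
```

Sensitivity is the share of true flares that the method catches, TP / (TP + FN); specificity is the share of non-flares that it correctly rejects, TN / (TN + FP).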
Conclusion:
A combination of NLP and ML can accurately and efficiently identify patients with gout flares. This is the first successful use of a computer-based algorithm to identify gout flares from clinical notes. It demonstrates that NLP could be a valuable tool for identifying populations of patients who experience gout flares and may need more intensive therapy, ultimately getting them to serum urate goal and helping to reduce the overall costs associated with gout flares.
Disclosure:
C. Zheng, None; N. Rashid, None; T. C. Cheetham, None; Y. L. Wu, None; G. D. Levy, None.
Meeting: 2013 ACR/ARHP Annual Meeting
ACR Meeting Abstracts - https://acrabstracts.org/abstract/using-natural-language-processing-and-machine-learning-to-identify-gout-flares-from-electronic-clinical-notes/