ACR Meeting Abstracts

Abstract Number: 2306

Improving Predictive Value of Gout Case Definitions in Electric Medical Records Utilizing Natural Language Processing: a Novel Informatics Approach

Sian Yik Lim1, Sara R. Schoenfeld2, Abhishek Chakrabortty3, Tianxi Cai3, Andrew Cagan4, Vivian Gainer5 and Hyon K. Choi6, 1Rheumatology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, 2Rheumatology Unit, Massachusetts General Hospital, Harvard Medical School, Boston, MA, 3Department of Biostatistics, Harvard Medical School, Boston, MA, 4Research Computing, Partners HealthCare, Charlestown, MA, 5Partners HealthCare, Boston, MA, 6Rheumatology, Allergy, and Immunology, Massachusetts General Hospital, Harvard Medical School, Boston, MA

Meeting: 2016 ACR/ARHP Annual Meeting

Date of first publication: September 28, 2016

Keywords: Epidemiologic methods and gout

Session Information

Date: Tuesday, November 15, 2016

Title: Metabolic and Crystal Arthropathies - Poster II: Epidemiology and Mechanisms of Disease

Session Type: ACR Poster Session C

Session Time: 9:00AM-11:00AM

Background/Purpose: To date, most models used to identify gout cases within large administrative databases have relied solely on administrative billing codes. The positive predictive value (PPV) of these models has ranged from 33% to 86%. Natural language processing (NLP) is a range of computational techniques for analyzing and representing naturally occurring written or oral text, with the goal of achieving human-like language processing for a range of tasks or applications. In this study, we aimed to develop and validate an algorithm that accurately identifies gout patients within the Partners Biobank database using both codified data and information extracted from clinical text notes with NLP.

Methods: To create a gold-standard training set, we assembled a set of 200 patients. Two rheumatologists reviewed the electronic medical records of these 200 patients and classified each as having the disease (Y), probably having the disease (P), not having the disease (N), or unable to be classified (U). We used the clinician-reviewed classifications to train models that predict the probability of a gout diagnosis, using a logistic regression classifier with the adaptive least absolute shrinkage and selection operator (LASSO) procedure to select informative variables. We constructed three separate models to predict a diagnosis of gout in our Partners Biobank cohort: (1) a model using the number of gout ICD-9 codes alone (ICD-9 model), (2) a model comprising all codified variables, including disease complications (codified model), and (3) a combined model including both codified and NLP variables (combined model).
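The variable-selection step described in the Methods can be illustrated with a minimal sketch of adaptive-LASSO logistic regression. This is not the authors' implementation: the abstract does not state how the initial estimates, penalty strength, or optimizer were chosen, so the lightly ridge-penalized initial fit, the values of `lam` and `gamma`, the proximal gradient solver, the feature names, and the simulated data below are all assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(v, t):
    """Proximal operator of the L1 penalty: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def fit_logistic(X, y, l1, ridge=0.0, lr=0.5, n_iter=500):
    """Proximal gradient descent for logistic regression.

    l1[j] is the L1 penalty on coefficient j; the intercept is unpenalized.
    """
    n, p = len(X), len(X[0])
    b0, beta = 0.0, [0.0] * p
    for _ in range(n_iter):
        g0, g = 0.0, [0.0] * p
        for xi, yi in zip(X, y):
            # Residual between predicted probability and observed label.
            r = sigmoid(b0 + sum(b * x for b, x in zip(beta, xi))) - yi
            g0 += r / n
            for j in range(p):
                g[j] += r * xi[j] / n
        b0 -= lr * g0
        beta = [
            soft_threshold(beta[j] - lr * (g[j] + ridge * beta[j]), lr * l1[j])
            for j in range(p)
        ]
    return b0, beta


def adaptive_lasso_logistic(X, y, lam=0.1, gamma=1.0):
    """Two-step adaptive LASSO (lam and gamma are illustrative choices)."""
    p = len(X[0])
    # Step 1 (assumption): lightly ridge-penalized fit for initial estimates.
    _, init = fit_logistic(X, y, [0.0] * p, ridge=0.01)
    # Step 2: weak initial coefficients receive a heavier L1 penalty.
    weights = [1.0 / (abs(b) ** gamma + 1e-8) for b in init]
    return fit_logistic(X, y, [lam * w for w in weights])


# Simulated toy data, NOT the study data. Feature names are hypothetical.
FEATURES = ["icd9_gout_code_count", "nlp_gout_mentions", "unrelated_noise"]
rng = random.Random(0)
X, y = [], []
for _ in range(200):
    row = [rng.gauss(0, 1) for _ in FEATURES]
    p_gout = sigmoid(2.0 * row[0] + 1.5 * row[1])  # third feature is pure noise
    X.append(row)
    y.append(1 if rng.random() < p_gout else 0)

b0, beta = adaptive_lasso_logistic(X, y)
for name, coef in zip(FEATURES, beta):
    print(f"{name}: {coef:.3f}")
```

Because the per-coefficient penalty is inversely proportional to the size of the initial estimate, a variable with little signal is penalized heavily and tends to be shrunk toward or to zero, while strongly informative variables are penalized only lightly.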

Results: The area under the curve (AUC) for the combined model was 0.901 (95% CI 0.830-0.972), with a sensitivity of 0.936 at a positive predictive value cut-off of 0.902. The AUC of the ICD-9 model was 0.721 (95% CI 0.617-0.825), while that of the codified model was 0.879 (95% CI 0.806-0.952). Adding NLP narrative terms to our final model improved sensitivity from 0.89 to 0.936 at the same PPV level of 0.902, improving identification of gout cases by 4.12% compared to the codified model. In a chart review of an additional random set of 50 patients predicted to have gout by the combined model, 44 were confirmed to have the diagnosis, a positive predictive value of 88%.
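The reported operating point (sensitivity at a fixed PPV cut-off) and the chart-review PPV can be made concrete with a short sketch. The scores and labels below are invented toy values, and the function names are hypothetical; only the 44-of-50 chart-review figure comes from the abstract. The sketch assumes the cut-off is chosen as the lowest score threshold whose PPV meets the target, which maximizes sensitivity among qualifying thresholds; the abstract does not state how the cut-off was selected.

```python
def ppv_and_sensitivity(scores, labels, threshold):
    """PPV and sensitivity when scores >= threshold are called positive."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, lab in pairs if s >= threshold and lab == 1)
    fp = sum(1 for s, lab in pairs if s >= threshold and lab == 0)
    fn = sum(1 for s, lab in pairs if s < threshold and lab == 1)
    ppv = tp / (tp + fp) if tp + fp else float("nan")
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    return ppv, sensitivity


def lowest_threshold_for_ppv(scores, labels, target_ppv):
    """Smallest observed score whose PPV meets the target, or None.

    Sensitivity cannot increase as the threshold rises, so the smallest
    qualifying threshold gives the highest sensitivity at that PPV level.
    """
    for t in sorted(set(scores)):
        ppv, _ = ppv_and_sensitivity(scores, labels, t)
        if ppv >= target_ppv:
            return t
    return None


# Toy predicted probabilities and chart-review labels (1 = gout), illustrative only.
scores = [0.95, 0.90, 0.85, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20, 0.10]
labels = [1, 1, 1, 1, 0, 1, 0, 1, 0, 0]

threshold = lowest_threshold_for_ppv(scores, labels, target_ppv=0.80)
ppv, sensitivity = ppv_and_sensitivity(scores, labels, threshold)
print(f"threshold={threshold}, PPV={ppv:.3f}, sensitivity={sensitivity:.3f}")

# Chart-review validation from the abstract: 44 of 50 predicted cases confirmed.
review_ppv = 44 / 50
print(f"chart-review PPV = {review_ppv:.2f}")
```

Comparing two models at the same PPV level, as the abstract does for the codified and combined models, amounts to running this threshold search separately on each model's scores and comparing the resulting sensitivities.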

Conclusion: Including narrative concepts extracted by natural language processing improves the accuracy of EMR case definitions for gout while identifying more subjects than models using codified data alone.


Disclosure: S. Y. Lim, None; S. R. Schoenfeld, None; A. Chakrabortty, None; T. Cai, None; A. Cagan, None; V. Gainer, None; H. K. Choi, None.

To cite this abstract in AMA style:

Lim SY, Schoenfeld SR, Chakrabortty A, Cai T, Cagan A, Gainer V, Choi HK. Improving Predictive Value of Gout Case Definitions in Electric Medical Records Utilizing Natural Language Processing: a Novel Informatics Approach [abstract]. Arthritis Rheumatol. 2016; 68 (suppl 10). https://acrabstracts.org/abstract/improving-predictive-value-of-gout-case-definitions-in-electric-medical-records-utilizing-natural-language-processing-a-novel-informatics-approach/. Accessed .

© Copyright 2025 American College of Rheumatology