ACR Meeting Abstracts

Abstract Number: 2735

Identifying ANCA-Associated Vasculitis Cases in Electronic Health Records Using Natural Language Processing

Zachary Wallace [1], John H. Stone [2] and Hyon K. Choi [3]; [1] Division of Rheumatology, Allergy and Immunology, Massachusetts General Hospital, Harvard Medical School, Boston, MA; [2] Rheumatology (Medicine), Massachusetts General Hospital, Harvard Medical School, Boston, MA; [3] Division of Rheumatology, Allergy, and Immunology, Massachusetts General Hospital, Boston, MA

Meeting: 2018 ACR/ARHP Annual Meeting

Keywords: ANCA, Electronic Health Record and vasculitis

Session Information

Date: Tuesday, October 23, 2018

Title: Vasculitis – ANCA-Associated Poster II

Session Type: ACR Poster Session C

Session Time: 9:00AM-11:00AM

Background/Purpose: Epidemiologic studies of ANCA-associated vasculitis (AAV) using large data sets are often limited by the lack of validated definitions of AAV cases that can be applied on a large scale. A prior study developed algorithms using billing codes, prescription records, and ANCA pattern (not antigen specificity) to classify patients into traditional clinical phenotypes (e.g., granulomatosis with polyangiitis, GPA), with positive predictive values (PPV) ranging from 81% to 100%. We sought to determine whether a user-friendly natural language processing (NLP) tool could improve the performance of AAV case-finding algorithms in an electronic health record (EHR) database.

Methods: Using EHR data on 2 million patients from a large, multi-center healthcare system that includes Massachusetts General Hospital (MGH) and Brigham and Women’s Hospital (BWH), we evaluated the performance of algorithms that incorporated billing codes, ANCA antigen specificity test results, and/or NLP to identify patients with AAV. Unstructured data (e.g., pathology reports, clinical notes) were searched using NLP for key words and phrases suggestive of AAV. The NLP program eliminates reports where the search phrase is near a term that may negate a diagnosis of AAV (e.g., “the patient does not have ANCA-associated vasculitis”). To assess the performance (Positive Predictive Value, PPV) of each algorithm, a cohort of patients with and without AAV was identified from a population of 35,623 patients. We then evaluated the performance of each algorithm in randomly assembled cohorts of patients evaluated in rheumatology and nephrology clinics.
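The negation-aware search described in the Methods can be illustrated with a minimal Python sketch. This is not the NLP tool used in the study: the phrase list, the negation cues, the character window, and the function name `has_unnegated_mention` are all assumptions chosen for illustration, since the abstract does not publish its search terms or proximity rule.

```python
import re

# Hypothetical term lists; the abstract does not publish its actual search phrases.
AAV_PHRASES = [
    "anca-associated vasculitis",
    "granulomatosis with polyangiitis",
    "microscopic polyangiitis",
]
NEGATION_CUES = ["does not have", "no evidence of", "ruled out", "negative for"]
WINDOW = 40  # characters inspected on each side of a match (assumed value)


def has_unnegated_mention(text: str) -> bool:
    """Return True if any AAV phrase occurs without a negation cue nearby."""
    lowered = text.lower()
    for phrase in AAV_PHRASES:
        for match in re.finditer(re.escape(phrase), lowered):
            start = max(0, match.start() - WINDOW)
            end = min(len(lowered), match.end() + WINDOW)
            context = lowered[start:end]
            # Keep the mention only if no negation cue falls inside the window.
            if not any(cue in context for cue in NEGATION_CUES):
                return True
    return False


if __name__ == "__main__":
    print(has_unnegated_mention("Biopsy consistent with ANCA-associated vasculitis."))
    print(has_unnegated_mention("The patient does not have ANCA-associated vasculitis."))
```

A fixed character window is the simplest possible proximity rule; a production tool would more likely work at the sentence or token level.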

Results: The general AAV cohort used for primary validation was established from the entire population and included 207 patients, the majority of whom had AAV (N=161, 78%). This cohort included 25 patients (12.1%) with positive ANCA test results but without AAV. An algorithm solely using billing codes had a PPV of 79% (73%-84%), 18% (5%-40%), and 4% (0%-14%) for identifying cases of AAV in the entire EHR, a rheumatology clinic cohort, and nephrology clinic cohort, respectively (Table 1). An algorithm that required an NLP reference to AAV, a billing code associated with AAV, and a positive PR3- or MPO-ANCA test result led to a PPV of 95% (88%-98%), 100%, and 100%, respectively.
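For readers who want to reproduce this kind of summary, the sketch below computes a PPV and a 95% confidence interval from chart-review counts. The abstract does not state which interval method was used; the Wilson score interval here is an assumption, and the counts in the usage lines are hypothetical rather than taken from the study.

```python
from math import sqrt
from statistics import NormalDist


def ppv(true_positives: int, false_positives: int) -> float:
    """Positive predictive value: confirmed cases / all algorithm-flagged cases."""
    flagged = true_positives + false_positives
    if flagged == 0:
        raise ValueError("no flagged patients; PPV is undefined")
    return true_positives / flagged


def wilson_interval(successes: int, n: int, level: float = 0.95) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (assumed method)."""
    if n == 0:
        raise ValueError("n must be positive")
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


if __name__ == "__main__":
    # Hypothetical counts: 90 confirmed AAV cases among 100 flagged patients.
    tp, fp = 90, 10
    low, high = wilson_interval(tp, tp + fp)
    print(f"PPV {ppv(tp, fp):.0%} ({low:.0%}-{high:.0%})")
```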

Conclusion: In our study, the use of NLP substantially improved the PPV of algorithms meant to identify cases of AAV. In the context of increasingly large data sources that include both structured (e.g., billing codes, test results) and unstructured data (e.g., clinical notes), NLP can improve the ability to accurately (PPV > 90%) classify patients with AAV. Furthermore, as ANCA type is increasingly viewed as superior to clinical phenotypes (e.g., GPA) for differentiating AAV subtypes, an algorithm such as ours that incorporates ANCA type can be useful for future epidemiologic studies in AAV using EHRs.

Table 1: Algorithm Performance in 207 Patients Selected based on ICD-9 Codes for ANCA-Associated Vasculitis

Algorithm                               Total Possible AAV Cases Identified by Algorithm in EHR   Positive Predictive Value (95% CI)
1. ICD-9 code                           20,557                                                    79% (73%-84%)
2. ICD-9 and ANCA-positive              1,951                                                     88% (82%-92%)
3. NLP and ANCA-positive                898                                                       92% (87%-98%)
4. NLP or ICD-9 and ANCA-positive       2,065                                                     87% (80%-91%)
5. NLP and ICD-9 and ANCA-positive      775                                                       95% (88%-98%)
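
The five algorithms in Table 1 are combinations of three per-patient criteria, which can be sketched as boolean rules. The `PatientFlags` type and its field names are hypothetical, and reading algorithm 4 as "(NLP or ICD-9) and ANCA-positive" is an assumed interpretation; the table does not state the grouping explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PatientFlags:
    """Hypothetical per-patient criteria derived from the EHR."""
    has_icd9_code: bool   # billing code associated with AAV
    anca_positive: bool   # positive PR3- or MPO-ANCA test result
    nlp_mention: bool     # un-negated NLP reference to AAV


# Rule numbers follow Table 1; the grouping in rule 4 is an assumption.
ALGORITHMS = {
    1: lambda p: p.has_icd9_code,
    2: lambda p: p.has_icd9_code and p.anca_positive,
    3: lambda p: p.nlp_mention and p.anca_positive,
    4: lambda p: (p.nlp_mention or p.has_icd9_code) and p.anca_positive,
    5: lambda p: p.nlp_mention and p.has_icd9_code and p.anca_positive,
}


def matching_algorithms(patient: PatientFlags) -> list[int]:
    """Return the Table 1 algorithm numbers that flag this patient."""
    return [number for number, rule in ALGORITHMS.items() if rule(patient)]


if __name__ == "__main__":
    patient = PatientFlags(has_icd9_code=True, anca_positive=True, nlp_mention=False)
    print(matching_algorithms(patient))
```

Stricter rules flag fewer patients, which matches the pattern in Table 1: the most restrictive algorithm identifies the fewest possible cases and has the highest PPV.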


Disclosure: Z. Wallace, None; J. H. Stone, Roche, 2, Roche, 5; H. K. Choi, Takeda, Selecta, Kowa, and Horizon, 5, Selecta and Horizon, 2.

To cite this abstract in AMA style:

Wallace Z, Stone JH, Choi HK. Identifying ANCA-Associated Vasculitis Cases in Electronic Health Records Using Natural Language Processing [abstract]. Arthritis Rheumatol. 2018; 70 (suppl 9). https://acrabstracts.org/abstract/identifying-anca-associated-vasculitis-cases-in-electronic-health-records-using-natural-language-processing/. Accessed .

© Copyright 2025 American College of Rheumatology