ACR Meeting Abstracts

Abstract Number: 1894

Natural Language Processing Tool for Extraction of Patient-Reported Outcomes from a National Multi-Electronic Health Records Registry

Marie Humbert-Droz1, Zara Izadi2, Gabriela Schmajuk2, Milena Gianfrancesco2, Jinoos Yazdany2 and Suzanne Tamang3, 1Stanford University, Stanford, 2University of California San Francisco, San Francisco, CA, 3Stanford Center for Population Health Sciences, Redwood City, CA

Meeting: ACR Convergence 2021

Keywords: informatics, Patient reported outcomes, quality of care, quality of life

Session Information

Date: Tuesday, November 9, 2021

Title: Abstracts: Measures & Measurement of Healthcare Quality (1893–1896)

Session Type: Abstract Session

Session Time: 10:45AM-11:00AM

Background/Purpose: Patient reported outcomes (PROs) are increasingly used to track disease activity and facilitate shared decision making in patients with RA. Assessments of disease activity (DA) and functional status (FS) PROs during routine clinical care are recommended in national RA guidelines. However, many rheumatologists do not have support from health IT to reconfigure their EHR systems to collect PROs as structured data. We developed and evaluated a natural language processing (NLP) pipeline for extracting DA and FS scores from clinical notes within the ACR’s Rheumatology Informatics System for Effectiveness (RISE) registry.

Methods: We examined de-identified notes and structured electronic health record (EHR) data from all patients in the RISE registry with a confirmed diagnosis of RA (2 ICD codes at least 30 days apart) from January 1, 2015, to December 30, 2018. The NLP tool was developed in a stepwise approach to extract scores corresponding to the Clinical Disease Activity Index (CDAI), Routine Assessment of Patient Index Data 3 (RAPID3), Multidimensional Health Assessment Questionnaire (MDHAQ), and HAQ (Figure 1). First, in a text pre-processing step, we harmonized the notes’ format. Next, the concepts of interest (PRO instruments and scores) were annotated. Finally, a post-processing step handled formatting and score resolution. The performance of the NLP pipeline was evaluated against a gold standard of human chart review of 100 PRO mentions within 48 randomly selected notes. Where structured scores were available, we calculated inter-rater agreement between the NLP-extracted scores and the structured scores. Agreement was calculated according to (1) “exact” matching, based on the numerical scores, and (2) for DA scores, “fuzzy” matching, based on score categories (remission, low, etc.).
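The three pipeline steps and the two matching modes can be illustrated with a minimal Python sketch. This is not the authors' implementation: the regular expression, the function names (extract_pro_scores, cdai_category, scores_agree) and the sample note are hypothetical, and the CDAI cutoffs (remission ≤2.8, low ≤10, moderate ≤22, high >22) are the commonly cited thresholds, which the abstract itself does not state.

```python
import re

# Hypothetical pattern (an assumption, not the RISE pipeline): an instrument
# name followed within 20 non-digit characters by a numeric score.
PRO_PATTERN = re.compile(
    r"\b(CDAI|RAPID\s?3|MDHAQ|HAQ)\b\D{0,20}?(\d+(?:\.\d+)?)",
    re.IGNORECASE,
)

# Commonly cited CDAI category upper bounds (assumed; not given in the abstract).
CDAI_CUTOFFS = [(2.8, "remission"), (10.0, "low"), (22.0, "moderate")]


def extract_pro_scores(note):
    """Return (instrument, score) pairs found in a free-text note."""
    # Pre-processing: harmonize whitespace so line breaks do not split mentions.
    text = re.sub(r"\s+", " ", note)
    results = []
    # Annotation: locate instrument names and the scores that follow them.
    for match in PRO_PATTERN.finditer(text):
        # Post-processing: normalize the instrument name and parse the score.
        instrument = re.sub(r"\s+", "", match.group(1)).upper()
        results.append((instrument, float(match.group(2))))
    return results


def cdai_category(score):
    """Map a CDAI score to a disease activity category."""
    for upper_bound, label in CDAI_CUTOFFS:
        if score <= upper_bound:
            return label
    return "high"


def scores_agree(nlp_score, structured_score, fuzzy=False):
    """Exact matching compares numbers; fuzzy matching compares categories."""
    if fuzzy:
        return cdai_category(nlp_score) == cdai_category(structured_score)
    return nlp_score == structured_score


if __name__ == "__main__":
    note = "Pt seen today.\nCDAI: 8.5, RAPID3 score = 4.3; MDHAQ 1.2"
    print(extract_pro_scores(note))
    print(scores_agree(9.0, 8.5), scores_agree(9.0, 8.5, fuzzy=True))
```

In this sketch a CDAI of 9.0 in the note and 8.5 in the structured field disagree under exact matching but agree under fuzzy matching, because both fall in the low category. A production pipeline would also need to handle negation, score ranges, dates and instrument-specific valid ranges, none of which is attempted here.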

Results: Over 34 million notes from 854,628 patients, 158 practices, and 24 EHR systems were processed through the NLP pipeline. The majority of practices (n=134) had structured data available for comparison. Overall, our system achieved good fidelity for PRO instrument and score extraction, with a sensitivity of 93.2%, specificity of 80.5%, and positive predictive value of 87.3%. DA measures (CDAI and RAPID3) showed substantial agreement between notes and structured data; FS measures (MDHAQ and HAQ) showed almost perfect agreement (Table 1).
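For readers less familiar with these metrics, the sketch below shows how sensitivity, specificity and positive predictive value follow from confusion-matrix counts, and how Cohen's kappa is computed from paired labels. It assumes the reported agreement statistic is Cohen's kappa; the abstract says only "inter-rater agreement", although the terms "substantial" and "almost perfect" correspond to the Landis and Koch bands for kappa (0.61 to 0.80 and 0.81 to 1.00). The function names and all counts and labels are illustrative and are not data from the study.

```python
from collections import Counter


def diagnostic_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity and PPV from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # true mentions that were extracted
        "specificity": tn / (tn + fp),  # non-mentions correctly left alone
        "ppv": tp / (tp + fp),          # extractions that were correct
    }


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    n = len(labels_a)
    # Observed agreement: fraction of pairs with identical labels.
    p_observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected chance agreement from each rater's marginal label frequencies.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    p_expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / n**2
    return (p_observed - p_expected) / (1 - p_expected)


if __name__ == "__main__":
    # Illustrative counts only; these are not the study's data.
    print(diagnostic_metrics(tp=8, fp=2, tn=6, fn=4))
    nlp = ["low", "low", "high", "high"]
    structured = ["low", "low", "high", "low"]
    print(cohen_kappa(nlp, structured))
```

Kappa is undefined when expected agreement equals 1 (both sources use a single identical category throughout), so a real analysis would guard against that case.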

Conclusion: The NLP pipeline demonstrated good performance and was able to extract PROs from the clinical notes of practices that lacked structured data. It can potentially facilitate reporting of quality and performance measures for outpatient rheumatology practices. Further studies are needed to evaluate the generalizability of the NLP pipeline to other types of PRO instruments, and to determine whether NLP performance varies by EHR, practice or note type.

Figure 1. NLP pipeline

Table 1. Inter-rater agreement scores between the NLP extractions and the structured data obtained from RISE


Disclosures: M. Humbert-Droz, None; Z. Izadi, None; G. Schmajuk, None; M. Gianfrancesco, None; J. Yazdany, Astra Zeneca, 2, 5, Pfizer, 2, 6, Gilead, 5, BMS Foundation, 5; S. Tamang, None.

To cite this abstract in AMA style:

Humbert-Droz M, Izadi Z, Schmajuk G, Gianfrancesco M, Yazdany J, Tamang S. Natural Language Processing Tool for Extraction of Patient-Reported Outcomes from a National Multi-Electronic Health Records Registry [abstract]. Arthritis Rheumatol. 2021; 73 (suppl 9). https://acrabstracts.org/abstract/natural-language-processing-tool-for-extraction-of-patient-reported-outcomes-from-a-national-multi-electronic-health-records-registry/. Accessed .
