ACR Meeting Abstracts

Abstract Number: 1894

Natural Language Processing Tool for Extraction of Patient-Reported Outcomes from a National Multi-Electronic Health Records Registry

Marie Humbert-Droz1, Zara Izadi2, Gabriela Schmajuk2, Milena Gianfrancesco2, Jinoos Yazdany2 and Suzanne Tamang3, 1Stanford University, Stanford, 2University of California San Francisco, San Francisco, CA, 3Stanford Center for Population Health Sciences, Redwood City, CA

Meeting: ACR Convergence 2021

Keywords: informatics, Patient reported outcomes, quality of care, quality of life

Session Information

Date: Tuesday, November 9, 2021

Session Title: Abstracts: Measures & Measurement of Healthcare Quality (1893–1896)

Session Type: Abstract Session

Session Time: 10:45AM-11:00AM

Background/Purpose: Patient-reported outcomes (PROs) are increasingly used to track disease activity and facilitate shared decision making in patients with RA. Assessments of disease activity (DA) and functional status (FS) PROs during routine clinical care are recommended in national RA guidelines. However, many rheumatologists do not have support from health IT to reconfigure their EHR systems to collect PROs as structured data. We developed and evaluated a natural language processing (NLP) pipeline for extracting DA and FS scores from clinical notes within the ACR’s Rheumatology Informatics System for Effectiveness (RISE) registry.

Methods: We examined de-identified notes and structured electronic health record (EHR) data from all patients in the RISE registry with a confirmed diagnosis of RA (2 ICD codes at least 30 days apart) from January 1, 2015, to December 30, 2018. The NLP tool was developed in a stepwise approach to extract scores corresponding to the Clinical Disease Activity Index (CDAI), Routine Assessment of Patient Index Data 3 (RAPID3), Multidimensional Health Assessment Questionnaire (MDHAQ), and HAQ (Figure 1). First, in a text pre-processing step, we harmonized the notes’ format. Next, the concepts of interest (PRO instruments and scores) were annotated. A post-processing step involved formatting and score resolution. The performance of the NLP pipeline was evaluated against a gold standard of human chart review of 100 PRO mentions within 48 randomly selected notes. We also calculated inter-rater agreement between the NLP-extracted scores and the structured scores, where available. Agreement was calculated according to (1) “exact” matching, based on the numerical scores, and (2) for DA scores, “fuzzy” matching, based on score categories (remission, low, etc.).
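The stepwise approach and the two matching rules described in the Methods can be illustrated with a minimal Python sketch. This is not the authors' implementation: the regular expression, the function names (preprocess, extract_pros, cdai_category, exact_match, fuzzy_match), and the sample note are hypothetical, and the CDAI cut points are the commonly published ones (remission up to 2.8, low up to 10, moderate up to 22, high above 22), which the abstract does not confirm were the ones used.

```python
import re

# Hypothetical pattern: an instrument name followed, within a few
# non-digit characters, by a numeric score. Real notes need far more care.
PRO_PATTERN = re.compile(
    r"\b(CDAI|RAPID3|MDHAQ|HAQ)\b[^0-9\n]{0,20}?(\d+(?:\.\d+)?)",
    re.IGNORECASE,
)


def preprocess(note: str) -> str:
    """Pre-processing: collapse whitespace so notes share one format."""
    return re.sub(r"\s+", " ", note).strip()


def extract_pros(note: str) -> list[tuple[str, float]]:
    """Annotation plus post-processing: return (instrument, score) pairs."""
    text = preprocess(note)
    return [
        (m.group(1).upper(), float(m.group(2)))
        for m in PRO_PATTERN.finditer(text)
    ]


def cdai_category(score: float) -> str:
    """Map a CDAI score to a category (assumed, commonly published cut points)."""
    if score <= 2.8:
        return "remission"
    if score <= 10:
        return "low"
    if score <= 22:
        return "moderate"
    return "high"


def exact_match(nlp_score: float, structured_score: float) -> bool:
    """'Exact' matching: the two numerical scores are equal."""
    return abs(nlp_score - structured_score) < 1e-9


def fuzzy_match(nlp_score: float, structured_score: float) -> bool:
    """'Fuzzy' matching: the two scores fall in the same category."""
    return cdai_category(nlp_score) == cdai_category(structured_score)


if __name__ == "__main__":
    sample = "CDAI  score:\t12.5.  RAPID3 = 4.3; HAQ 0.75"  # invented note text
    print(extract_pros(sample))
    print(exact_match(9.0, 10.0), fuzzy_match(9.0, 10.0))
```

Under these assumptions, CDAI scores of 9.0 and 10.0 fail the exact rule but pass the fuzzy rule, because both fall in the low category.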

Results: Over 34 million notes from 854,628 patients across 158 practices and 24 EHR systems were processed through the NLP pipeline. The majority of practices (n=134) had structured data available for comparison. Overall, our system achieved good fidelity for PRO instrument and score extraction, with a sensitivity of 93.2%, specificity of 80.5%, and positive predictive value of 87.3%. DA measures (CDAI and RAPID3) showed substantial agreement between notes and structured data; FS measures (MDHAQ and HAQ) showed almost perfect agreement (Table 1).
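For readers less familiar with these measures, the following Python sketch shows how sensitivity, specificity, and positive predictive value follow from confusion-matrix counts, and how a chance-corrected agreement statistic such as Cohen's kappa can be computed. The abstract does not report the underlying counts or name the agreement statistic, so the use of kappa, the function names, and all numbers below are assumptions for illustration only. The phrases "substantial" and "almost perfect" do match the bands conventionally applied to kappa (0.61 to 0.80 and 0.81 to 1.00).

```python
from collections import Counter


def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Sensitivity, specificity, and PPV from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # true mentions that were extracted
        "specificity": tn / (tn + fp),  # non-mentions correctly ignored
        "ppv": tp / (tp + fp),          # extractions that were correct
    }


def cohens_kappa(rater_a: list[str], rater_b: list[str]) -> float:
    """Cohen's kappa for two equal-length lists of category labels."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Agreement expected by chance from each rater's marginal frequencies.
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / n**2
    if expected == 1.0:  # both raters used one identical label throughout
        return 1.0
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Invented counts, NOT the study's data.
    print(diagnostic_metrics(tp=8, fp=2, tn=6, fn=4))
    nlp = ["low", "low", "high", "high"]
    structured = ["low", "high", "high", "high"]
    print(cohens_kappa(nlp, structured))
```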

Conclusion: The NLP pipeline demonstrated good performance, was able to extract PROs from the clinical notes of practices without structured data, and could facilitate reporting of quality and performance measures for outpatient rheumatology practices. Further studies are needed to evaluate the generalizability of the NLP pipeline to other types of PRO instruments and to determine whether NLP performance varies by EHR, practice, or note type.

Figure 1: NLP pipeline

Table 1: Inter-rater agreement scores between the NLP extractions and the structured data obtained from RISE


Disclosures: M. Humbert-Droz, None; Z. Izadi, None; G. Schmajuk, None; M. Gianfrancesco, None; J. Yazdany, AstraZeneca, 2, 5, Pfizer, 2, 6, Gilead, 5, BMS Foundation, 5; S. Tamang, None.

To cite this abstract in AMA style:

Humbert-Droz M, Izadi Z, Schmajuk G, Gianfrancesco M, Yazdany J, Tamang S. Natural Language Processing Tool for Extraction of Patient-Reported Outcomes from a National Multi-Electronic Health Records Registry [abstract]. Arthritis Rheumatol. 2021; 73 (suppl 9). https://acrabstracts.org/abstract/natural-language-processing-tool-for-extraction-of-patient-reported-outcomes-from-a-national-multi-electronic-health-records-registry/. Accessed February 5, 2023.

© COPYRIGHT 2023 AMERICAN COLLEGE OF RHEUMATOLOGY
